Commit message

on Windows. This is yet another manifestation of the problem that Windows
returns time zone names that may be in a different encoding than we are using.
I've put a better solution in HEAD, but the back branches need a simple patch.
Per report from Hiroshi Saito.

a well-randomized batch number even when given a poorly-randomized hash value.
This is a bit inefficient but seems the only practical solution given the
constraint that we can't change the hash functions in released branches.
Per report from Joseph Shraibman.
Applied to 8.1 and 8.2 only --- HEAD is getting a cleaner fix, and 8.0 and
before use different coding that seems less vulnerable.
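
The idea is easy to show in isolation: if the low-order bits of a hash value carry little entropy, hash % nbatch crowds everything into a few batches, so the hash is pushed through an extra mixing step first. The sketch below is only an illustration of that approach (it borrows MurmurHash3's 32-bit finalizer as the mixer); it is not the code applied to the 8.1/8.2 branches.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: scramble an existing 32-bit hash value so that its
 * low-order bits become well mixed, then derive a batch number from it.
 * The mixer is MurmurHash3's 32-bit finalizer, used here purely as an
 * example of "re-randomizing" a poor hash.
 */
static uint32_t
rescramble_hash(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

int
main(void)
{
    uint32_t nbatch = 16;            /* number of hash-join batches (power of 2) */
    uint32_t poor_hash = 0x00010000; /* a hash with no entropy in its low bits */

    /* Naive batch assignment would always pick batch 0 for such values. */
    printf("naive batch:     %u\n", poor_hash % nbatch);
    printf("scrambled batch: %u\n", rescramble_hash(poor_hash) % nbatch);
    return 0;
}
```
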
fail when used in a deferred trigger. Bug goes back to 8.0; no doubt the
reason it hadn't been noticed is that we've been discouraging use of
user-defined constraint triggers. Per report from Frank van Vugt.

is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
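
For reference, the deletion rule relied on in that last paragraph looks like this in practice. This is only a sketch of the long-standing dynahash convention, assuming the API in utils/hsearch.h; the entry type and key layout here are made up:

```c
#include "postgres.h"
#include "utils/hsearch.h"

/*
 * Sketch of the only deletion pattern hash_seq_search() has ever promised
 * to support: removing the entry that was just returned.  Removing other
 * entries, or inserting (which may trigger a split), while the scan is
 * open is exactly what the fix above guards against.
 */
typedef struct MyEntry
{
    int         key;            /* hash key -- must be first */
    bool        obsolete;
} MyEntry;

static void
prune_obsolete(HTAB *hashtab)
{
    HASH_SEQ_STATUS status;
    MyEntry    *entry;

    hash_seq_init(&status, hashtab);
    while ((entry = (MyEntry *) hash_seq_search(&status)) != NULL)
    {
        if (entry->obsolete)
        {
            /* Deleting the just-returned entry is allowed ... */
            if (hash_search(hashtab, &entry->key, HASH_REMOVE, NULL) == NULL)
                elog(ERROR, "hash table corrupted");
            /* ... deleting or inserting anything else here is not. */
        }
    }
}
```
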
failed (due to lock conflicts or out-of-space). We might have already
extended the index's filesystem EOF before failing, causing the EOF to be
beyond what the metapage says is the last used page. Hence the invariant
maintained by the code needs to be "EOF is at or beyond last used page",
not "EOF is exactly the last used page". Problem was created by my patch
of 2006-11-19 that attempted to repair bug #2737. Since that was
back-patched to 7.4, this needs to be as well. Per report and test case
from Vlastimil Krejcir.

it was executed in. Someday it might be nice to allow cross-DB commits, but
work would be needed in NOTIFY and perhaps other places. Per Heikki.

exactly at the point where we need to insert a new item, the calculation used
the wrong size for the "high key" of the new left page. This could lead to
choosing an unworkable split, resulting in "PANIC: failed to add item to the
left sibling" (or "right sibling") failure. Although this bug has been there
a long time, it's very difficult to trigger a failure before 8.2, since there
was generally a lot of free space on both sides of a chosen split. In 8.2,
where the user-selected fill factor determines how much free space the code
tries to leave, an unworkable split is much more likely. Report by Joe
Conway, diagnosis and fix by Heikki Linnakangas.

preference for filling pages out-of-order tends to confuse the sanity checks
in md.c, as per report from Balazs Nagy in bug #2737. The fix is to ensure
that the smgr-level code always has the same idea of the logical EOF as the
hash index code does, by using ReadBuffer(P_NEW) where we are adding a single
page to the end of the index, and using smgrextend() to reserve a large batch
of pages when creating a new splitpoint. The patch is a bit ugly because it
avoids making any changes in md.c, which seems the most prudent approach for a
backpatchable beta-period fix. After 8.3 development opens, I'll take a look
at a cleaner but more invasive patch, in particular getting rid of the now
unnecessary hack to allow reading beyond EOF in mdread().
Backpatch as far as 7.4. The bug likely exists in 7.3 as well, but because
of the magnitude of the 7.3-to-7.4 changes in hash, the later-version patch
doesn't even begin to apply. Given the other known bugs in the 7.3-era hash
code, it does not seem worth trying to develop a separate patch for 7.3.
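
A rough sketch of the single-page half of that rule, with locking and page initialization omitted; this is not the actual patch, and the batch case via smgrextend() is only mentioned in a comment because its call details vary between branches:

```c
#include "postgres.h"
#include "storage/bufmgr.h"
#include "utils/rel.h"

/*
 * Hedged sketch of the single-page case: asking the buffer manager for
 * P_NEW both allocates the next block and advances the storage manager's
 * idea of the relation's EOF, so the two layers cannot disagree about how
 * long the index is.  Reserving a whole new splitpoint in one go happens
 * at the smgr level (smgrextend); that call is not shown here.  Locking
 * and page setup are the caller's job.
 */
static Buffer
hash_add_page_at_end(Relation rel)
{
    return ReadBuffer(rel, P_NEW);      /* extends the relation by one block */
}
```
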
cases where we already hold the desired lock "indirectly", either via
membership in a MultiXact or because the lock was originally taken by a
different subtransaction of the current transaction. These cases must be
accounted for to avoid needless deadlocks and/or inappropriate replacement of
an exclusive lock with a shared lock. Per report from Clarence Gardner and
subsequent investigation.

_bt_pagedel to recover from the failure: just search the whole parent level
if searching to the right fails. This does nothing for the underlying problem
that index keys became out-of-order in the grandparent level. However, we
believe that there is no other consequence worse than slightly inefficient
searching, so this narrow patch seems like the safest solution for the back
branches.

recovery. In the first place, it doesn't work because slru's
latest_page_number isn't set up yet (this is why we've been hearing reports
of strange "apparent wraparound" log messages during crash recovery, but
only from people who'd managed to advance their next-mxact counters some
considerable distance from 0). In the second place, it seems a bit unwise
to be throwing away data during crash recovery anyway. This latter
consideration convinces me to just disable truncation during recovery,
rather than computing latest_page_number and pushing ahead.
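
The shape of the fix is tiny; a hedged sketch with a placeholder function name is just an early-exit guard:

```c
#include "postgres.h"
#include "access/xlog.h"        /* InRecovery */

/*
 * Hedged sketch with a placeholder name: the multixact truncation routine
 * simply does nothing while crash recovery is in progress, since
 * latest_page_number is not set up yet and discarding data mid-recovery is
 * unappealing anyway.
 */
static void
truncate_multixact_sketch(void)
{
    if (InRecovery)
        return;                 /* skip truncation entirely during recovery */

    /* ... normal path: compute the cutoff and call SimpleLruTruncate() ... */
}
```
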
backup history file. Bug introduced by the 8.1 change to make pg_stop_backup
delete older history files. Per report from Masao Fujii.

failures even when the hardware and OS did nothing wrong. Per recent analysis
of a problem report from Alex Bahdushka.
For the moment I've just diked out the test of the parameter, rather than
removing the GUC infrastructure and documentation, in case we conclude that
there's something salvageable there. There seems no chance of it being
resurrected in the 8.1 branch though.

passed extend = true whenever we are reading a page we intend to reinitialize
completely, even if we think the page "should exist". This is because it
might indeed not exist, if the relation got truncated sometime after the
current xlog record was made and before the crash we're trying to recover
from. These two thinkos appear to explain both of the old bug reports
discussed here:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

with not responding to query cancel during the last stage of btree index
creation.

we are not holding a buffer content lock; where it was, InterruptHoldoffCount
is positive and so we'd not respond to cancel signals as intended. Also
add missing vacuum_delay_point() call in btvacuumcleanup. This should fix
complaint from Evgeny Gridasov about failure to respond to SIGINT/SIGTERM
in a timely fashion (bug #2257).
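
A hedged sketch of the placement rule the fix enforces (the real change is in the btree vacuum code): sleep for the cost-based vacuum delay only while no buffer content lock is held, since acquiring the lock raises InterruptHoldoffCount and a cancel would be ignored during the sleep.

```c
#include "postgres.h"
#include "commands/vacuum.h"    /* vacuum_delay_point() */
#include "storage/bufmgr.h"
#include "utils/rel.h"

/*
 * Hedged sketch: call vacuum_delay_point() at a point where no buffer
 * content lock is held, so pending SIGINT/SIGTERM can be serviced while
 * the backend naps.  The per-page processing is elided.
 */
static void
scan_one_page(Relation rel, BlockNumber blkno)
{
    Buffer      buf;

    vacuum_delay_point();       /* safe point: no locks held, cancels serviced */

    buf = ReadBuffer(rel, blkno);
    LockBuffer(buf, BUFFER_LOCK_SHARE);

    /* ... inspect or clean the page while holding the content lock ... */

    LockBuffer(buf, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buf);
}
```
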
to try to create a log segment file concurrently, but the code erroneously
specified O_EXCL to open(), resulting in a needless failure. Before 7.4,
it was even a PANIC condition :-(. Correct code is actually simpler than
what we had, because we can just say O_CREAT to start with and not need a
second open() call. I believe this accounts for several recent reports of
hard-to-reproduce "could not create file ...: File exists" errors in both
pg_clog and pg_subtrans.
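
The flag change is easy to demonstrate standalone; the snippet below (with a made-up path) shows the behavior difference and is not the slru.c code itself:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * With O_CREAT | O_EXCL, the second of two processes racing to create the
 * same segment file fails with EEXIST even though nothing is wrong.  With
 * plain O_CREAT, whichever process arrives second just opens the file the
 * first one created, and no second open() call is needed.
 */
int
main(void)
{
    const char *path = "/tmp/segment_demo";
    int         fd;

    /* Simulate another process having created the segment already. */
    close(open(path, O_RDWR | O_CREAT, 0600));

    /* Old coding: O_EXCL makes this concurrent creation attempt fail. */
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        printf("with O_EXCL: %s\n", strerror(errno));   /* "File exists" */
    else
        close(fd);

    /* Fixed coding: plain O_CREAT opens the existing file, or creates it. */
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd >= 0)
    {
        printf("without O_EXCL: opened fine\n");
        close(fd);
    }

    unlink(path);
    return 0;
}
```
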
discarded by cache flush while still in use. This is a minimal patch that
just copies the tupdesc anywhere it could be needed across a flush. Applied
to back branches only; Neil Conway is working on a better long-term solution
for HEAD.

use it. While it normally has been opened earlier during btree index
build, testing shows that it's possible for the link to be closed again
if an sinval reset occurs while the index is being built.

The consequences of overwriting a non-empty page are bad enough that
we should not omit this test in production builds.
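
A hedged sketch of that kind of change, with a placeholder function and message (the real check lives in the server code touched by this commit): a safety test that used to be a debug-only Assert() becomes an unconditional runtime check.

```c
#include "postgres.h"
#include "storage/bufpage.h"

/*
 * Placeholder sketch: refuse to initialize over a page that already
 * contains data, in production builds as well as assert-enabled ones.
 */
static void
init_page_checked(Page page, Size specialSize)
{
    /* Previously: Assert(PageIsNew(page)); -- compiled out in production */
    if (!PageIsNew(page))
        elog(PANIC, "about to overwrite a non-empty page");

    PageInit(page, BLCKSZ, specialSize);
}
```
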
Back-patch of previous fix in HEAD for plperl-vs-locale issue.

equal: if strcoll claims two strings are equal, check it with strcmp, and
sort according to strcmp if not identical. This fixes inconsistent
behavior under glibc's hu_HU locale, and probably under some other locales
as well. Also, take advantage of the now-well-defined behavior to speed up
texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise
comparison and not bother with strcoll at all.
NOTE: affected databases may need to REINDEX indexes on text columns to be
sure they are self-consistent.
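
The comparison rule is simple to state in standalone C. This is only an illustration of the convention on NUL-terminated strings; the backend's varstr_cmp() works on counted text and copies into temporary buffers first:

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * strcoll() decides the ordering, but when it reports equality we fall
 * back to strcmp(), so only bytewise-identical strings ever compare equal.
 */
static int
locale_str_cmp(const char *a, const char *b)
{
    int         result = strcoll(a, b);

    if (result == 0)
        result = strcmp(a, b);  /* tiebreak: trust bytewise comparison */
    return result;
}

int
main(void)
{
    setlocale(LC_COLLATE, "");

    /*
     * Whether these collate equal depends on the locale; the tiebreak makes
     * the overall result deterministic either way.
     */
    printf("%d\n", locale_str_cmp("strasse", "straße") == 0);
    return 0;
}
```
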
comment lines were output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.

tuple in-place, but instead passes back an all-new tuple structure if
any changes are needed. This is a much cleaner and more robust solution
for the bug discovered by Alexey Beschiokov; accordingly, revert the
quick hack I installed yesterday.
With this change, HeapTupleData.t_datamcxt is no longer needed; will
remove it in a separate commit in HEAD only.

those names. (Debug and None were pretty bad names anyway.) I hope I caught
all uses of the names in comments too.

very narrow window in which SimpleLruReadPage or SimpleLruWritePage could
think that I/O was needed when it wasn't (and indeed the buffer had already
been assigned to another page). This would result in an Assert failure if
Asserts were enabled, and probably in silent data corruption if not.
Reported independently by Jim Nasby and Robert Creager.
I intend a more extensive fix when 8.2 development starts, but this is a
reasonably low-impact patch for the existing branches.

reserving SLRU space for a new MultiXact. The original coding would have
treated out-of-disk-space as a PANIC condition, which is unnecessary.

multixact's starting offset before the offset has been stored into the
SLRU file. A simple fix would be to hold the MultiXactGenLock until the
offset has been stored, but that looks like a big concurrency hit. Instead
rely on knowledge that unset offsets will be zero, and loop when we see
a zero. This requires a little extra hacking to ensure that zero is never
a valid value for the offset. Problem reported by Matteo Beccati, fix
ideas from Martijn van Oosterhout, Alvaro Herrera, and Tom Lane.
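
A hedged sketch of the reader-side convention, with hypothetical names (the static variable stands in for the shared SLRU slot); it relies on the commit's guarantee that zero is never a valid offset:

```c
#include <sched.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t MultiXactOffset;

/* Stand-in for the shared SLRU slot; in the server this lives in shared memory. */
static volatile MultiXactOffset offset_slot = 0;

/*
 * Offset 0 is reserved to mean "not stored yet", so a reader that races
 * ahead of the creator simply loops until a nonzero value appears, instead
 * of forcing the creator to hold MultiXactGenLock across the whole
 * create-and-store sequence.
 */
static MultiXactOffset
wait_for_offset(void)
{
    MultiXactOffset offset;

    while ((offset = offset_slot) == 0)
        sched_yield();          /* creator hasn't written the offset yet */

    return offset;
}

int
main(void)
{
    offset_slot = 42;           /* pretend the creator just stored it */
    printf("offset = %u\n", wait_for_offset());
    return 0;
}
```
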
generated from subquery outputs: use the type info stored in the Var
itself. To avoid making ExecEvalVar and slot_getattr more complex
and slower, I split out the whole-row case into a separate ExecEval routine.

type ID information even when it's a record type. This is needed to
handle whole-row Vars referencing subquery outputs. Per example from
Richard Huxton.

by a recent HP C compiler. Mostly, get rid of useless local variables
that are assigned to but never used.

etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".
Steve Woodcock

This makes the error messages for PREPARE TRANSACTION, COMMIT PREPARED
etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".
Steve Woodcock

etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".
Steve Woodcock

the wrong buffer dirty when trying to kill a dead index entry that's on
a page after the one it started on. No risk of data corruption, just
inefficiency, but still a bug.

generated by bitmap index scans. Along the way, simplify and speed up
the code for counting sequential and index scans; it was both confusing
and inefficient to be taking care of that in the per-tuple loops, IMHO.
initdb forced because of internal changes in pg_stat view definitions.

was created on a machine with alignment rules and floating-point format
similar to the current machine. Per recent discussion, this seems like
a good idea with the increasing prevalence of 32/64 bit environments.
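
A hedged standalone sketch of the idea, not pg_control's actual layout: store an alignment marker and a known floating-point constant at cluster creation, and verify them at startup.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * A binary with different alignment rules or a different double format
 * fails this check up front instead of misreading data files later.
 */
typedef struct AlignProbe { char pad; double d; } AlignProbe;

typedef struct CompatMarkers
{
    uint32_t    max_align;      /* offsetof(AlignProbe, d) at creation time */
    double      float_marker;   /* fixed constant written at creation time */
} CompatMarkers;

#define FLOAT_MARKER_VALUE 1234567.0

static int
markers_match(const CompatMarkers *stored)
{
    return stored->max_align == (uint32_t) offsetof(AlignProbe, d) &&
           stored->float_marker == FLOAT_MARKER_VALUE;
}

int
main(void)
{
    CompatMarkers m = { (uint32_t) offsetof(AlignProbe, d), FLOAT_MARKER_VALUE };

    printf("compatible: %s\n", markers_match(&m) ? "yes" : "no");
    return 0;
}
```
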
on a page, as suggested by ITAGAKI Takahiro. Also, change a few places
that were using some other estimates of max-items-per-page to consistently
use MaxOffsetNumber. This is conservatively large --- we could have used
the new MaxHeapTuplesPerPage macro, or a similar one for index tuples ---
but those places are simply declaring a fixed-size buffer and assuming it
will work, rather than actively testing for overrun. It seems safer to
size these buffers in a way that can't overflow even if the page is
corrupt.
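
A hedged sketch of the sizing rule, with a made-up surrounding routine; only OffsetNumber and MaxOffsetNumber are the real definitions from storage/off.h:

```c
#include "postgres.h"
#include "storage/off.h"        /* OffsetNumber, MaxOffsetNumber */

/*
 * The array must hold one entry per line pointer that could appear on a
 * page.  Sizing it by MaxOffsetNumber is deliberately generous: even a
 * corrupt page with an absurd number of line pointers cannot overrun the
 * buffer, which matters because nothing here actively checks for overflow.
 */
static void
collect_deletable(void)
{
    OffsetNumber deletable[MaxOffsetNumber];    /* conservative: can't overflow */
    int          ndeletable = 0;

    /* ... fill deletable[] while walking the page's line pointers ... */
    (void) deletable;
    (void) ndeletable;
}
```
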
saves nearly 700kB in the default shared memory segment size, which seems
worthwhile, and it is a feature that many users won't use anyway. Per
Heikki's argument, there is no point in a compromise value --- those who
are using 2PC at all will probably want it at least equal to max_connections.
But we can't set it to zero by default without breaking the prepared_xacts
regression test.

after the fact. Fix bug with incorrect test for whether we are at end
of logfile segment. Arrange for writes triggered by XLogInsert's
is-cache-more-than-half-full test to synchronize with the cache boundaries,
so that in long transactions we tend to write alternating halves of the
cache rather than randomly chosen portions of it; this saves one more
write syscall per cache load.
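
A hedged arithmetic sketch of the rounding described in the last sentence, with made-up names; the real logic additionally works in terms of WAL pages and LSNs.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * When the WAL buffer cache is more than half full, truncate the write
 * request to the most recent half-cache boundary, so successive background
 * writes flush alternating halves of the cache instead of arbitrary,
 * unaligned chunks.
 */
static uint64_t
background_write_target(uint64_t insert_pos, uint64_t cache_size)
{
    uint64_t    half = cache_size / 2;

    return (insert_pos / half) * half;  /* round down to a half-cache boundary */
}

int
main(void)
{
    /* 1 MB cache: an insert position of ~1.3 MB yields a 1.0 MB write target. */
    printf("%llu\n", (unsigned long long) background_write_target(1363148, 1048576));
    return 0;
}
```
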