| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Additional non-security issues/improvements spotted by Coverity.
In backend/libpq, no sense trying to protect against port->hba being
NULL after we've already dereferenced it in the switch() statement.
Prevent against possible overflow due to 32bit arithmitic in
basebackup throttling (not yet released, so no security concern).
Remove nonsensical check of array pointer against NULL in procarray.c,
looks to be a holdover from 9.1 and earlier when there were pointers
being used but now it's just an array.
Remove pointer check-against-NULL in tsearch/spell.c as we had already
dereferenced it above (in the strcmp()).
Remove dead code from adt/orderedsetaggs.c, isnull is checked
immediately after each tuplesort_getdatum() call and if true we return,
so no point checking it again down at the bottom.
Remove recently added minor error-condition memory leak in pg_regress.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A number of issues were identified by the Coverity scanner and are
addressed in this patch. None of these appear to be security issues
and many are mostly cosmetic changes.
Short comments for each of the changes follows.
Correct the semi-colon placement in be-secure.c regarding SSL retries.
Remove a useless comparison-to-NULL in proc.c (value is dereferenced
prior to this check and therefore can't be NULL).
Add checking of chmod() return values to initdb.
Fix a couple minor memory leaks in initdb.
Fix memory leak in pg_ctl- involves free'ing the config file contents.
Use an int to capture fgetc() return instead of an enum in pg_dump.
Fix minor memory leaks in pg_dump.
(note minor change to convertOperatorReference()'s API)
Check fclose()/remove() return codes in psql.
Check fstat(), find_my_exec() return codes in psql.
Various ECPG memory leak fixes.
Check find_my_exec() return in ECPG.
Explicitly ignore pqFlush return in libpq error-path.
Change PQfnumber() to avoid doing an strdup() when no changes required.
Remove a few useless check-against-NULL's (value deref'd beforehand).
Check rmtree(), malloc() results in pg_regress.
Also check get_alternative_expectfile() return in pg_regress.
|
|
|
|
|
| |
Christian Kruse, reviewed by Andres Freund and myself, with further
minor adjustments by me.
|
|
|
|
| |
Vik Fearing
|
|
|
|
|
|
| |
Detected by clang's -Wmissing-variable-declarations.
From: Andres Freund <andres@anarazel.de>
|
|
|
|
| |
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replication slots are a crash-safe data structure which can be created
on either a master or a standby to prevent premature removal of
write-ahead log segments needed by a standby, as well as (with
hot_standby_feedback=on) pruning of tuples whose removal would cause
replication conflicts. Slots have some advantages over existing
techniques, as explained in the documentation.
In a few places, we refer to the type of replication slots introduced
by this patch as "physical" slots, because forthcoming patches for
logical decoding will also have slots, but with somewhat different
properties.
Andres Freund and Robert Haas
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Evidence from buildfarm member crake suggests that the new test_shm_mq
module is routinely crashing the server due to the arrival of a SIGUSR1
after the shared memory segment has been unmapped. Although processes
using the new dynamic background worker facilities are more likely to
receive a SIGUSR1 around this time, the problem is also possible on older
branches, so I'm back-patching the parts of this change that apply to
older branches as far as they apply.
It's already generally the case that code checks whether these pointers
are NULL before deferencing them, so the important thing is mostly to
make sure that they do get set to NULL before they become invalid. But
in master, there's one case in procsignal_sigusr1_handler that lacks a
NULL guard, so add that.
Patch by me; review by Tom Lane.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to store lwlocks as part of some other data
structure in the main shared memory segment, or in a dynamic shared
memory segment. There is still a main LWLock array and this patch does
not move anything out of it, but it provides necessary infrastructure
for doing that in the future.
This change is likely to increase the size of LWLockPadded on some
platforms, especially 32-bit platforms where it was previously only
16 bytes.
Patch by me. Review by Andres Freund and KaiGai Kohei.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since C99, it's been standard for printf and friends to accept a "z" size
modifier, meaning "whatever size size_t has". Up to now we've generally
dealt with printing size_t values by explicitly casting them to unsigned
long and using the "l" modifier; but this is really the wrong thing on
platforms where pointers are wider than longs (such as Win64). So let's
start using "z" instead. To ensure we can do that on all platforms, teach
src/port/snprintf.c to understand "z", and add a configure test to force
use of that implementation when the platform's version doesn't handle "z".
Having done that, modify a bunch of places that were using the
unsigned-long hack to use "z" instead. This patch doesn't pretend to have
gotten everyplace that could benefit, but it catches many of them. I made
an effort in particular to ensure that all uses of the same error message
text were updated together, so as not to increase the number of
translatable strings.
It's possible that this change will result in format-string warnings from
pre-C99 compilers. We might have to reconsider if there are any popular
compilers that will warn about this; but let's start by seeing what the
buildfarm thinks.
Andres Freund, with a little additional work by me
|
|
|
|
|
|
|
|
|
| |
Previously, we did this just once per checkpoint, but that could make
Hot Standby take a long time to initialize. To avoid busying an
otherwise-idle system, we don't do this if no WAL has been written
since we did it last.
Andres Freund
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In ordinary operation, VACUUM must be careful to take a cleanup lock on
each leaf page of a btree index; this ensures that no indexscans could
still be "in flight" to heap tuples due to be deleted. (Because of
possible index-tuple motion due to concurrent page splits, it's not enough
to lock only the pages we're deleting index tuples from.) In Hot Standby,
the WAL replay process must likewise lock every leaf page. There were
several bugs in the code for that:
* The replay scan might come across unused, all-zero pages in the index.
While btree_xlog_vacuum itself did the right thing (ie, nothing) with
such pages, xlogutils.c supposed that such pages must be corrupt and
would throw an error. This accounts for various reports of replication
failures with "PANIC: WAL contains references to invalid pages". To
fix, add a ReadBufferMode value that instructs XLogReadBufferExtended
not to complain when we're doing this.
* btree_xlog_vacuum performed the extra locking if standbyState ==
STANDBY_SNAPSHOT_READY, but that's not the correct test: we won't open up
for hot standby queries until the database has reached consistency, and
we don't want to do the extra locking till then either, for fear of reading
corrupted pages (which bufmgr.c would complain about). Fix by exporting a
new function from xlog.c that will report whether we're actually in hot
standby replay mode.
* To ensure full coverage of the index in the replay scan, btvacuumscan
would emit a dummy WAL record for the last page of the index, if no
vacuuming work had been done on that page. However, if the last page
of the index is all-zero, that would result in corruption of said page,
since the functions called on it weren't prepared to handle that case.
There's no need to lock any such pages, so change the logic to target
the last normal leaf page instead.
The first two of these bugs were diagnosed by Andres Freund, the other one
by me. Fixes based on ideas from Heikki Linnakangas and myself.
This has been wrong since Hot Standby was introduced, so back-patch to 9.0.
|
|
|
|
|
|
|
|
|
| |
This code provides infrastructure for user backends to communicate
relatively easily with background workers. The message queue is
structured as a ring buffer and allows messages of arbitary length
to be sent and received.
Patch by me. Review by KaiGai Kohei and Andres Freund.
|
|
|
|
|
|
|
|
|
|
| |
This interface is intended to make it simple to divide a dynamic shared
memory segment into different regions with distinct purposes. It
therefore serves much the same purpose that ShmemIndex accomplishes for
the main shared memory segment, but it is intended to be more
lightweight.
Patch by me. Review by Andres Freund.
|
|
|
|
|
|
|
|
|
| |
Minor improvement to commit daa7527afc2274432094ebe7ceb03aa41f916607:
s_lock.h no longer has any need to mention PGSemaphoreData, so we can
rip out the #include that supplies that. In a non-HAVE_SPINLOCKS
build, this doesn't really buy much since we still need the #include
in spin.h --- but everywhere else, this reduces #include footprint by
some trifle, and helps keep the different locking facilities separate.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of allocating a semaphore from the operating system for every
spinlock, allocate a fixed number of semaphores (by default, 1024)
from the operating system and multiplex all the spinlocks that get
created onto them. This could self-deadlock if a process attempted
to acquire more than one spinlock at a time, but since processes
aren't supposed to execute anything other than short stretches of
straight-line code while holding a spinlock, that shouldn't happen.
One motivation for this change is that, with the introduction of
dynamic shared memory, it may be desirable to create spinlocks that
last for less than the lifetime of the server. Without this change,
attempting to use such facilities under --disable-spinlocks would
quickly exhaust any supply of available semaphores. Quite apart
from that, it's desirable to contain the quantity of semaphores
needed to run the server simply on convenience grounds, since using
too many may make it harder to get PostgreSQL running on a new
platform, which is mostly the point of --disable-spinlocks in the
first place.
Patch by me; review by Tom Lane.
|
|
|
|
|
| |
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of changing the tuple xmin to FrozenTransactionId, the combination
of HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which were previously never
set together, is now defined as HEAP_XMIN_FROZEN. A variety of previous
proposals to freeze tuples opportunistically before vacuum_freeze_min_age
is reached have foundered on the objection that replacing xmin by
FrozenTransactionId might hinder debugging efforts when things in this
area go awry; this patch is intended to solve that problem by keeping
the XID around (but largely ignoring the value to which it is set).
Third-party code that checks for HEAP_XMIN_INVALID on tuples where
HEAP_XMIN_COMMITTED might be set will be broken by this change. To fix,
use the new accessor macros in htup_details.h rather than consulting the
bits directly. HeapTupleHeaderGetXmin has been modified to return
FrozenTransactionId when the infomask bits indicate that the tuple is
frozen; use HeapTupleHeaderGetRawXmin when you already know that the
tuple isn't marked commited or frozen, or want the raw value anyway.
We currently do this in routines that display the xmin for user consumption,
in tqual.c where it's known to be safe and important for the avoidance of
extra cycles, and in the function-caching code for various procedural
languages, which shouldn't invalidate the cache just because the tuple
gets frozen.
Robert Haas and Andres Freund
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just as backends must clean up their shared memory state (releasing
lwlocks, buffer pins, etc.) before exiting, they must also perform
any similar cleanups related to dynamic shared memory segments they
have mapped before unmapping those segments. So add a mechanism to
ensure that.
Existing on_shmem_exit hooks include both "user level" cleanup such
as transaction abort and removal of leftover temporary relations and
also "low level" cleanup that forcibly released leftover shared
memory resources. On-detach callbacks should run after the first
group but before the second group, so create a new before_shmem_exit
function for registering the early callbacks and keep on_shmem_exit
for the regular callbacks. (An earlier draft of this patch added an
additional argument to on_shmem_exit, but that had a much larger
footprint and probably a substantially higher risk of breaking third
party code for no real gain.)
Patch by me, reviewed by KaiGai Kohei and Andres Freund.
|
|
|
|
|
| |
Per "clang -Wmissing-variable-declarations" output, posted by Andres Freund.
I didn't silence all those warnings, though, only the most obvious cases.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents a possible longjmp out of the signal handler if a timeout
or SIGINT occurs while something within the handler has transiently set
ImmediateInterruptOK. For safety we must hold off the timeout or cancel
error until we're back in mainline, or at least till we reach the end of
the signal handler when ImmediateInterruptOK was true at entry. This
syncs these functions with the logic now present in handle_sig_alarm.
AFAICT there is no live bug here in 9.0 and up, because I don't think we
currently can wait for any heavyweight lock inside these functions, and
there is no other code (except read-from-client) that will turn on
ImmediateInterruptOK. However, that was not true pre-9.0: in older
branches ProcessIncomingNotify might block trying to lock pg_listener, and
then a SIGINT could lead to undesirable control flow. It might be all
right anyway given the relatively narrow code ranges in which NOTIFY
interrupts are enabled, but for safety's sake I'm back-patching this.
|
|
|
|
| |
Plus one instance of "to to" in the docs.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
WAL records of hint bit updates is useful to tools that want to examine
which pages have been modified. In particular, this is required to make
the pg_rewind tool safe (without checksums).
This can also be used to test how much extra WAL-logging would occur if
you enabled checksums, without actually enabling them (which you can't
currently do without re-initdb'ing).
Sawada Masahiko, docs by Samrat Revagade. Reviewed by Dilip Kumar, with
further changes by me.
|
|
|
|
| |
Per complaint from Tom Lane.
|
|
|
|
| |
Error noted by Heikki Linnakangas
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The various places that transferred fast-path locks to the main lock table
neglected to release the PGPROC's backendLock if SetupLockInTable failed
due to being out of shared memory. In most cases this is no big deal since
ensuing error cleanup would release all held LWLocks anyway. But there are
some hot-standby functions that don't consider failure of
FastPathTransferRelationLocks to be a hard error, and in those cases this
oversight could lead to system lockup. For consistency, make all of these
places look the same as FastPathTransferRelationLocks.
Noted while looking for the cause of Dan Wood's bugs --- this wasn't it,
but it's a bug anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prevent handle_sig_alarm from losing control partway through due to a query
cancel (either an asynchronous SIGINT, or a cancel triggered by one of the
timeout handler functions). That would at least result in failure to
schedule any required future interrupt, and might result in actual
corruption of timeout.c's data structures, if the interrupt happened while
we were updating those.
We could still lose control if an asynchronous SIGINT arrives just as the
function is entered. This wouldn't break any data structures, but it would
have the same effect as if the SIGALRM interrupt had been silently lost:
we'd not fire any currently-due handlers, nor schedule any new interrupt.
To forestall that scenario, forcibly reschedule any pending timer interrupt
during AbortTransaction and AbortSubTransaction. We can avoid any extra
kernel call in most cases by not doing that until we've allowed
LockErrorCleanup to kill the DEADLOCK_TIMEOUT and LOCK_TIMEOUT events.
Another hazard is that some platforms (at least Linux and *BSD) block a
signal before calling its handler and then unblock it on return. When we
longjmp out of the handler, the unblock doesn't happen, and the signal is
left blocked indefinitely. Again, we can fix that by forcibly unblocking
signals during AbortTransaction and AbortSubTransaction.
These latter two problems do not manifest when the longjmp reaches
postgres.c, because the error recovery code there kills all pending timeout
events anyway, and it uses sigsetjmp(..., 1) so that the appropriate signal
mask is restored. So errors thrown outside any transaction should be OK
already, and cleaning up in AbortTransaction and AbortSubTransaction should
be enough to fix these issues. (We're assuming that any code that catches
a query cancel error and doesn't re-throw it will do at least a
subtransaction abort to clean up; but that was pretty much required already
by other subsystems.)
Lastly, ProcSleep should not clear the LOCK_TIMEOUT indicator flag when
disabling that event: if a lock timeout interrupt happened after the lock
was granted, the ensuing query cancel is still going to happen at the next
CHECK_FOR_INTERRUPTS, and we want to report it as a lock timeout not a user
cancel.
Per reports from Dan Wood.
Back-patch to 9.3 where the new timeout handling infrastructure was
introduced. We may at some point decide to back-patch the signal
unblocking changes further, but I'll desist from that until we hear
actual field complaints about it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have for a long time checked the head pointer of each of the backend's
proclock lists and skipped acquiring the corresponding locktable partition
lock if the head pointer was NULL. This was safe enough in the days when
proclock lists were changed only by the owning backend, but it is pretty
questionable now that the fast-path patch added cases where backends add
entries to other backends' proclock lists. However, we don't really wish
to revert to locking each partition lock every time, because in simple
transactions that would add a lot of useless lock/unlock cycles on
already-heavily-contended LWLocks. Fortunately, the only way that another
backend could be modifying our proclock list at this point would be if it
was promoting a formerly fast-path lock of ours; and any such lock must be
one that we'd decided not to delete in the previous loop over the locallock
table. So it's okay if we miss seeing it in this loop; we'd just decide
not to delete it again. However, once we've detected a non-empty list,
we'd better re-fetch the list head pointer after acquiring the partition
lock. This guards against possibly fetching a corrupt-but-non-null pointer
if pointer fetch/store isn't atomic. It's not clear if any practical
architectures are like that, but we've never assumed that before and don't
wish to start here. In any case, the situation certainly deserves a code
comment.
While at it, refactor the partition traversal loop to use a for() construct
instead of a while() loop with goto's.
Back-patch, just in case the risk is real and not hypothetical.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When acquiring a lock in fast-path mode, we must reset the locallock
object's lock and proclock fields to NULL. They are not necessarily that
way to start with, because the locallock could be left over from a failed
lock acquisition attempt earlier in the transaction. Failure to do this
led to all sorts of interesting misbehaviors when LockRelease tried to
clean up no-longer-related lock and proclock objects in shared memory.
Per report from Dan Wood.
In passing, modify LockRelease to elog not just Assert if it doesn't find
lock and proclock objects for a formerly fast-path lock, matching the code
in FastPathGetRelationLockEntry and LockRefindAndRelease. This isn't a
bug but it will help in diagnosing any future bugs in this area.
Also, modify FastPathTransferRelationLocks and FastPathGetRelationLockEntry
to break out of their loops over the fastpath array once they've found the
sole matching entry. This was inconsistently done in some search loops
and not others.
Improve assorted related comments, too.
Back-patch to 9.2 where the fast-path mechanism was introduced.
|
|
|
|
|
|
|
|
|
|
| |
Correct an obsolete statement that no backend touches another backend's
PROCLOCK lists. This was probably wrong even when written (the deadlock
checker looks at everybody's lists), and it's certainly quite wrong now
that fast-path locking can require creation of lock and proclock objects
on behalf of another backend. Also improve some statements in the hot
standby explanation, and do one or two other trivial bits of wordsmithing/
reformatting.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These bugs can cause data loss on standbys started with hot_standby=on at
the moment they start to accept read only queries, by marking committed
transactions as uncommited. The likelihood of such corruptions is small
unless the primary has a high transaction rate.
5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic
by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state
was reached, missing the fact that both clog and subtrans are written to
before that. This only failed to fail in common cases because the usage
of ExtendCLOG in procarray.c was superflous since clog extensions are
actually WAL logged.
f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing
extensions of pg_subtrans due to the former commit's changes - which are
not WAL logged - by performing the extensions when switching to a state
> STANDBY_INITIALIZED and not performing xid assignments before that -
again missing the fact that ExtendCLOG is unneccessary - but screwed up
twice: Once because latestObservedXid wasn't updated anymore in that
state due to the earlier commit and once by having an off-by-one error in
the loop performing extensions. This means that whenever a
CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed
between the start of the checkpoint recovery started from and the first
xl_running_xact record old transactions commit bits in pg_clog could be
overwritten if they started and committed in that window.
Fix this mess by not performing ExtendCLOG() in HS at all anymore since
it's unneeded and evidently dangerous and by performing subtrans
extensions even before reaching STANDBY_SNAPSHOT_PENDING.
Analysis and patch by Andres Freund. Reported by Christophe Pettus.
Backpatch down to 9.0, like the previous commit that caused this.
|
|
|
|
|
| |
Set per file type attributes in .gitattributes to fine-tune whitespace
checks. With the associated cleanups, the tree is now clean for git
|
|
|
|
|
|
|
| |
This avoids warnings from more-anal-than-average compilers, and might
prevent hidden syntax problems in the future.
Andres Freund
|
|
|
|
| |
Per report from Peter Eisentraut.
|
|
|
|
|
|
|
|
|
|
|
| |
Callers expect that they only have to set the right resource owner when
creating a BufFile, not during subsequent operations on it. While we could
insist this be fixed at the caller level, it seems more sensible for the
BufFile to take care of it. Without this, some temp files belonging to
a BufFile can go away too soon, eg at the end of a subtransaction,
leading to errors or crashes.
Reported and fixed by Andres Freund. Back-patch to all active branches.
|
|
|
|
|
|
|
| |
Apparently, shifts greater than or equal to the width of the type
are undefined, and can surprisingly produce a non-zero value.
Amit Kapila, with a comment by me.
|
|
|
|
| |
This is more consistent with what we do elsewhere.
|
|
|
|
|
|
|
|
|
|
|
|
| |
To wit,
bgworker.c: In function `RegisterDynamicBackgroundWorker':
bgworker.c:761: warning: `generation' might be used uninitialized in this function
dsm_impl.c: In function `dsm_impl_op':
dsm_impl.c:197: warning: control reaches end of non-void function
Neither of these represent actual bugs, but we may as well tweak the code
so that more compilers can tell that. This won't change the generated code
on compilers that do recognize that the cases are unreachable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All of these platforms are very much obsolete.
As far as I can determine, the last version of SINIX, later renamed
Reliant, occurred some time between 2002 and 2005.
The last release of SunOS that would run on a sun3 was released in
November of 1991; the last release of OpenBSD which supported that
platform was in 2001. The highest clock speed of any processor in
the family was 25MHz.
The NS32K (national semiconductor 320xx) architecture was retired
in 1990.
Support can be re-added if a maintainer emerges for any of these
platforms, but it seems unlikely.
Reviewed by Andres Freund.
|
|
|
|
|
|
|
| |
This is the behavior of the other implementations, and the behavior
expected by the callers of this function.
Amit Kapila
|
|
|
|
| |
Per buildfarm.
|
|
|
|
|
| |
Patch by myself and Amit Kapila. Design help from Noah Misch. Review
by Andres Freund.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. In heap_hot_search_buffer(), the PredicateLockTuple() call is passed
wrong offset number. heapTuple->t_self is set to the tid of the first
tuple in the chain that's visited, not the one actually being read.
2. CheckForSerializableConflictIn() uses the tuple's t_ctid field
instead of t_self to check for exiting predicate locks on the tuple. If
the tuple was updated, but the updater rolled back, t_ctid points to the
aborted dead tuple.
Reported by Hannu Krosing. Backpatch to 9.1.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a tuple was frozen while its predicate locks mattered,
read-write dependencies could be missed, resulting in failure to
detect conflicts which could lead to anomalies in committed
serializable transactions.
This field was added to the tag when we still thought that it was
necessary to carry locks forward to a new version of an updated
row. That was later proven to be unnecessary, which allowed
simplification of the code, but elimination of xmin from the tag
was missed at the time.
Per report and analysis by Heikki Linnakangas.
Backpatch to 9.1.
|
|
|
|
|
|
| |
This is in support of a future REINDEX CONCURRENTLY feature.
Michael Paquier
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lo_open registers the currently active snapshot, and checks if the
large object exists after that. Normally, snapshots registered by lo_open
are unregistered at end of transaction when the lo descriptor is closed, but
if we error out before the lo descriptor is added to the list of open
descriptors, it is leaked. Fix by moving the snapshot registration to after
checking if the large object exists.
Reported by Pavel Stehule. Backpatch to 8.4. The snapshot registration
system was introduced in 8.4, so prior versions are not affected (and not
supported, anyway).
|
|
|
|
|
|
| |
This has been unused since commit 8563ccae2caf.
Noted by Antonin Houska
|
|
|
|
| |
Andres Freund
|