author | Kevin Grittner <kgrittn@postgresql.org> | 2012-09-04 21:13:11 -0500 |
---|---|---|
committer | Kevin Grittner <kgrittn@postgresql.org> | 2012-09-04 21:13:11 -0500 |
commit | cdf91edba9f92306b49afcc6bfa6be0e56eb0688 | |
tree | 980f2216318ed14c394a114a3fe2ad12ff277222 /src/backend/executor/nodeIndexonlyscan.c | |
parent | c63f309cca07c0570494a8f36092633635990db8 | |
Fix serializable mode with index-only scans.

Serializable Snapshot Isolation used for serializable transactions
depends on acquiring SIRead locks on all heap relation tuples which
are used to generate the query result, so that a later delete or
update of any of the tuples can flag a read-write conflict between
transactions. This is normally handled in heapam.c, with tuple level
locking. Since an index-only scan avoids heap access in many cases,
building the result from the index tuple, the necessary predicate
locks were not being acquired for all tuples in an index-only scan.

To prevent problems with tuple IDs which are vacuumed and re-used
while the transaction still matters, the xmin of the tuple is part of
the tag for the tuple lock. Since xmin is not available to the
index-only scan for result rows generated from the index tuples, it
is not possible to acquire a tuple-level predicate lock in such
cases, in spite of having the tid. If we went to the heap to get the
xmin value, it would no longer be an index-only scan. Rather than
prohibit index-only scans under serializable transaction isolation,
we acquire an SIRead lock on the page containing the tuple, when it
was not necessary to visit the heap for other reasons.

Backpatch to 9.2.

Kevin Grittner and Tom Lane
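
The tagging argument in the commit message can be illustrated with a small standalone model. This is only a sketch: `LockTag`, `tuple_tag`, `page_tag`, `lock_covers`, and `tag_for_result_row` are hypothetical names invented for illustration and are not PostgreSQL's actual predicate-lock structures or functions. It assumes, as the message states, that a tuple-level tag carries the tuple's xmin, so a tag cannot be built when only the TID is known.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT PostgreSQL's real definitions. */
typedef uint32_t BlockNum;
typedef uint16_t OffsetNum;
typedef uint32_t Xid;

typedef enum { TAG_PAGE, TAG_TUPLE } TagKind;

typedef struct {
    TagKind   kind;
    BlockNum  block;
    OffsetNum offset;  /* meaningful only for TAG_TUPLE */
    Xid       xmin;    /* meaningful only for TAG_TUPLE */
} LockTag;

/* A tuple-level tag needs the xmin, which only the heap tuple provides. */
LockTag tuple_tag(BlockNum block, OffsetNum offset, Xid xmin)
{
    LockTag t = { TAG_TUPLE, block, offset, xmin };
    return t;
}

/* A page-level tag needs only the block number, which the TID provides. */
LockTag page_tag(BlockNum block)
{
    LockTag t = { TAG_PAGE, block, 0, 0 };
    return t;
}

/* Would a held lock flag a write to the tuple (block, offset, xmin)? */
bool lock_covers(LockTag held, BlockNum block, OffsetNum offset, Xid xmin)
{
    if (held.block != block)
        return false;
    if (held.kind == TAG_PAGE)
        return true;            /* page lock covers every tuple on the page */
    /* Same TID but a different xmin means the slot was vacuumed and reused. */
    return held.offset == offset && held.xmin == xmin;
}

/* Pick the finest tag that can be built from what the scan knows. */
LockTag tag_for_result_row(BlockNum block, OffsetNum offset,
                           bool heap_visited, Xid xmin_from_heap)
{
    return heap_visited ? tuple_tag(block, offset, xmin_from_heap)
                        : page_tag(block);
}
```

In this model a tuple tag for TID (7, 3) with xmin 100 does not match a later tuple stored at the same TID with xmin 250, which is the reuse problem the xmin guards against. A page tag for block 7 matches both, so the fallback is safe but coarser: it can report a conflict for a row the reader never returned.
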
Diffstat (limited to 'src/backend/executor/nodeIndexonlyscan.c')
-rw-r--r-- | src/backend/executor/nodeIndexonlyscan.c | 16 |
1 files changed, 15 insertions, 1 deletions
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 38078763f57..e72ebc8c3a8 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -30,6 +30,7 @@
 #include "executor/nodeIndexonlyscan.h"
 #include "executor/nodeIndexscan.h"
 #include "storage/bufmgr.h"
+#include "storage/predicate.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
@@ -52,7 +53,6 @@ IndexOnlyNext(IndexOnlyScanState *node)
 	ExprContext *econtext;
 	ScanDirection direction;
 	IndexScanDesc scandesc;
-	HeapTuple	tuple;
 	TupleTableSlot *slot;
 	ItemPointer tid;
 
@@ -78,6 +78,8 @@ IndexOnlyNext(IndexOnlyScanState *node)
 	 */
 	while ((tid = index_getnext_tid(scandesc, direction)) != NULL)
 	{
+		HeapTuple	tuple = NULL;
+
 		/*
 		 * We can skip the heap fetch if the TID references a heap page on
 		 * which all tuples are known visible to everybody.  In any case,
@@ -147,6 +149,18 @@ IndexOnlyNext(IndexOnlyScanState *node)
 			}
 		}
 
+		/*
+		 * Predicate locks for index-only scans must be acquired at the page
+		 * level when the heap is not accessed, since tuple-level predicate
+		 * locks need the tuple's xmin value.  If we had to visit the tuple
+		 * anyway, then we already have the tuple-level lock and can skip the
+		 * page lock.
+		 */
+		if (tuple == NULL)
+			PredicateLockPage(scandesc->heapRelation,
+							  ItemPointerGetBlockNumber(tid),
+							  estate->es_snapshot);
+
 		return slot;
 	}
 
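
The control flow that this patch adds to `IndexOnlyNext` can be modeled outside the executor. The following is a rough sketch under stated assumptions, not the executor code: `IndexEntry`, `ScanStats`, and `scan_index_only` are hypothetical names, the visibility map is reduced to a plain array of booleans, and the function walks all index entries in one call, whereas the real function returns one slot per call. The local flag `have_heap_tuple` plays the role of `tuple != NULL` in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one TID returned by the index scan. */
typedef struct {
    unsigned block;
    unsigned offset;
    bool     heap_tuple_visible; /* outcome of a heap fetch, if one happens */
} IndexEntry;

/* Counters describing what the modeled scan did. */
typedef struct {
    size_t rows_returned;
    size_t heap_fetches;
    size_t tuple_locks;      /* taken as a side effect of the heap fetch */
    size_t page_lock_calls;  /* models calls to PredicateLockPage */
} ScanStats;

ScanStats scan_index_only(const IndexEntry *entries, size_t n,
                          const bool *page_all_visible, size_t n_pages)
{
    ScanStats s = { 0, 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        /* Declared inside the loop so it is reset for every TID. */
        bool have_heap_tuple = false;

        if (entries[i].block >= n_pages)
            continue;           /* defensive guard for this toy model only */

        if (!page_all_visible[entries[i].block]) {
            /* Page is not known all-visible: the heap must be consulted. */
            s.heap_fetches++;
            if (!entries[i].heap_tuple_visible)
                continue;       /* no visible tuple, try the next TID */
            have_heap_tuple = true;
            s.tuple_locks++;    /* the heap path already took a tuple lock */
        }

        /* Heap was skipped: no xmin is known, so lock the whole page. */
        if (!have_heap_tuple)
            s.page_lock_calls++;

        s.rows_returned++;
    }
    return s;
}
```

The counter tracks calls rather than distinct locks, so two rows on the same all-visible page count twice here. Resetting the flag per iteration mirrors the patch moving the `tuple` declaration into the loop body with a `NULL` initializer, which keeps a tuple fetched for an earlier TID from suppressing the page lock for a later one.
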