diff options
author | Simon Riggs <simon@2ndQuadrant.com> | 2009-12-19 01:32:45 +0000 |
---|---|---|
committer | Simon Riggs <simon@2ndQuadrant.com> | 2009-12-19 01:32:45 +0000 |
commit | efc16ea520679d713d98a2c7bf1453c4ff7b91ec (patch) | |
tree | 6a39d2af0704a36281dc7df3ec10823eb3e6de75 /src/backend/access/heap/pruneheap.c | |
parent | 78a09145e0f8322e625bbc7d69fcb865ce4f3034 (diff) | |
download | postgresql-efc16ea520679d713d98a2c7bf1453c4ff7b91ec.tar.gz postgresql-efc16ea520679d713d98a2c7bf1453c4ff7b91ec.zip |
Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
Diffstat (limited to 'src/backend/access/heap/pruneheap.c')
-rw-r--r-- | src/backend/access/heap/pruneheap.c | 22 |
1 files changed, 19 insertions, 3 deletions
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c index 71ea689d0e6..1ea0899acc8 100644 --- a/src/backend/access/heap/pruneheap.c +++ b/src/backend/access/heap/pruneheap.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/access/heap/pruneheap.c,v 1.18 2009/06/11 14:48:53 momjian Exp $ + * $PostgreSQL: pgsql/src/backend/access/heap/pruneheap.c,v 1.19 2009/12/19 01:32:32 sriggs Exp $ * *------------------------------------------------------------------------- */ @@ -30,7 +30,8 @@ typedef struct { TransactionId new_prune_xid; /* new prune hint value for page */ - int nredirected; /* numbers of entries in arrays below */ + TransactionId latestRemovedXid; /* latest xid to be removed by this prune */ + int nredirected; /* numbers of entries in arrays below */ int ndead; int nunused; /* arrays that accumulate indexes of items to be changed */ @@ -85,6 +86,14 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin) return; /* + * We can't write WAL in recovery mode, so there's no point trying to + * clean the page. The master will likely issue a cleaning WAL record + * soon anyway, so this is no particular loss. + */ + if (RecoveryInProgress()) + return; + + /* * We prune when a previous UPDATE failed to find enough space on the page * for a new tuple version, or when free space falls below the relation's * fill-factor target (but not less than 10%). @@ -176,6 +185,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin, * of our working state. */ prstate.new_prune_xid = InvalidTransactionId; + prstate.latestRemovedXid = InvalidTransactionId; prstate.nredirected = prstate.ndead = prstate.nunused = 0; memset(prstate.marked, 0, sizeof(prstate.marked)); @@ -257,7 +267,7 @@ heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin, prstate.redirected, prstate.nredirected, prstate.nowdead, prstate.ndead, prstate.nowunused, prstate.nunused, - redirect_move); + prstate.latestRemovedXid, redirect_move); PageSetLSN(BufferGetPage(buffer), recptr); PageSetTLI(BufferGetPage(buffer), ThisTimeLineID); @@ -395,6 +405,8 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum, == HEAPTUPLE_DEAD && !HeapTupleHeaderIsHotUpdated(htup)) { heap_prune_record_unused(prstate, rootoffnum); + HeapTupleHeaderAdvanceLatestRemovedXid(htup, + &prstate->latestRemovedXid); ndeleted++; } @@ -520,7 +532,11 @@ heap_prune_chain(Relation relation, Buffer buffer, OffsetNumber rootoffnum, * find another DEAD tuple is a fairly unusual corner case.) */ if (tupdead) + { latestdead = offnum; + HeapTupleHeaderAdvanceLatestRemovedXid(htup, + &prstate->latestRemovedXid); + } else if (!recent_dead) break; |