author Alvaro Herrera <alvherre@alvh.no-ip.org> 2016-11-17 13:31:30 -0300
committer Alvaro Herrera <alvherre@alvh.no-ip.org> 2016-11-17 13:31:30 -0300
commit f5d89443203480e39a6a15e64f1950c3b4d3c9a2
tree 63b27aedeb6b7e277a1fa9e704ba445f628244d6
parent a69e6d9a6cd9fef893f4cf5b29a9ccf1b015c317
Avoid pin scan for replay of XLOG_BTREE_VACUUM in all cases

Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds or in some cases minutes while the XLOG_BTREE_VACUUM was replayed.

This commit skips the “pin scan” that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here.

No tests included. Manual tests using an additional patch to view WAL records and their timing have shown the change in WAL records and their handling has successfully reduced replication delay.

This is a back-patch of commits 687f2cd7a015, 3e4b7d87988f, b60284261375 by Simon Riggs, to branches 9.4 and 9.5. No further backpatch is possible because this depends on catalog scans being MVCC. I (Álvaro) additionally updated a slight problem in the README, which explains why this touches the 9.6 and master branches.
-rw-r--r-- src/backend/access/nbtree/README 5
1 files changed, 3 insertions, 2 deletions
diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 067d15c8039..a3f11da8d5a 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -521,11 +521,12 @@ because it allows running applications to continue while the standby
changes state into a normally running server.
The interlocking required to avoid returning incorrect results from
-MVCC scans is not required on standby nodes. That is because
+non-MVCC scans is not required on standby nodes. That is because
HeapTupleSatisfiesUpdate(), HeapTupleSatisfiesSelf(),
HeapTupleSatisfiesDirty() and HeapTupleSatisfiesVacuum() are only
ever used during write transactions, which cannot exist on the standby.
-This leaves HeapTupleSatisfiesMVCC() and HeapTupleSatisfiesToast().
+MVCC scans are already protected by definition, so HeapTupleSatisfiesMVCC()
+is not a problem. That leaves concern only for HeapTupleSatisfiesToast().
HeapTupleSatisfiesToast() doesn't use MVCC semantics, though that's
because it doesn't need to - if the main heap row is visible then the
toast rows will also be visible. So as long as we follow a toast