author Noah Misch <noah@leadboat.com> 2020-08-15 10:15:53 -0700
committer Noah Misch <noah@leadboat.com> 2020-08-15 10:15:57 -0700
commit d4031d78460cbbb4ed2fb7be635f84bea0e9a0c1
tree d7ee345e4522f52d62b51f33db15d96ba9a8535a
parent 9d472b51e98777102d72f8ccdfb8cef10e087f74
Prevent concurrent SimpleLruTruncate() for any given SLRU.

The SimpleLruTruncate() header comment states the new coding rule. To achieve this, add locktype "frozenid" and two LWLocks. This closes a rare opportunity for data loss, which manifested as "apparent wraparound" or "could not access status of transaction" errors. Data loss is more likely in pg_multixact, due to released branches' thin margin between multiStopLimit and multiWrapLimit. If a user's physical replication primary logged ": apparent wraparound" messages, the user should rebuild standbys of that primary regardless of symptoms. At less risk is a cluster having emitted "not accepting commands" errors or "must be vacuumed" warnings at some point. One can test a cluster for this data loss by running VACUUM FREEZE in every database. Back-patch to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
Diffstat (limited to 'src/backend/access')
-rw-r--r-- src/backend/access/transam/slru.c 8
-rw-r--r-- src/backend/access/transam/subtrans.c 4
2 files changed, 10 insertions, 2 deletions
diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index fad5d363e32..67387979bdb 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -1163,6 +1163,14 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
/*
* Remove all segments before the one holding the passed page number
+ *
+ * All SLRUs prevent concurrent calls to this function, either with an LWLock
+ * or by calling it only as part of a checkpoint. Mutual exclusion must begin
+ * before computing cutoffPage. Mutual exclusion must end after any limit
+ * update that would permit other backends to write fresh data into the
+ * segment immediately preceding the one containing cutoffPage. Otherwise,
+ * when the SLRU is quite full, SimpleLruTruncate() might delete that segment
+ * after it has accrued freshly-written data.
*/
void
SimpleLruTruncate(SlruCtl ctl, int cutoffPage)
diff --git a/src/backend/access/transam/subtrans.c b/src/backend/access/transam/subtrans.c
index 4faa21f5aef..ef63b6d98a4 100644
--- a/src/backend/access/transam/subtrans.c
+++ b/src/backend/access/transam/subtrans.c
@@ -347,8 +347,8 @@ ExtendSUBTRANS(TransactionId newestXact)
/*
* Remove all SUBTRANS segments before the one holding the passed transaction ID
*
- * This is normally called during checkpoint, with oldestXact being the
- * oldest TransactionXmin of any running transaction.
+ * oldestXact is the oldest TransactionXmin of any running transaction. This
+ * is called only during checkpoint.
*/
void
TruncateSUBTRANS(TransactionId oldestXact)
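
The ordering rule in the new SimpleLruTruncate() header comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: ToySlru, toy_slru_init(), toy_slru_truncate(), and the TOY_* constants are hypothetical names, a pthread mutex stands in for the LWLock, and segment numbers grow linearly instead of wrapping around. What it shows is the shape of the rule: take the lock before computing the cutoff, and release it only after the limit update that lets writers move into new space.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
#define TOY_PAGES_PER_SEGMENT 32
#define TOY_MAX_SEGMENTS 8

typedef struct ToySlru
{
    pthread_mutex_t truncate_lock;  /* stands in for the per-SLRU LWLock */
    int oldest_needed_page;         /* oldest page any reader still needs */
    int first_segment;              /* oldest segment still on disk */
    int write_limit_segment;        /* writers must stay below this segment */
} ToySlru;

void
toy_slru_init(ToySlru *s)
{
    pthread_mutex_init(&s->truncate_lock, NULL);
    s->oldest_needed_page = 0;
    s->first_segment = 0;
    s->write_limit_segment = s->first_segment + TOY_MAX_SEGMENTS - 1;
}

/*
 * Remove all segments before the one holding the oldest needed page.
 * Returns the number of segments removed.
 */
int
toy_slru_truncate(ToySlru *s)
{
    int cutoff_page;
    int cutoff_segment;
    int removed = 0;

    /* Mutual exclusion begins BEFORE the cutoff is computed. */
    pthread_mutex_lock(&s->truncate_lock);

    cutoff_page = s->oldest_needed_page;
    cutoff_segment = cutoff_page / TOY_PAGES_PER_SEGMENT;

    /* "Delete" every segment older than the cutoff segment. */
    while (s->first_segment < cutoff_segment)
    {
        s->first_segment++;
        removed++;
    }
    assert(s->first_segment >= cutoff_segment);

    /*
     * Limit update: only now may writers advance into freed space.  Doing
     * this while still holding the lock means no second truncation, working
     * from a stale cutoff, can delete a segment writers have just entered.
     */
    s->write_limit_segment = s->first_segment + TOY_MAX_SEGMENTS - 1;

    /* Mutual exclusion ends AFTER the limit update. */
    pthread_mutex_unlock(&s->truncate_lock);

    return removed;
}
```

With oldest_needed_page set to 70, the cutoff segment is 70 / 32 = 2, so the first call removes segments 0 and 1 and a repeated call removes nothing. If the lock were instead released between the deletion loop and the limit update, a concurrent caller holding an older cutoff could run its deletion after the limit had moved, which is the race the header comment describes.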