diff options
author | Noah Misch <noah@leadboat.com> | 2025-01-25 11:28:14 -0800 |
---|---|---|
committer | Noah Misch <noah@leadboat.com> | 2025-01-25 11:28:19 -0800 |
commit | 25e99483c4116313b678cc94df53d3ea28977752 (patch) | |
tree | faad3e59f6d02b27387492024628519a4c687466 | |
parent | 7b3259bd7cf37da472ff5c2ce25fe0aa552e6fe0 (diff) | |
download | postgresql-25e99483c4116313b678cc94df53d3ea28977752.tar.gz postgresql-25e99483c4116313b678cc94df53d3ea28977752.zip |
At update of non-LP_NORMAL TID, fail instead of corrupting page header.
The right mix of DDL and VACUUM could corrupt a catalog page header such
that PageIsVerified() durably fails, requiring a restore from backup.
This affects only catalogs that both have a syscache and have DDL code
that uses syscache tuples to construct updates. One of the test
permutations shows a variant not yet fixed.
This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with
TM_Deleted. I think core and PGXN are indifferent to that.
Per bug #17821 from Alexander Lakhin. Back-patch to v13 (all supported
versions). The test case is v17+, since it uses INJECTION_POINT.
Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
-rw-r--r-- | src/backend/access/heap/heapam.c | 45 | ||||
-rw-r--r-- | src/include/access/tableam.h | 3 |
2 files changed, 46 insertions, 2 deletions
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 3a467705196..1233aaf5cc4 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -72,6 +72,7 @@ #include "utils/relcache.h" #include "utils/snapmgr.h" #include "utils/spccache.h" +#include "utils/syscache.h" static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup, @@ -3322,7 +3323,49 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid)); - Assert(ItemIdIsNormal(lp)); + + /* + * Usually, a buffer pin and/or snapshot blocks pruning of otid, ensuring + * we see LP_NORMAL here. When the otid origin is a syscache, we may have + * neither a pin nor a snapshot. Hence, we may see other LP_ states, each + * of which indicates concurrent pruning. + * + * Failing with TM_Updated would be most accurate. However, unlike other + * TM_Updated scenarios, we don't know the successor ctid in LP_UNUSED and + * LP_DEAD cases. While the distinction between TM_Updated and TM_Deleted + * does matter to SQL statements UPDATE and MERGE, those SQL statements + * hold a snapshot that ensures LP_NORMAL. Hence, the choice between + * TM_Updated and TM_Deleted affects only the wording of error messages. + * Settle on TM_Deleted, for two reasons. First, it avoids complicating + * the specification of when tmfd->ctid is valid. Second, it creates + * error log evidence that we took this branch. + * + * Since it's possible to see LP_UNUSED at otid, it's also possible to see + * LP_NORMAL for a tuple that replaced LP_UNUSED. If it's a tuple for an + * unrelated row, we'll fail with "duplicate key value violates unique". + * XXX if otid is the live, newer version of the newtup row, we'll discard + * changes originating in versions of this catalog row after the version + * the caller got from syscache. See syscache-update-pruned.spec. + */ + if (!ItemIdIsNormal(lp)) + { + Assert(RelationSupportsSysCache(RelationGetRelid(relation))); + + UnlockReleaseBuffer(buffer); + Assert(!have_tuple_lock); + if (vmbuffer != InvalidBuffer) + ReleaseBuffer(vmbuffer); + tmfd->ctid = *otid; + tmfd->xmax = InvalidTransactionId; + tmfd->cmax = InvalidCommandId; + + bms_free(hot_attrs); + bms_free(key_attrs); + bms_free(id_attrs); + /* modified_attrs not yet initialized */ + bms_free(interesting_attrs); + return TM_Deleted; + } /* * Fill in enough data in oldtup for HeapDetermineColumnsInfo to work diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h index 7465f037860..f1e6f554aa7 100644 --- a/src/include/access/tableam.h +++ b/src/include/access/tableam.h @@ -113,7 +113,8 @@ typedef enum TM_Result * * xmax is the outdating transaction's XID. If the caller wants to visit the * replacement tuple, it must check that this matches before believing the - * replacement is really a match. + * replacement is really a match. This is InvalidTransactionId if the target + * was !LP_NORMAL (expected only for a TID retrieved from syscache). * * cmax is the outdating command's CID, but only when the failure code is * TM_SelfModified (i.e., something in the current transaction outdated the |