author      Alvaro Herrera <alvherre@alvh.no-ip.org>    2024-08-12 18:17:56 -0400
committer   Alvaro Herrera <alvherre@alvh.no-ip.org>    2024-08-12 18:17:56 -0400
commit      305db95434c9f68811f2489ec57d5a0435608c46 (patch)
tree        adda2128280eef6240a7981780968a1d630608a4 /src
parent      16e67bc5f98f2f5691fb5c01d5a8310de1c62b81 (diff)
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by
DROP, a concurrent transaction recomputing the partition descriptor
while running a prepared statement could crash, because it dereferences
the tuple pointer returned by a catalog scan that found no tuple.
The existing retry logic added in commit dbca3469ebf8 is sufficient to
cope with the overall problem, provided we don't try to dereference a
non-existent heap tuple.
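
As a rough illustration (not the committed code; the actual change appears
in the diff below), the guarded lookup boils down to a standalone helper
like the following. The name get_relpartbound_if_exists() is hypothetical;
the catalog APIs it calls (systable_beginscan, systable_getnext,
HeapTupleIsValid, heap_getattr) are the ones the patched code uses.

/*
 * Hypothetical helper, for illustration only: fetch a relation's
 * relpartbound by scanning pg_class directly, returning NULL either if
 * the pg_class tuple is gone (the relation was dropped) or if
 * relpartbound is null (DETACH CONCURRENTLY in progress).  The point of
 * the fix is the HeapTupleIsValid() check before dereferencing the
 * tuple returned by the scan.
 */
#include "postgres.h"

#include "access/genam.h"
#include "access/htup_details.h"
#include "access/table.h"
#include "catalog/indexing.h"
#include "catalog/pg_class.h"
#include "nodes/parsenodes.h"
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/rel.h"

static PartitionBoundSpec *
get_relpartbound_if_exists(Oid relid)
{
    Relation    pg_class;
    SysScanDesc scan;
    ScanKeyData key[1];
    HeapTuple   tuple;
    PartitionBoundSpec *boundspec = NULL;

    pg_class = table_open(RelationRelationId, AccessShareLock);
    ScanKeyInit(&key[0],
                Anum_pg_class_oid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(relid));
    scan = systable_beginscan(pg_class, ClassOidIndexId, true,
                              NULL, 1, key);

    /* Zero tuples are possible if the relation was dropped concurrently. */
    tuple = systable_getnext(scan);
    if (HeapTupleIsValid(tuple))
    {
        Datum   datum;
        bool    isnull;

        datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
                             RelationGetDescr(pg_class), &isnull);
        if (!isnull)
            boundspec = stringToNode(TextDatumGetCString(datum));
    }

    systable_endscan(scan);
    table_close(pg_class, AccessShareLock);

    return boundspec;
}

The only behavioral change relative to the pre-fix code is that a missing
pg_class tuple now yields NULL instead of a dereference of an invalid
tuple pointer.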
Arguably, the code in RelationBuildPartitionDesc() has been wrong all
along, since commit 898e5e3290a7 added no check against receiving a
NULL tuple from the catalog scan; that bug has only become user-visible
with DETACH CONCURRENTLY, which was added in branch 14.
Therefore, even though there's no known mechanism to cause a crash
because of this, backpatch the addition of such a check to all supported
branches. In branches prior to 14, this would cause the code to fail
with a "missing relpartbound for relation XYZ" error instead of
crashing; that's okay, because there are no reports of such behavior
anyway.
Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
Diffstat (limited to 'src')
-rw-r--r--   src/backend/partitioning/partdesc.c   30
1 file changed, 22 insertions, 8 deletions
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index 0b805d193b6..91d1e83ea2b 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -210,6 +210,10 @@ retry:
 	 * shared queue. We solve this problem by reading pg_class directly
 	 * for the desired tuple.
 	 *
+	 * If the partition recently detached is also dropped, we get no tuple
+	 * from the scan. In that case, we also retry, and next time through
+	 * here, we don't see that partition anymore.
+	 *
 	 * The other problem is that DETACH CONCURRENTLY is in the process of
 	 * removing a partition, which happens in two steps: first it marks it
 	 * as "detach pending", commits, then unsets relpartbound. If
@@ -224,8 +228,6 @@ retry:
 			Relation	pg_class;
 			SysScanDesc scan;
 			ScanKeyData key[1];
-			Datum		datum;
-			bool		isnull;
 
 			pg_class = table_open(RelationRelationId, AccessShareLock);
 			ScanKeyInit(&key[0],
@@ -234,17 +236,29 @@ retry:
 						ObjectIdGetDatum(inhrelid));
 			scan = systable_beginscan(pg_class, ClassOidIndexId, true,
 									  NULL, 1, key);
+
+			/*
+			 * We could get one tuple from the scan (the normal case), or zero
+			 * tuples if the table has been dropped meanwhile.
+			 */
 			tuple = systable_getnext(scan);
-			datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
-								 RelationGetDescr(pg_class), &isnull);
-			if (!isnull)
-				boundspec = stringToNode(TextDatumGetCString(datum));
+			if (HeapTupleIsValid(tuple))
+			{
+				Datum		datum;
+				bool		isnull;
+
+				datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
+									 RelationGetDescr(pg_class), &isnull);
+				if (!isnull)
+					boundspec = stringToNode(TextDatumGetCString(datum));
+			}
 			systable_endscan(scan);
 			table_close(pg_class, AccessShareLock);
 
 			/*
-			 * If we still don't get a relpartbound value, then it must be
-			 * because of DETACH CONCURRENTLY. Restart from the top, as
+			 * If we still don't get a relpartbound value (either because
+			 * boundspec is null or because there was no tuple), then it must
+			 * be because of DETACH CONCURRENTLY. Restart from the top, as
 			 * explained above. We only do this once, for two reasons: first,
 			 * only one DETACH CONCURRENTLY session could affect us at a time,
 			 * since each of them would have to wait for the snapshot under
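
For context, this lookup sits inside a retry loop in
RelationBuildPartitionDesc(). A simplified, hypothetical sketch of that
control flow (list_partition_oids() is a placeholder for the pg_inherits
lookup, and the real function does much more bookkeeping) shows the
retry-once shape the truncated comment above describes:

#include "postgres.h"

#include "nodes/parsenodes.h"
#include "nodes/pg_list.h"
#include "utils/rel.h"

/* hypothetical placeholder for the pg_inherits lookup */
extern List *list_partition_oids(Relation rel);

/* the hypothetical guarded lookup sketched earlier */
extern PartitionBoundSpec *get_relpartbound_if_exists(Oid relid);

static void
build_partdesc_sketch(Relation rel)
{
    bool        retried = false;
    List       *inhoids;
    ListCell   *lc;

retry:
    inhoids = list_partition_oids(rel);

    foreach(lc, inhoids)
    {
        Oid         inhrelid = lfirst_oid(lc);
        PartitionBoundSpec *boundspec;

        boundspec = get_relpartbound_if_exists(inhrelid);

        if (boundspec == NULL)
        {
            /*
             * DETACH CONCURRENTLY (possibly followed by DROP) got in the
             * way.  Retry exactly once: only one such session can affect
             * us at a time, and a second pass no longer sees the vanished
             * partition.
             */
            Assert(!retried);
            retried = true;
            goto retry;
        }

        /* ... accumulate boundspec into the partition descriptor ... */
    }
}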