author | Peter Geoghegan <pg@bowt.ie> | 2020-11-17 09:45:56 -0800
---|---|---
committer | Peter Geoghegan <pg@bowt.ie> | 2020-11-17 09:45:56 -0800
commit | cf2acaf4dcb5e20204dcec4d698cb4478af533e7 |
tree | 7b84bed97e5e8ea72899ca765fa3f10ff345a26c /src/include/access/nbtree.h |
parent | 7684b6fbed3a0770a0d8fdcbb5cf8b61394de691 |
Deprecate nbtree's BTP_HAS_GARBAGE flag.
Streamline handling of the various strategies that nbtinsert.c uses to
avoid a leaf page split. When it looks like a leaf page is about to
overflow, we now handle LP_DEAD item deletion and deduplication in one
central place. This greatly simplifies _bt_findinsertloc().
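
For illustration, here is a minimal sketch of what the centralized call
site might look like. This is a sketch, not the committed code: the helper
name _bt_delete_or_dedup_one_page, the control flow, and the
_bt_delitems_delete() call are assumptions; only _bt_dedup_pass() is taken
from the prototype change in the diff below.

    /*
     * Sketch only (would live in nbtinsert.c): the one central place that
     * tries the cheaper split-avoidance strategies before a page split.
     */
    static void
    _bt_delete_or_dedup_one_page(Relation rel, Relation heapRel,
                                 Buffer buf, IndexTuple newitem,
                                 Size newitemsz, bool checkingunique)
    {
        Page         page = BufferGetPage(buf);
        BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
        OffsetNumber deletable[MaxIndexTuplesPerPage];
        int          ndeletable = 0;
        OffsetNumber offnum,
                     minoff = P_FIRSTDATAKEY(opaque),
                     maxoff = PageGetMaxOffsetNumber(page);

        /* Collect already-set LP_DEAD bits in passing; no hint consulted */
        for (offnum = minoff; offnum <= maxoff;
             offnum = OffsetNumberNext(offnum))
        {
            if (ItemIdIsDead(PageGetItemId(page, offnum)))
                deletable[ndeletable++] = offnum;
        }

        if (ndeletable > 0)
        {
            /* Deleting known-dead items frees space without moving tuples */
            _bt_delitems_delete(rel, buf, deletable, ndeletable, heapRel);
            return;
        }

        /* No dead items: fall back on a deduplication pass */
        _bt_dedup_pass(rel, buf, heapRel, newitem, newitemsz, checkingunique);
    }

Trying LP_DEAD deletion before deduplication matches the commit's framing:
removing known-dead items is the cheaper strategy, while a deduplication
pass has to rewrite tuples on the page.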
This has an independently useful consequence: nbtree no longer relies on
the BTP_HAS_GARBAGE page-level flag/hint for anything important. We
still set and unset the flag in the same way as before, but it's no
longer treated as a gating condition when deciding whether to check
for already-set LP_DEAD bits. The check happens at the point where the page
looks like it might have to be split anyway, so simply checking the
LP_DEAD bits in passing is practically free. This avoids missing
LP_DEAD bits just because the page-level hint is unset, which is
probably reasonably common (e.g. it happens when VACUUM unsets the
page-level flag without actually removing index tuples whose LP_DEAD
bits were set recently, after the VACUUM operation began but before it reached
the leaf page in question).
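
Concretely, the gating changes roughly like this (a simplified sketch:
collect_dead_items() is a hypothetical stand-in for the line pointer array
scan, while P_HAS_GARBAGE() is the existing nbtree.h macro that tests
BTP_HAS_GARBAGE):

    /* Before: only scan for LP_DEAD bits when the page-level hint is set */
    if (P_HAS_GARBAGE(opaque))
        ndeletable = collect_dead_items(page, deletable);

    /*
     * After: the page looks like it will have to be split anyway, so the
     * scan is practically free -- perform it unconditionally
     */
    ndeletable = collect_dead_items(page, deletable);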
Note that this isn't a big behavioral change compared to PostgreSQL 13.
We were already checking for set LP_DEAD bits regardless of whether the
BTP_HAS_GARBAGE page-level flag was set before we considered doing a
deduplication pass. This commit only goes slightly further by doing the
same check for all indexes, even indexes where deduplication won't be
performed.
We don't completely remove the BTP_HAS_GARBAGE flag. We still rely on
it as a gating condition with pg_upgrade'd indexes from before B-tree
version 4/PostgreSQL 12. That makes sense because we sometimes have to
make a choice among pages full of duplicates when inserting a tuple with
pre-version-4 indexes. It probably still pays to avoid accessing the
line pointer array of a page there, since it won't yet be clear whether
we'll insert onto the page in question at all, let alone split it as a
result.
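
A sketch of the surviving gating condition, assuming it sits on the
!heapkeyspace (pre-version-4) insertion path; the helper name
_bt_vacuum_one_page and the exact test are illustrative, not quoted from
the patch:

    /*
     * pg_upgrade'd version 3 index: we may inspect several duplicate-filled
     * pages before committing to one, so the cheap page-level hint still
     * gates the line pointer array scan.
     */
    if (!itup_key->heapkeyspace && P_HAS_GARBAGE(opaque))
        _bt_vacuum_one_page(rel, buf, heapRel);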
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz%3DYpc1PDdk8OVJDChGJBjT06%3DA0Mbv9HyTLCsOknGcUFg%40mail.gmail.com
Diffstat (limited to 'src/include/access/nbtree.h')
-rw-r--r-- | src/include/access/nbtree.h | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index 65d9698b899..e8fecc6026f 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -75,7 +75,7 @@ typedef BTPageOpaqueData *BTPageOpaque;
 #define BTP_META (1 << 3) /* meta-page */
 #define BTP_HALF_DEAD (1 << 4) /* empty, but still in tree */
 #define BTP_SPLIT_END (1 << 5) /* rightmost page of split group */
-#define BTP_HAS_GARBAGE (1 << 6) /* page has LP_DEAD tuples */
+#define BTP_HAS_GARBAGE (1 << 6) /* page has LP_DEAD tuples (deprecated) */
 #define BTP_INCOMPLETE_SPLIT (1 << 7) /* right sibling's downlink is missing */
 
 /*
@@ -1027,9 +1027,9 @@ extern void _bt_parallel_advance_array_keys(IndexScanDesc scan);
 /*
  * prototypes for functions in nbtdedup.c
  */
-extern void _bt_dedup_one_page(Relation rel, Buffer buf, Relation heapRel,
-                               IndexTuple newitem, Size newitemsz,
-                               bool checkingunique);
+extern void _bt_dedup_pass(Relation rel, Buffer buf, Relation heapRel,
+                           IndexTuple newitem, Size newitemsz,
+                           bool checkingunique);
 extern void _bt_dedup_start_pending(BTDedupState state, IndexTuple base,
                                     OffsetNumber baseoff);
 extern bool _bt_dedup_save_htid(BTDedupState state, IndexTuple itup);
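
Given the renamed prototype, a caller might now invoke the deduplication
pass along these lines (a hedged example: the guard condition and the
insertstate/itup_key variables are assumptions based on the surrounding
nbtinsert.c code, not part of this diff):

    /* Only attempt a deduplication pass when the index supports it */
    if (BTGetDeduplicateItems(rel) && itup_key->allequalimage)
        _bt_dedup_pass(rel, insertstate->buf, heapRel, insertstate->itup,
                       insertstate->itemsz, checkingunique);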