author | Melanie Plageman <melanieplageman@gmail.com> | 2025-02-24 16:07:55 -0500 |
---|---|---|
committer | Melanie Plageman <melanieplageman@gmail.com> | 2025-02-24 16:10:19 -0500 |
commit | bfe56cdf9a4e07edca46254a88efd9ef17421cd7 (patch) | |
tree | 8873b768c437042a67730df2da4cd3138e0ece00 /src/backend/access/heap/heapam_handler.c | |
parent | b8778c4cd8bc924ce5347cb1ab10dfbf34130559 (diff) | |
Delay extraction of TIDBitmap per page offsets
Pages from the bitmap created by the TIDBitmap API can be exact or
lossy. The TIDBitmap API extracts the tuple offsets from exact pages
into an array for the convenience of the caller.
This was done in tbm_private|shared_iterate() right after advancing the
iterator. However, as long as tbm_private|shared_iterate() set a
reference to the PagetableEntry in the TBMIterateResult, the offset
extraction can be done later.
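
The deferred extraction described above can be pictured with a small, self-contained sketch. This is not the PostgreSQL implementation: `PageEntry`, `IterResult`, `extract_page_offsets()` and the constants are hypothetical stand-ins for PagetableEntry, TBMIterateResult and the extraction routine, with sizes chosen only for illustration. The iterator would hand back just a pointer to the page's bit words, and the caller converts set bits into 1-based offsets when it needs them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen for illustration only. */
#define WORDS_PER_PAGE 4
#define BITS_PER_WORD 64
#define MAX_OFFSETS_PER_PAGE (WORDS_PER_PAGE * BITS_PER_WORD)

/* Stand-in for a page table entry: one bit per tuple offset on the page. */
typedef struct PageEntry
{
	uint32_t	blockno;
	uint64_t	words[WORDS_PER_PAGE];
} PageEntry;

/* Stand-in for the iterate result: no offsets array, only a reference. */
typedef struct IterResult
{
	uint32_t	blockno;
	bool		lossy;
	const PageEntry *page;
} IterResult;

/*
 * Convert the set bits of an exact page into 1-based offsets.
 * Returns the number of set bits; at most max_offsets entries are written.
 */
static int
extract_page_offsets(const IterResult *res, uint16_t *offsets, int max_offsets)
{
	int			ntuples = 0;

	/* Only exact pages carry per-tuple bits. */
	assert(!res->lossy && res->page != NULL);

	for (int wordnum = 0; wordnum < WORDS_PER_PAGE; wordnum++)
	{
		uint64_t	w = res->page->words[wordnum];
		int			off = wordnum * BITS_PER_WORD + 1;

		for (; w != 0; w >>= 1, off++)
		{
			if (w & 1)
			{
				if (ntuples < max_offsets)
					offsets[ntuples] = (uint16_t) off;
				ntuples++;
			}
		}
	}

	return ntuples;
}
```

A caller would declare a local `uint16_t offsets[MAX_OFFSETS_PER_PAGE]` array and call `extract_page_offsets()` only for results whose `lossy` flag is false, which is the same shape as the `noffsets` handling in the diff.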
Waiting to extract the tuple offsets has a few benefits. For the shared
iterator case, it allows us to extract the offsets after dropping the
shared iterator state lock, reducing time spent holding a contended
lock.
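
A rough sketch of the locking pattern in the shared case follows. It assumes a POSIX mutex in place of PostgreSQL's own lock primitive, reduces each page to a single bitmap word, and assumes the bitmap is read-only once iteration has started; `SharedIter`, `shared_iterate()` and `count_offsets()` are hypothetical names. Only the cursor advance happens under the lock, and the bit scan runs after it is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 3				/* hypothetical number of pages in the bitmap */

/* Hypothetical shared iterator: a cursor protected by a mutex. */
typedef struct SharedIter
{
	pthread_mutex_t lock;
	int			next;			/* index of the next page to hand out */
	const uint64_t *pages;		/* read-only bitmap words, one per page */
} SharedIter;

/*
 * Claim the next page while holding the lock. Only the cursor is touched
 * here; the returned pointer is inspected later, without the lock.
 */
static const uint64_t *
shared_iterate(SharedIter *it)
{
	const uint64_t *claimed = NULL;

	pthread_mutex_lock(&it->lock);
	if (it->next < NPAGES)
		claimed = &it->pages[it->next++];
	pthread_mutex_unlock(&it->lock);

	return claimed;
}

/* The comparatively expensive bit scan, done with no lock held. */
static int
count_offsets(const uint64_t *word)
{
	int			n = 0;

	assert(word != NULL);
	for (uint64_t w = *word; w != 0; w >>= 1)
		n += (int) (w & 1);

	return n;
}
```

Each worker would loop on `shared_iterate()` until it returns NULL and call `count_offsets()` (or a full extraction) on every claimed page, so the time under the lock does not grow with the number of tuples on a page.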
Separating the iteration step and extracting the offsets later also
allows us to avoid extracting the offsets for prefetched blocks. Those
offsets were never used, so the overhead of extracting and storing them
was wasted.
The real motivation for this change, however, is that future commits
will make bitmap heap scan use the read stream API. This requires a
TBMIterateResult per issued block. By removing the array of tuple
offsets from the TBMIterateResult and only extracting the offsets when
they are used, we reduce the memory required for per buffer data
substantially.
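
To give a feel for the memory argument, the sketch below compares two hypothetical result layouts. The field names and the limit of 291 offsets per page are assumptions made for illustration (291 is what MaxHeapTuplesPerPage works out to with the default 8 kB block size); neither struct is claimed to match the real TBMIterateResult before or after this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-page tuple limit, for illustration only. */
#define MAX_TUPLES_PER_PAGE 291

/* Hypothetical "before" layout: every result embeds a full offsets array. */
typedef struct FatResult
{
	uint32_t	blockno;
	int			ntuples;
	bool		lossy;
	bool		recheck;
	uint16_t	offsets[MAX_TUPLES_PER_PAGE];
} FatResult;

/* Hypothetical "after" layout: only a reference to the page entry. */
typedef struct SlimResult
{
	uint32_t	blockno;
	bool		lossy;
	bool		recheck;
	void	   *internal_page;
} SlimResult;

/* Memory needed when one result is kept per in-flight buffer. */
static size_t
per_buffer_bytes(size_t result_size, size_t nbuffers)
{
	assert(nbuffers > 0);
	return result_size * nbuffers;
}
```

With one result stored per issued block, `per_buffer_bytes(sizeof(FatResult), n)` grows by several hundred bytes per buffer, while the slim layout needs only a couple of machine words each.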
Suggested-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
Diffstat (limited to 'src/backend/access/heap/heapam_handler.c')
-rw-r--r-- | src/backend/access/heap/heapam_handler.c | 17 |
1 files changed, 14 insertions, 3 deletions
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 269d581c2ec..e78682c3cef 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2127,6 +2127,8 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 	Snapshot	snapshot;
 	int			ntup;
 	TBMIterateResult *tbmres;
+	OffsetNumber offsets[TBM_MAX_TUPLES_PER_PAGE];
+	int			noffsets = -1;
 
 	Assert(scan->rs_flags & SO_TYPE_BITMAPSCAN);
 
@@ -2145,6 +2147,11 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 		if (tbmres == NULL)
 			return false;
 
+		/* Exact pages need their tuple offsets extracted. */
+		if (!tbmres->lossy)
+			noffsets = tbm_extract_page_tuple(tbmres, offsets,
+											  TBM_MAX_TUPLES_PER_PAGE);
+
 		/*
 		 * Ignore any claimed entries past what we think is the end of the
 		 * relation. It may have been extended after the start of our scan (we
@@ -2172,8 +2179,9 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 		/* can't be lossy in the skip_fetch case */
 		Assert(!tbmres->lossy);
 		Assert(bscan->rs_empty_tuples_pending >= 0);
+		Assert(noffsets > -1);
 
-		bscan->rs_empty_tuples_pending += tbmres->ntuples;
+		bscan->rs_empty_tuples_pending += noffsets;
 
 		return true;
 	}
@@ -2216,9 +2224,12 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 		 */
 		int			curslot;
 
-		for (curslot = 0; curslot < tbmres->ntuples; curslot++)
+		/* We must have extracted the tuple offsets by now */
+		Assert(noffsets > -1);
+
+		for (curslot = 0; curslot < noffsets; curslot++)
 		{
-			OffsetNumber offnum = tbmres->offsets[curslot];
+			OffsetNumber offnum = offsets[curslot];
 			ItemPointerData tid;
 			HeapTupleData heapTuple;