aboutsummaryrefslogtreecommitdiff
path: root/src
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2011-08-20 14:51:02 -0400
committerTom Lane <tgl@sss.pgh.pa.us>2011-08-20 14:51:52 -0400
commit6016005118295fdcd97d99a0f372f18a28251c48 (patch)
tree25749e847ff0b3b56c09852a7d00e319f4adfeaf /src
parent44631eec3298d25b7b6fa71fc237c076f4e2a3a3 (diff)
downloadpostgresql-6016005118295fdcd97d99a0f372f18a28251c48.tar.gz
postgresql-6016005118295fdcd97d99a0f372f18a28251c48.zip
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can make building a large tidbitmap into an O(N^2) operation. To fix, make sure we remove more than the minimum amount of information per call, and add a fallback path to behave sanely if we're unable to fit the bitmap within the requested amount of memory. This has been wrong since the tidbitmap code was written, so back-patch to all supported branches.
Diffstat (limited to 'src')
-rw-r--r--src/backend/nodes/tidbitmap.c22
1 files changed, 19 insertions, 3 deletions
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index e901659d583..7a79f64d2fc 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -884,8 +884,11 @@ tbm_lossify(TIDBitmap *tbm)
/*
* XXX Really stupid implementation: this just lossifies pages in
* essentially random order. We should be paying some attention to the
- * number of bits set in each page, instead. Also it might be a good idea
- * to lossify more than the minimum number of pages during each call.
+ * number of bits set in each page, instead.
+ *
+ * Since we are called as soon as nentries exceeds maxentries, we should
+ * push nentries down to significantly less than maxentries, or else we'll
+ * just end up doing this again very soon. We shoot for maxentries/2.
*/
Assert(!tbm->iterating);
Assert(tbm->status == TBM_HASH);
@@ -906,7 +909,7 @@ tbm_lossify(TIDBitmap *tbm)
/* This does the dirty work ... */
tbm_mark_page_lossy(tbm, page->blockno);
- if (tbm->nentries <= tbm->maxentries)
+ if (tbm->nentries <= tbm->maxentries / 2)
{
/* we have done enough */
hash_seq_term(&status);
@@ -919,6 +922,19 @@ tbm_lossify(TIDBitmap *tbm)
* not care whether we visit lossy chunks or not.
*/
}
+
+ /*
+ * With a big bitmap and small work_mem, it's possible that we cannot
+ * get under maxentries. Again, if that happens, we'd end up uselessly
+ * calling tbm_lossify over and over. To prevent this from becoming a
+ * performance sink, force maxentries up to at least double the current
+ * number of entries. (In essence, we're admitting inability to fit
+ * within work_mem when we do this.) Note that this test will not fire
+ * if we broke out of the loop early; and if we didn't, the current
+ * number of entries is simply not reducible any further.
+ */
+ if (tbm->nentries > tbm->maxentries / 2)
+ tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
}
/*