Fix some problems with selectivity estimation for partial indexes.

First, genericcostestimate() was being way too liberal about including partial-index conditions in its selectivity estimate, resulting in substantial underestimates for situations such as an indexqual "x = 42" used with an index on x "WHERE x >= 40 AND x < 50". While the code is intentionally set up to favor selecting partial indexes when available, this was too much... Second, choose_bitmap_and() was likewise easily fooled by cases of this type, since it would similarly think that the partial index had selectivity independent of the indexqual. Fixed by using predicate_implied_by() rather than simple equality checks to determine redundancy. This is a good deal more expensive but I don't see much alternative. At least the extra cost is only paid when there's actually a partial index under consideration. Per report from Jeff Davis. I'm not going to risk back-patching this, though.
author: Tom Lane <tgl@sss.pgh.pa.us> 2007-03-21 22:18:12 +0000
committer: Tom Lane <tgl@sss.pgh.pa.us> 2007-03-21 22:18:12 +0000
commit: 54d20024c1dad339acc8624d7ca902b762fe0844 (patch)
tree: 107270059759b4e114b661838d896da4f4df8ada /src/backend/utils/adt/selfuncs.c
parent: 2b49e5d3cb8b6a71797969c50f5c247f232de989 (diff)
download: postgresql-54d20024c1dad339acc8624d7ca902b762fe0844.tar.gz
postgresql-54d20024c1dad339acc8624d7ca902b762fe0844.zip
1 files changed, 27 insertions, 24 deletions
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index df61fea567b..a92317aeac1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -15,7 +15,7 @@
  *
  *
  * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.229 2007/03/17 00:11:05 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.230 2007/03/21 22:18:12 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -86,6 +86,7 @@
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
+#include "optimizer/predtest.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/var.h"
 #include "parser/parse_coerce.h"
@@ -4775,36 +4776,38 @@ genericcostestimate(PlannerInfo *root,
 	List	   *selectivityQuals;
 	ListCell   *l;
 
-	/*
+	/*----------
 	 * If the index is partial, AND the index predicate with the explicitly
 	 * given indexquals to produce a more accurate idea of the index
-	 * selectivity.  This may produce redundant clauses.  We get rid of exact
-	 * duplicates in the code below.  We expect that most cases of partial
-	 * redundancy (such as "x < 4" from the qual and "x < 5" from the
-	 * predicate) will be recognized and handled correctly by
-	 * clauselist_selectivity().  This assumption is somewhat fragile, since
-	 * it depends on predicate_implied_by() and clauselist_selectivity()
-	 * having similar capabilities, and there are certainly many cases where
-	 * we will end up with a too-low selectivity estimate.	This will bias the
-	 * system in favor of using partial indexes where possible, which is not
-	 * necessarily a bad thing. But it'd be nice to do better someday.
+	 * selectivity.  However, we need to be careful not to insert redundant
+	 * clauses, because clauselist_selectivity() is easily fooled into
+	 * computing a too-low selectivity estimate.  Our approach is to add
+	 * only the index predicate clause(s) that cannot be proven to be implied
+	 * by the given indexquals.  This successfully handles cases such as a
+	 * qual "x = 42" used with a partial index "WHERE x >= 40 AND x < 50".
+	 * There are many other cases where we won't detect redundancy, leading
+	 * to a too-low selectivity estimate, which will bias the system in favor
+	 * of using partial indexes where possible.  That is not necessarily bad
+	 * though.
 	 *
-	 * Note that index->indpred and indexQuals are both in implicit-AND form,
-	 * so ANDing them together just takes merging the lists.  However,
-	 * eliminating duplicates is a bit trickier because indexQuals contains
-	 * RestrictInfo nodes and the indpred does not.  It is okay to pass a
-	 * mixed list to clauselist_selectivity, but we have to work a bit to
-	 * generate a list without logical duplicates.	(We could just list_union
-	 * indpred and strippedQuals, but then we'd not get caching of per-qual
-	 * selectivity estimates.)
+	 * Note that indexQuals contains RestrictInfo nodes while the indpred
+	 * does not.  This is OK for both predicate_implied_by() and
+	 * clauselist_selectivity().
+	 *----------
 	 */
 	if (index->indpred != NIL)
 	{
-		List	   *strippedQuals;
-		List	   *predExtraQuals;
+		List	   *predExtraQuals = NIL;
+
+		foreach(l, index->indpred)
+		{
+			Node   *predQual = (Node *) lfirst(l);
+			List   *oneQual = list_make1(predQual);
 
-		strippedQuals = get_actual_clauses(indexQuals);
-		predExtraQuals = list_difference(index->indpred, strippedQuals);
+			if (!predicate_implied_by(oneQual, indexQuals))
+				predExtraQuals = list_concat(predExtraQuals, oneQual);
+		}
+		/* list_concat avoids modifying the passed-in indexQuals list */
 		selectivityQuals = list_concat(predExtraQuals, indexQuals);
 	}
 	else
author	Tom Lane <tgl@sss.pgh.pa.us>	2007-03-21 22:18:12 +0000
committer	Tom Lane <tgl@sss.pgh.pa.us>	2007-03-21 22:18:12 +0000
commit	54d20024c1dad339acc8624d7ca902b762fe0844 (patch)
tree	107270059759b4e114b661838d896da4f4df8ada /src/backend/utils/adt/selfuncs.c
parent	2b49e5d3cb8b6a71797969c50f5c247f232de989 (diff)
download	postgresql-54d20024c1dad339acc8624d7ca902b762fe0844.tar.gz postgresql-54d20024c1dad339acc8624d7ca902b762fe0844.zip