Improve planner's ability to recognize cases where an IN's RHS is unique.

If the right-hand side of a semijoin is unique, then we can treat it like a normal join (or another way to say that is: we don't need to explicitly unique-ify the data before doing it as a normal join). We were recognizing such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP BY decoration, but there's another way: if the RHS is a plain relation with unique indexes, we can check if any of the indexes prove the output is unique. Most of the infrastructure for that was there already in the join removal code, though I had to rearrange it a bit. Per reflection about a recent example in pgsql-performance.
author: Tom Lane <tgl@sss.pgh.pa.us> 2011-10-26 17:52:02 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2011-10-26 17:52:29 -0400
commit: 3e4b3465b6345b75659e8f897976d4c810408762 (patch)
tree: 926411ab20e56c663b9793568577a2f73bc3bed8 /src/backend/optimizer/util
parent: 360429e1d17947c17659cd57dbaaba5c10fcbd6a (diff)
download: postgresql-3e4b3465b6345b75659e8f897976d4c810408762.tar.gz
postgresql-3e4b3465b6345b75659e8f897976d4c810408762.zip
1 files changed, 25 insertions, 2 deletions
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 6aa34412def..1e7aac95ef4 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1021,8 +1021,8 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parent = rel;
 
 	/*
-	 * Treat the output as always unsorted, since we don't necessarily have
-	 * pathkeys to represent it.
+	 * Assume the output is unsorted, since we don't necessarily have pathkeys
+	 * to represent it.  (This might get overridden below.)
 	 */
 	pathnode->path.pathkeys = NIL;
 
@@ -1031,6 +1031,29 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->uniq_exprs = uniq_exprs;
 
 	/*
+	 * If the input is a relation and it has a unique index that proves the
+	 * uniq_exprs are unique, then we don't need to do anything.  Note that
+	 * relation_has_unique_index_for automatically considers restriction
+	 * clauses for the rel, as well.
+	 */
+	if (rel->rtekind == RTE_RELATION && all_btree &&
+		relation_has_unique_index_for(root, rel, NIL,
+									  uniq_exprs, in_operators))
+	{
+		pathnode->umethod = UNIQUE_PATH_NOOP;
+		pathnode->rows = rel->rows;
+		pathnode->path.startup_cost = subpath->startup_cost;
+		pathnode->path.total_cost = subpath->total_cost;
+		pathnode->path.pathkeys = subpath->pathkeys;
+
+		rel->cheapest_unique_path = (Path *) pathnode;
+
+		MemoryContextSwitchTo(oldcontext);
+
+		return pathnode;
+	}
+
+	/*
 	 * If the input is a subquery whose output must be unique already, then we
 	 * don't need to do anything.  The test for uniqueness has to consider
 	 * exactly which columns we are extracting; for example "SELECT DISTINCT
author	Tom Lane <tgl@sss.pgh.pa.us>	2011-10-26 17:52:02 -0400
committer	Tom Lane <tgl@sss.pgh.pa.us>	2011-10-26 17:52:29 -0400
commit	3e4b3465b6345b75659e8f897976d4c810408762 (patch)
tree	926411ab20e56c663b9793568577a2f73bc3bed8 /src/backend/optimizer/util
parent	360429e1d17947c17659cd57dbaaba5c10fcbd6a (diff)
download	postgresql-3e4b3465b6345b75659e8f897976d4c810408762.tar.gz postgresql-3e4b3465b6345b75659e8f897976d4c810408762.zip