author | Tom Lane <tgl@sss.pgh.pa.us> | 2024-12-19 17:02:25 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2024-12-19 17:02:25 -0500 |
commit | 8d96f57d5cc79c0c51050bb707c19bf07d2895eb | |
tree | 6d67ac0e73bee8d2d7fc9c977f3539fbc0430a46 /src/backend/optimizer/util/pathnode.c | |
parent | 27627929528e24a547d1058a5444b35491057a56 | |
Improve planner's handling of SetOp plans.

Remove the code for inserting flag columns in the inputs of a SetOp.
That was the only reason why there would be resjunk columns in a
set-operations plan tree, so we can get rid of some code that
supported that, too.

Get rid of choose_hashed_setop() in favor of building Paths for
the hashed and sorted alternatives, and letting them fight it out
within add_path().

Remove set_operation_ordered_results_useful(), which was giving wrong
answers due to examining the wrong ancestor node: we need to examine
the immediate SetOperationStmt parent not the topmost node. Instead
make each caller of recurse_set_operations() pass down the relevant
parent node. (This thinko seems to have led only to wasted planning
cycles and possibly-inferior plans, not wrong query answers. Perhaps
we should back-patch it, but I'm not doing so right now.)

Teach generate_nonunion_paths() to consider pre-sorted inputs for
sorted SetOps, rather than always generating a Sort node.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us

Diffstat (limited to 'src/backend/optimizer/util/pathnode.c')
-rw-r--r-- | src/backend/optimizer/util/pathnode.c | 69 |
1 files changed, 61 insertions, 8 deletions
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e52e4b1d677..4f74cafa259 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -3681,17 +3681,70 @@ create_setop_path(PlannerInfo *root,
 	pathnode->numGroups = numGroups;
 
 	/*
-	 * Charge one cpu_operator_cost per comparison per input tuple. We assume
-	 * all columns get compared at most of the tuples.
-	 *
-	 * XXX all wrong for hashing
+	 * Compute cost estimates. As things stand, we end up with the same total
+	 * cost in this node for sort and hash methods, but different startup
+	 * costs. This could be refined perhaps, but it'll do for now.
 	 */
 	pathnode->path.disabled_nodes =
 		leftpath->disabled_nodes + rightpath->disabled_nodes;
-	pathnode->path.startup_cost =
-		leftpath->startup_cost + rightpath->startup_cost;
-	pathnode->path.total_cost = leftpath->total_cost + rightpath->total_cost +
-		cpu_operator_cost * (leftpath->rows + rightpath->rows) * list_length(groupList);
+	if (strategy == SETOP_SORTED)
+	{
+		/*
+		 * In sorted mode, we can emit output incrementally. Charge one
+		 * cpu_operator_cost per comparison per input tuple. Like cost_group,
+		 * we assume all columns get compared at most of the tuples.
+		 */
+		pathnode->path.startup_cost =
+			leftpath->startup_cost + rightpath->startup_cost;
+		pathnode->path.total_cost =
+			leftpath->total_cost + rightpath->total_cost +
+			cpu_operator_cost * (leftpath->rows + rightpath->rows) * list_length(groupList);
+
+		/*
+		 * Also charge a small amount per extracted tuple. Like cost_sort,
+		 * charge only operator cost not cpu_tuple_cost, since SetOp does no
+		 * qual-checking or projection.
+		 */
+		pathnode->path.total_cost += cpu_operator_cost * outputRows;
+	}
+	else
+	{
+		Size		hashentrysize;
+
+		/*
+		 * In hashed mode, we must read all the input before we can emit
+		 * anything. Also charge comparison costs to represent the cost of
+		 * hash table lookups.
+		 */
+		pathnode->path.startup_cost =
+			leftpath->total_cost + rightpath->total_cost +
+			cpu_operator_cost * (leftpath->rows + rightpath->rows) * list_length(groupList);
+		pathnode->path.total_cost = pathnode->path.startup_cost;
+
+		/*
+		 * Also charge a small amount per extracted tuple. Like cost_sort,
+		 * charge only operator cost not cpu_tuple_cost, since SetOp does no
+		 * qual-checking or projection.
+		 */
+		pathnode->path.total_cost += cpu_operator_cost * outputRows;
+
+		/*
+		 * Mark the path as disabled if enable_hashagg is off. While this
+		 * isn't exactly a HashAgg node, it seems close enough to justify
+		 * letting that switch control it.
+		 */
+		if (!enable_hashagg)
+			pathnode->path.disabled_nodes++;
+
+		/*
+		 * Also disable if it doesn't look like the hashtable will fit into
+		 * hash_mem.
+		 */
+		hashentrysize = MAXALIGN(leftpath->pathtarget->width) +
+			MAXALIGN(SizeofMinimalTupleHeader);
+		if (hashentrysize * numGroups > get_hash_memory_limit())
+			pathnode->path.disabled_nodes++;
+	}
 	pathnode->path.rows = outputRows;
 
 	return pathnode;
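
To see how the two branches of the hunk above relate numerically, here is a minimal standalone sketch of the same arithmetic. It is not the PostgreSQL code: `SketchInput`, `SketchSetOpCost`, and `sketch_setop_cost` are hypothetical names invented for illustration, the per-entry hash size and memory limit are passed in as plain numbers rather than computed with `MAXALIGN` and `get_hash_memory_limit()`, and the `disabled_nodes` counts inherited from the input paths are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the child Path fields the costing reads. */
typedef struct SketchInput
{
	double		startup_cost;
	double		total_cost;
	double		rows;
} SketchInput;

/* Hypothetical result holder; the real code writes into pathnode->path. */
typedef struct SketchSetOpCost
{
	double		startup_cost;
	double		total_cost;
	int			disabled_nodes;
} SketchSetOpCost;

/*
 * Mirror the sorted/hashed arithmetic from the diff.  hash_entry_size and
 * hash_mem_limit are assumed to be precomputed byte counts.
 */
static SketchSetOpCost
sketch_setop_cost(bool sorted, SketchInput left, SketchInput right,
				  int num_cols, double num_groups, double output_rows,
				  double cpu_operator_cost, bool enable_hashagg,
				  double hash_entry_size, double hash_mem_limit)
{
	SketchSetOpCost c;

	/* One operator charge per column per input tuple, in both modes. */
	double		compare_cost =
		cpu_operator_cost * (left.rows + right.rows) * num_cols;

	c.disabled_nodes = 0;		/* inherited input counts omitted here */

	if (sorted)
	{
		/* Output can start as soon as both inputs have started. */
		c.startup_cost = left.startup_cost + right.startup_cost;
		c.total_cost = left.total_cost + right.total_cost + compare_cost;
	}
	else
	{
		/* All input must be read before the first output row. */
		c.startup_cost = left.total_cost + right.total_cost + compare_cost;
		c.total_cost = c.startup_cost;

		if (!enable_hashagg)
			c.disabled_nodes++;
		if (hash_entry_size * num_groups > hash_mem_limit)
			c.disabled_nodes++;
	}

	/* Small per-output-tuple charge, identical in both modes. */
	c.total_cost += cpu_operator_cost * output_rows;
	return c;
}
```

With these formulas the two strategies end with the same total cost and differ only in startup cost and in `disabled_nodes`, which matches the new comment at the top of the hunk. Calling the function once with `sorted` true and once with it false for the same inputs makes that easy to check.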