aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/util/pathnode.c
Commit message (Collapse)AuthorAge
* Track the number of presorted outer pathkeys in MergePathRichard Guo2025-05-08
| | | | | | | | | | | | | | | | When creating an explicit Sort node for the outer path of a mergejoin, we need to determine the number of presorted keys of the outer path to decide whether explicit incremental sort can be applied. Currently, this is done by repeatedly calling pathkeys_count_contained_in. This patch caches the number of presorted outer pathkeys in MergePath, allowing us to save several calls to pathkeys_count_contained_in. It can be considered a complement to the changes in commit 828e94c9d. Reported-by: David Rowley <dgrowleyml@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Improve planner's handling of SetOp plans.Tom Lane2024-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove the code for inserting flag columns in the inputs of a SetOp. That was the only reason why there would be resjunk columns in a set-operations plan tree, so we can get rid of some code that supported that, too. Get rid of choose_hashed_setop() in favor of building Paths for the hashed and sorted alternatives, and letting them fight it out within add_path(). Remove set_operation_ordered_results_useful(), which was giving wrong answers due to examining the wrong ancestor node: we need to examine the immediate SetOperationStmt parent not the topmost node. Instead make each caller of recurse_set_operations() pass down the relevant parent node. (This thinko seems to have led only to wasted planning cycles and possibly-inferior plans, not wrong query answers. Perhaps we should back-patch it, but I'm not doing so right now.) Teach generate_nonunion_paths() to consider pre-sorted inputs for sorted SetOps, rather than always generating a Sort node. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
* Convert SetOp to read its inputs as outerPlan and innerPlan.Tom Lane2024-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original design for set operations involved appending the two input relations into one and adding a flag column that allows distinguishing which side each row came from. Then the SetOp node pries them apart again based on the flag. This is bizarre. The only apparent reason to do it is that when sorting, we'd only need one Sort node not two. But since sorting is at least O(N log N), sorting all the data is actually worse than sorting each side separately --- plus, we have no chance of taking advantage of presorted input. On top of that, adding the flag column frequently requires an additional projection step that adds cycles, and then the Append node isn't free either. Let's get rid of all of that and make the SetOp node have two separate children, using the existing outerPlan/innerPlan infrastructure. This initial patch re-implements nodeSetop.c and does a bare minimum of work on the planner side to generate correctly-shaped plans. In particular, I've tried not to change the cost estimates here, so that the visible changes in the regression test results will only involve removal of useless projection steps and not any changes in whether to use sorted vs hashed mode. For SORTED mode, we combine successive identical tuples from each input into groups, and then merge-join the groups. The tuple comparisons now use SortSupport instead of simple equality, but the group-formation part should involve roughly the same number of tuple comparisons as before. The cross-comparisons between left and right groups probably add to that, but I'm not sure to quantify how many more comparisons we might need. For HASHED mode, nodeSetop's logic is almost the same as before, just refactored into two separate loops instead of one loop that has an assumption that it will see all the left-hand inputs first. In both modes, I added early-exit logic to not bother reading the right-hand relation if the left-hand input is empty, since neither INTERSECT nor EXCEPT modes can produce any output if the left input is empty. This could have been done before in the hashed mode, but not in sorted mode. Sorted mode can also stop as soon as it exhausts the left input; any remaining right-hand tuples cannot have matches. Also, this patch adds some infrastructure for detecting whether child plan nodes all output the same type of tuple table slot. If they do, the hash table logic can use slightly more efficient code based on assuming that that's the input slot type it will see. We'll make use of that infrastructure in other plan node types later. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
* Treat number of disabled nodes in a path as a separate cost metric.Robert Haas2024-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, when a path type was disabled by e.g. enable_seqscan=false, we either avoided generating that path type in the first place, or more commonly, we added a large constant, called disable_cost, to the estimated startup cost of that path. This latter approach can distort planning. For instance, an extremely expensive non-disabled path could seem to be worse than a disabled path, especially if the full cost of that path node need not be paid (e.g. due to a Limit). Or, as in the regression test whose expected output changes with this commit, the addition of disable_cost can make two paths that would normally be distinguishible in cost seem to have fuzzily the same cost. To fix that, we now count the number of disabled path nodes and consider that a high-order component of both the startup cost and the total cost. Hence, the path list is now sorted by disabled_nodes and then by total_cost, instead of just by the latter, and likewise for the partial path list. It is important that this number is a count and not simply a Boolean; else, as soon as we're unable to respect disabled path types in all portions of the path, we stop trying to avoid them where we can. Because the path list is now sorted by the number of disabled nodes, the join prechecks must compute the count of disabled nodes during the initial cost phase instead of postponing it to final cost time. Counts of disabled nodes do not cross subquery levels; at present, there is no reason for them to do so, since the we do not postpone path selection across subquery boundaries (see make_subplan). Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley. Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
* Simplify create_merge_append_path for clarityRichard Guo2024-07-29
| | | | | | | | | | | | | | | We don't currently support parameterized MergeAppend paths: there's little use for an ordered path on the inside of a nestloop. Given this, we can simplify create_merge_append_path by directly setting param_info to NULL instead of calling get_appendrel_parampathinfo. We can also simplify the Assert for child paths a little bit. This change won't make any measurable difference in performance; it's just for clarity's sake. Author: Richard Guo Reviewed-by: Alena Rybakina, Paul A Jungwirth Discussion: https://postgr.es/m/CAMbWs4_n1bgH2nACMuGsXZct3KH6PBFS0tPdQsXdstRfyxTunQ@mail.gmail.com
* Remove redundant code in create_gather_merge_pathRichard Guo2024-07-23
| | | | | | | | | | | | | | | | In create_gather_merge_path, we should always guarantee that the subpath is adequately ordered, and we do not add a Sort node in createplan.c for a Gather Merge node. Therefore, the 'else' branch in create_gather_merge_path, which computes the cost for a Sort node, is redundant. This patch removes the redundant code and emits an error if the subpath is not sufficiently ordered. Meanwhile, this patch changes the check for the subpath's pathkeys in create_gather_merge_plan to an Assert. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs48u=0bWf3epVtULjJ-=M9Hbkz+ieZQAOS=BfbXZFqbDCg@mail.gmail.com
* Fix rowcount estimate for gather (merge) pathsRichard Guo2024-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case of a parallel plan, when computing the number of tuples processed per worker, we divide the total number of tuples by the parallel_divisor obtained from get_parallel_divisor(), which accounts for the leader's contribution in addition to the number of workers. Accordingly, when estimating the number of tuples for gather (merge) nodes, we should multiply the number of tuples per worker by the same parallel_divisor to reverse the division. However, currently we use parallel_workers rather than parallel_divisor for the multiplication. This could result in an underestimation of the number of tuples for gather (merge) nodes, especially when there are fewer than four workers. This patch fixes this issue by using the same parallel_divisor for the multiplication. There is one ensuing plan change in the regression tests, but it looks reasonable and does not compromise its original purpose of testing parallel-aware hash join. In passing, this patch removes an unnecessary assignment for path.rows in create_gather_merge_path, and fixes an uninitialized-variable issue in generate_useful_gather_paths. No backpatch as this could result in plan changes. Author: Anthonin Bonnefoy Reviewed-by: Rafia Sabih, Richard Guo Discussion: https://postgr.es/m/CAO6_Xqr9+51NxgO=XospEkUeAg-p=EjAWmtpdcZwjRgGKJ53iA@mail.gmail.com
* Fix possible Assert failure in cost_memoize_rescanDavid Rowley2024-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | In cost_memoize_rescan(), when calculating the hit_ratio using the calls and ndistinct estimations, if the value that was set in MemoizePath.calls had not been processed through clamp_row_est(), then it was possible that it was set to some non-integer value which could result in ndistinct being 1 higher than calls due to estimate_num_groups() performing clamp_row_est() on its input_rows. This could result in hit_ratio values slightly below 0.0, which would cause an Assert failure. The value of MemoizePath.calls comes from the final parameter in the create_memoize_path() function, of which we only have one true caller of. That caller passes outer_path->rows. All the core code I looked at always seems to call clamp_row_est() on the Path.rows, so there might have been no issues with any core Paths causing troubles here. The bug report was about a CustomPath with a non-clamped row estimated. The misbehavior as a result of this seems to be mostly limited to the Assert() failing. Aside from that, it seems the Memoize costs would just come out slightly higher than they should have, which is likely fairly harmless. Reported-by: Kohei KaiGai <kaigai@heterodb.com> Discussion: https://postgr.es/m/CAOP8fzZnTU+N64UYJYogb1hN-5hFP+PwTb3m_cnGAD7EsQwrKw@mail.gmail.com Reviewed-by: Richard Guo Backpatch-through: 14, where Memoize was introduced
* Fix query pullup issue with WindowClause runConditionDavid Rowley2024-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 94985c210 added code to detect when WindowFuncs were monotonic and allowed additional quals to be "pushed down" into the subquery to be used as WindowClause runConditions in order to short-circuit execution in nodeWindowAgg.c. The Node representation of runConditions wasn't well selected and because we do qual pushdown before planning the subquery, the planning of the subquery could perform subquery pull-up of nested subqueries. For WindowFuncs with args, the arguments could be changed after pushing the qual down to the subquery. This was made more difficult by the fact that the code duplicated the WindowFunc inside an OpExpr to include in the WindowClauses runCondition field. This could result in duplication of subqueries and a pull-up of such a subquery could result in another initplan parameter being issued for the 2nd version of the subplan. This could result in errors such as: ERROR: WindowFunc not found in subplan target lists To fix this, we change the node representation of these run conditions and instead of storing an OpExpr containing the WindowFunc in a list inside WindowClause, we now store a new node type named WindowFuncRunCondition within a new field in the WindowFunc. These get transformed into OpExprs later in planning once subquery pull-up has been performed. This problem did exist in v15 and v16, but that was fixed by 9d36b883b and e5d20bbd. Cat version bump due to new node type and modifying WindowFunc struct. Bug: #18305 Reported-by: Zuming Jiang Discussion: https://postgr.es/m/18305-33c49b4c830b37b3%40postgresql.org
* Add support for MERGE ... WHEN NOT MATCHED BY SOURCE.Dean Rasheed2024-03-30
| | | | | | | | | | | | | | | | | | | This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE actions, which operate on rows that exist in the target relation, but not in the data source. These actions can execute UPDATE, DELETE, or DO NOTHING sub-commands. This is in contrast to already-supported WHEN NOT MATCHED actions, which operate on rows that exist in the data source, but not in the target relation. To make this distinction clearer, such actions may now be written as WHEN NOT MATCHED BY TARGET. Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is equivalent to writing WHEN NOT MATCHED BY TARGET. Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing. Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com
* Propagate pathkeys from CTEs up to the outer query.Tom Lane2024-03-26
| | | | | | | | | | | | | | | | | | | | | | | If we know the sort order of a CTE's output, and it is relevant to the outer query, label the CTE's outer-query access path using those pathkeys. This may enable optimizations such as avoiding a sort in the outer query. The code for hoisting pathkeys into the outer query already exists for regular RTE_SUBQUERY subqueries, but it wasn't getting used for CTEs, possibly out of concern for maintaining an optimization fence between the CTE and the outer query. However, on the same arguments used for commit f7816aec2, there seems no harm in letting the outer query know what the inner query decided to do. In support of this, we now remember the best Path as well as Plan for each subquery for the rest of the planner run. There may be future applications for having that at hand, and it surely costs little to build one more List. Richard Guo (minor mods by me) Discussion: https://postgr.es/m/CAMbWs49xYd3f8CrE8-WW3--dV1zH_sDSDn-vs2DzHj81Wcnsew@mail.gmail.com
* Reduce memory used by partitionwise joinsAmit Langote2024-03-25
| | | | | | | | | | | | | | | | | | | | | Specifically, this commit reduces the memory consumed by the SpecialJoinInfos that are allocated for child joins in try_partitionwise_join() by freeing them at the end of creating paths for each child join. A SpecialJoinInfo allocated for a given child join is a copy of the parent join's SpecialJoinInfo, which contains the translated copies of the various Relids bitmapsets and semi_rhs_exprs, which is a List of Nodes. The newly added freeing step frees the struct itself and the various bitmapsets, but not semi_rhs_exprs, because there's no handy function to free the memory of Node trees. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CAExHW5tHqEf3ASVqvFFcghYGPfpy7o3xnvhHwBGbJFMRH8KjNw@mail.gmail.com
* Postpone reparameterization of paths until create_plan().Tom Lane2024-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When considering nestloop paths for individual partitions within a partitionwise join, if the inner path is parameterized, it is parameterized by the topmost parent of the outer rel, not the corresponding outer rel itself. Therefore, we need to translate the parameterization so that the inner path is parameterized by the corresponding outer rel. Up to now, we did this while generating join paths. However, that's problematic because we must also translate some expressions that are shared across all paths for a relation, such as restriction clauses (kept in the RelOptInfo and/or IndexOptInfo) and TableSampleClauses (kept in the RangeTblEntry). The existing code fails to translate these at all, leading to wrong answers, odd failures such as "variable not found in subplan target list", or executor crashes. But we can't modify them during path generation, because that would break things if we end up choosing some non-partitioned-join path. So this patch postpones reparameterization of the inner path until createplan.c, where it is safe to modify the referenced RangeTblEntry, RelOptInfo or IndexOptInfo, because we have made a final choice of which Path to use. We do still have to check during path generation that the reparameterization will be possible. So we introduce a new function path_is_reparameterizable_by_child() to detect that. The duplication between path_is_reparameterizable_by_child() and reparameterize_path_by_child() is a bit annoying, but there seems no other good answer. A small benefit is that we can avoid building useless reparameterized trees in cases where a non-partitioned join is ultimately chosen. Also, reparameterize_path_by_child() can now be allowed to scribble on the input paths, saving a few cycles. This fix repairs the same problems previously addressed in the back branches by commits 62f120203 et al. Richard Guo, reviewed at various times by Ashutosh Bapat, Andrei Lepikhov, Alena Rybakina, Robert Haas, and myself Discussion: https://postgr.es/m/CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com
* Remove unused #include's from backend .c filesPeter Eisentraut2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
* Avoid dangling-pointer problem with partitionwise joins under GEQO.Tom Lane2024-02-23
| | | | | | | | | | | | | | | | build_child_join_sjinfo creates a derived SpecialJoinInfo in the short-lived GEQO context, but afterwards the semi_rhs_exprs from that may be used in a UniquePath for a child base relation. This breaks the expectation that all base-relation-level structures are in the planning-lifespan context, leading to use of a dangling pointer with probable ensuing crash later on in create_unique_plan. To fix, copy the expression trees when making a UniquePath. Per bug #18360 from Alexander Lakhin. This has been broken since partitionwise joins were added, so back-patch to all supported branches. Discussion: https://postgr.es/m/18360-a23caf3157f34e62@postgresql.org
* Clarify the 'rows' parameter in create_append_pathDavid Rowley2024-02-15
| | | | | | | | | | This is extracted from a larger patch to improve the UNION planner. While working on that, I found myself having to check what the 'rows' parameter is for. It's not obvious that passing a negative number is the way to have the rows estimate calculated and to find that out you need to read code in create_append_path() and in cost_append(). Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
* Fix costing bug in MergeAppendDavid Rowley2024-02-01
| | | | | | | | | | | | | | | | | | | | | | | | When building a MergeAppendPath which has child paths that are not sorted correctly for the MergeAppend's sort order, we apply the cost of sorting those paths to the MergeAppendPath costs. Here we fix a bug where the number of tuples specified that needed to be sorted was effectively pg_class.reltuples rather than the number of expected row in the subpath. This effectively penalizes MergeAppend plans any time any filter is present on the MergeAppend subpath as the sort cost added is to sort all tuples in the table rather than just the ones expected the path to return. This did not affect UNION ALL type queries as the RelOptInfo tuples is set from the subquery's path rows. It does affect MergeAppends uses for inheritance and partitioned tables. This is a long-standing bug introduced when MergeAppend was first added in 11cad29c9. No backpatch as this could result in plan changes. Author: Alexander Kuzmenkov Reviewed-by: Ashutosh Bapat, Aleksander Alekseev, David Rowley Discussion: https://postgr.es/m/CALzhyqyhoXQDR-Usd_0HeWk%3DuqNLzoVeT8KhRoo%3DpV_KzgO3QQ%40mail.gmail.com
* Fix Asserts in calc_non_nestloop_required_outer().Tom Lane2024-01-10
| | | | | | | | | | | | | | These were not testing the same thing as the comparable Assert in calc_nestloop_required_outer(), because we neglected to map the given Paths' relids to top-level relids. When considering a partition child join the latter is the correct thing to do. This oversight is old, but since it's only an overly-weak Assert check there doesn't seem to be much value in back-patching. Richard Guo (with cosmetic changes and comment updates by me) Discussion: https://postgr.es/m/CAMbWs49sqbe9GBZ8sy8dSfKRNURgicR85HX8vgzcgQsPF0XY1w@mail.gmail.com
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Add trailing commas to enum definitionsPeter Eisentraut2023-10-26
| | | | | | | | | | | | | | | | | | | | Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
* Fix problems when a plain-inheritance parent table is excluded.Tom Lane2023-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an UPDATE/DELETE/MERGE's target table is an old-style inheritance tree, it's possible for the parent to get excluded from the plan while some children are not. (I believe this is only possible if we can prove that a CHECK ... NO INHERIT constraint on the parent contradicts the query WHERE clause, so it's a very unusual case.) In such a case, ExecInitModifyTable mistakenly concluded that the first surviving child is the target table, leading to at least two bugs: 1. The wrong table's statement-level triggers would get fired. 2. In v16 and up, it was possible to fail with "invalid perminfoindex 0 in RTE with relid nnnn" due to the child RTE not having permissions data included in the query plan. This was hard to reproduce reliably because it did not occur unless the update triggered some non-HOT index updates. In v14 and up, this is easy to fix by defining ModifyTable.rootRelation to be the parent RTE in plain inheritance as well as partitioned cases. While the wrong-triggers bug also appears in older branches, the relevant code in both the planner and executor is quite a bit different, so it would take a good deal of effort to develop and test a suitable patch. Given the lack of field complaints about the trigger issue, I'll desist for now. (Patching v11 for this seems unwise anyway, given that it will have no more releases after next month.) Per bug #18147 from Hans Buschmann. Amit Langote and Tom Lane Discussion: https://postgr.es/m/18147-6fc796538913ee88@postgresql.org
* Strip off ORDER BY/DISTINCT aggregate pathkeys in create_agg_pathDavid Rowley2023-10-09
| | | | | | | | | | | | | | | | | | | | | | | 1349d2790 added code to adjust the PlannerInfo.group_pathkeys so that ORDER BY / DISTINCT aggregate functions could obtain pre-sorted inputs to allow faster execution. That commit forgot to adjust the pathkeys in create_agg_path(). Some code in there assumed that it was always fine to make the AggPath's pathkeys the same as its subpath's. That seems to have been ok up until 1349d2790, but since that commit adds pathkeys for columns which are within the aggregate function, those columns won't be available above the aggregate node. This can result in "could not find pathkey item to sort" during create_plan(). The fix here is to strip off the additional pathkeys added by adjust_group_pathkeys_for_groupagg(). It seems that the pathkeys here will only ever be group_pathkeys, so all we need to do is check if the length of the pathkey list is longer than the num_groupby_pathkeys and get rid of the additional ones only if we see extras. Reported-by: Justin Pryzby Reviewed-by: Richard Guo Discussion: https://postgr.es/m/ZQhYYRhUxpW3PSf9%40telsasoft.com Backpatch-through: 16, where 1349d2790 was introduced
* Re-allow FDWs and custom scan providers to replace joins with pseudoconstant ↵Etsuro Fujita2023-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | quals. This was disabled in commit 6f80a8d9c due to the lack of support for handling of pseudoconstant quals assigned to replaced joins in createplan.c. To re-allow it, this patch adds the support by 1) modifying the ForeignPath and CustomPath structs so that if they represent foreign and custom scans replacing a join with a scan, they store the list of RestrictInfo nodes to apply to the join, as in JoinPaths, and by 2) modifying create_scan_plan() in createplan.c so that it uses that list in that case, instead of the baserestrictinfo list, to get pseudoconstant quals assigned to the join, as mentioned in the commit message for that commit. Important item for the release notes: this is non-backwards-compatible since it modifies the ForeignPath and CustomPath structs, as mentioned above, and changes the argument lists for FDW helper functions create_foreignscan_path(), create_foreign_join_path(), and create_foreign_upper_path(). Richard Guo, with some additional changes by me, reviewed by Nishant Sharma, Suraj Kharage, and Richard Guo. Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
* Account for startup rows when costing WindowAggsDavid Rowley2023-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we adjust the costs for WindowAggs so that they properly take into account how much of their subnode they must read before outputting the first row. Without this, we always assumed that the startup cost for the WindowAgg was not much more expensive than the startup cost of its subnode, however, that's going to be completely wrong in many cases. The WindowAgg may have to read *all* of its subnode to output a single row with certain window bound options. Here we estimate how many rows we'll need to read from the WindowAgg's subnode and proportionally add more of the subnode's run costs onto the WindowAgg's startup costs according to how much of it we expect to have to read in order to produce the first WindowAgg row. The reason this is more important than we might have initially thought is that we may end up making use of a path from the lower planner that works well as a cheap startup plan when the query has a LIMIT clause, however, the WindowAgg might mean we need to read far more rows than what the LIMIT specifies. No backpatch on this so as not to cause plan changes in released versions. Bug: #17862 Reported-by: Tim Palmer Author: David Rowley Reviewed-by: Andy Fan Discussion: https://postgr.es/m/17862-1ab8f74b0f7b0611@postgresql.org Discussion: https://postgr.es/m/CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com
* Allow plan nodes with initPlans to be considered parallel-safe.Tom Lane2023-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the plan itself is parallel-safe, and the initPlans are too, there's no reason anymore to prevent the plan from being marked parallel-safe. That restriction (dating to commit ab77a5a45) was really a special case of the fact that we couldn't transmit subplans to parallel workers at all. We fixed that in commit 5e6d8d2bb and follow-ons, but this case never got addressed. We still forbid attaching initPlans to a Gather node that's inserted pursuant to debug_parallel_query = regress. That's because, when we hide the Gather from EXPLAIN output, we'd hide the initPlans too, causing cosmetic regression diffs. It seems inadvisable to kluge EXPLAIN to the extent required to make the output look the same, so just don't do it in that case. Along the way, this also takes care of some sloppiness about updating path costs to match when we move initplans from one place to another during createplan.c and setrefs.c. Since all the planning decisions are already made by that point, this is just cosmetic; but it seems good to keep EXPLAIN output consistent with where the initplans are. The diff in query_planner() might be worth remarking on. I found that one because after fixing things to allow parallel-safe initplans, one partition_prune test case changed plans (as shown in the patch) --- but only when debug_parallel_query was active. The reason proved to be that we only bothered to mark Result nodes as potentially parallel-safe when debug_parallel_query is on. This neglects the fact that parallel-safety may be of interest for a sub-query even though the Result itself doesn't parallelize. Discussion: https://postgr.es/m/1129530.1681317832@sss.pgh.pa.us
* Fix parallel-safety marking when moving initplans to another node.Tom Lane2023-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our policy since commit ab77a5a45 has been that a plan node having any initplans is automatically not parallel-safe. (This could be relaxed, but not today.) clean_up_removed_plan_level neglected this, and could attach initplans to a parallel-safe child plan node without clearing the plan's parallel-safe flag. That could lead to "subplan was not initialized" errors at runtime, in case an initplan referenced another one and only the referencing one got transmitted to parallel workers. The fix in clean_up_removed_plan_level is trivial enough. materialize_finished_plan also moves initplans from one node to another, but it's okay because it already copies the source node's parallel_safe flag. The other place that does this kind of thing is standard_planner's hack to inject a top-level Gather when debug_parallel_query is active. But that's actually dead code given that we're correctly enforcing the "initplans aren't parallel safe" rule, so just replace it with an Assert that there are no initplans. Also improve some related comments. Normally we'd add a regression test case for this sort of bug. The mistake itself is already reached by existing tests, but there is accidentally no visible problem. The only known test case that creates an actual failure seems too indirect and fragile to justify keeping it as a regression test (not least because it fails to fail in v11, though the bug is clearly present there too). Per report from Justin Pryzby. Back-patch to all supported branches. Discussion: https://postgr.es/m/ZDVt6MaNWkRDO1LQ@telsasoft.com
* Enable use of Memoize atop an Append that came from UNION ALL.Tom Lane2023-03-16
| | | | | | | | | | | | | | | | | create_append_path() would only apply get_baserel_parampathinfo when the path is for a partitioned table, but it's also potentially useful for paths for UNION ALL appendrels. Specifically, that supports building a Memoize path atop this one. While we're in the vicinity, delete some dead code in create_merge_append_plan(): there's no need for it to support parameterized MergeAppend paths, and it doesn't look like that is going to change anytime soon. It'll be easy enough to undo this when/if it becomes useful. Richard Guo Discussion: https://postgr.es/m/CAMbWs4_ABSu4PWG2rE1q10tJugEXHWgru3U8dAgkoFvgrb6aEA@mail.gmail.com
* Remove local optimizations of empty Bitmapsets into null pointers.Tom Lane2023-03-02
| | | | | | | | These are all dead code now that it's done centrally. Patch by me; thanks to Nathan Bossart and Richard Guo for review. Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
* Remove dead NoMovementScanDirection codeDavid Rowley2023-02-01
| | | | | | | | | | | | | | | | | | | | | | Here remove some dead code from heapgettup() and heapgettup_pagemode() which was trying to support NoMovementScanDirection scans. This code can never be reached as standard_ExecutorRun() never calls ExecutePlan with NoMovementScanDirection. Additionally, plans which were scanning an unordered index would use NoMovementScanDirection rather than ForwardScanDirection. There was no real need for this, so here we adjust this so we use ForwardScanDirection for unordered index scans. A comment in pathnodes.h claimed that NoMovementScanDirection was used for PathKey reasons, but if that was true, it no longer is, per code in build_index_paths(). This does change the non-text format of the EXPLAIN output so that unordered index scans now have a "Forward" scan direction rather than "NoMovement". The text format of EXPLAIN has not changed. Author: Melanie Plageman Reviewed-by: Tom Lane, David Rowley Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
* Make Vars be outer-join-aware.Tom Lane2023-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They *must* follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Fix Memoize to work with partitionwise joining.Tom Lane2022-12-05
| | | | | | | | | | | | A couple of places weren't up to speed for this. By sheer good luck, we didn't fail but just selected a non-memoized join plan, at least in the test case we have. Nonetheless, it's a bug, and I'm not quite sure that it couldn't have worse consequences in other examples. So back-patch to v14 where Memoize came in. Richard Guo Discussion: https://postgr.es/m/CAMbWs48GkNom272sfp0-WeD6_0HSR19BJ4H1c9ZKSfbVnJsvRg@mail.gmail.com
* Fix broken MemoizePath support in reparameterize_path().Tom Lane2022-12-04
| | | | | | | | | | | | | | It neglected to recurse to the subpath, meaning you'd get back a path identical to the input. This could produce wrong query results if the omission meant that the subpath fails to enforce some join clause it should be enforcing. We don't have a test case for this at the moment, but the code is obviously broken and the fix is equally obvious. Back-patch to v14 where Memoize was introduced. Richard Guo Discussion: https://postgr.es/m/CAMbWs4_R=ORpz=Lkn2q3ebPC5EuWyfZF+tmfCPVLBVK5W39mHA@mail.gmail.com
* Add missing MaterialPath support in reparameterize_path[_by_child].Tom Lane2022-12-04
| | | | | | | | | | These two functions failed to cover MaterialPath. That's not a fatal problem, but we can generate better plans in some cases if we support it. Tom Lane and Richard Guo Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us
* Update some comments that should've covered MERGEAlvaro Herrera2022-10-24
| | | | | | | Oversight in 7103ebb7aae8. Backpatch to 15. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs48gnDjZXq3-b56dVpQCNUJ5hD9kdtWN4QFwKCEapspNsA@mail.gmail.com
* Revert "Optimize order of GROUP BY keys".Tom Lane2022-10-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and several follow-on fixes. The idea of making a cost-based choice of the order of the sorting columns is not fundamentally unsound, but it requires cost information and data statistics that we don't really have. For example, relying on procost to distinguish the relative costs of different sort comparators is pretty pointless so long as most such comparator functions are labeled with cost 1.0. Moreover, estimating the number of comparisons done by Quicksort requires more than just an estimate of the number of distinct values in the input: you also need some idea of the sizes of the larger groups, if you want an estimate that's good to better than a factor of three or so. That's data that's often unknown or not very reliable. Worse, to arrive at estimates of the number of calls made to the lower-order-column comparison functions, the code needs to make estimates of the numbers of distinct values of multiple columns, which are necessarily even less trustworthy than per-column stats. Even if all the inputs are perfectly reliable, the cost algorithm as-implemented cannot offer useful information about how to order sorting columns beyond the point at which the average group size is estimated to drop to 1. Close inspection of the code added by db0d67db2 shows that there are also multiple small bugs. These could have been fixed, but there's not much point if we don't trust the estimates to be accurate in-principle. Finally, the changes in cost_sort's behavior made for very large changes (often a factor of 2 or so) in the cost estimates for all sorting operations, not only those for multi-column GROUP BY. That naturally changes plan choices in many situations, and there's precious little evidence to show that the changes are for the better. Given the above doubts about whether the new estimates are really trustworthy, it's hard to summon much confidence that these changes are better on the average. Since we're hard up against the release deadline for v15, let's revert these changes for now. We can always try again later. Note: in v15, I left T_PathKeyInfo in place in nodes.h even though it's unreferenced. Removing it would be an ABI break, and it seems a bit late in the release cycle for that. Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Improve performance of adjust_appendrel_attrs_multilevel.Tom Lane2022-08-18
| | | | | | | | | | | | | | | | | | | | | | | | The present implementations of adjust_appendrel_attrs_multilevel and its sibling adjust_child_relids_multilevel are very messy, because they work by reconstructing the relids of the child's immediate parent and then seeing if that's bms_equal to the relids of the target parent. Aside from being quite inefficient, this will not work with planned future changes to make joinrels' relid sets contain outer-join relids in addition to baserels. The whole thing can be solved at a stroke by adding explicit parent and top_parent links to child RelOptInfos, and making these functions work with RelOptInfo pointers instead of relids. Doing that is simpler for most callers, too. In my original version of this patch, I got rid of RelOptInfo.top_parent_relids on the grounds that it was now redundant. However, that adds a lot of code churn in places that otherwise would not need changing, and arguably the extra indirection needed to fetch top_parent->relids in those places costs something. So this version leaves that field in place. Discussion: https://postgr.es/m/553080.1657481916@sss.pgh.pa.us
* Estimate cost of elided SubqueryScan, Append, MergeAppend nodes better.Tom Lane2022-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | setrefs.c contains logic to discard no-op SubqueryScan nodes, that is, ones that have no qual to check and copy the input targetlist unchanged. (Formally it's not very nice to be applying such optimizations so late in the planner, but there are practical reasons for it; mostly that we can't unify relids between the subquery and the parent query until we flatten the rangetable during setrefs.c.) This behavior falsifies our previous cost estimates, since we would've charged cpu_tuple_cost per row just to pass data through the node. Most of the time that's little enough to not matter, but there are cases where this effect visibly changes the plan compared to what you would've gotten with no sub-select. To improve the situation, make the callers of cost_subqueryscan tell it whether they think the targetlist is trivial. cost_subqueryscan already has the qual list, so it can check the other half of the condition easily. It could make its own determination of tlist triviality too, but doing so would be repetitive (for callers that may call it several times) or unnecessarily expensive (for callers that can determine this more cheaply than a general test would do). This isn't a 100% solution, because createplan.c also does things that can falsify any earlier estimate of whether the tlist is trivial. However, it fixes nearly all cases in practice, if results for the regression tests are anything to go by. setrefs.c also contains logic to discard no-op Append and MergeAppend nodes. We did have knowledge of that behavior at costing time, but somebody failed to update it when a check on parallel-awareness was added to the setrefs.c logic. Fix that while we're here. These changes result in two minor changes in query plans shown in our regression tests. Neither is relevant to the purposes of its test case AFAICT. Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/2581077.1651703520@sss.pgh.pa.us
* Remove no-longer-used parameter for create_groupingsets_path().Tom Lane2022-07-01
| | | | | | | | numGroups is unused since commit b5635948a; let's get rid of it. XueJing Zhao, reviewed by Richard Guo Discussion: https://postgr.es/m/DM6PR05MB64923CC8B63A2CAF3B2E5D47B7AD9@DM6PR05MB6492.namprd05.prod.outlook.com
* Teach planner and executor about monotonic window funcsDavid Rowley2022-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Window functions such as row_number() always return a value higher than the previously returned value for tuples in any given window partition. Traditionally queries such as; SELECT * FROM ( SELECT *, row_number() over (order by c) rn FROM t ) t WHERE rn <= 10; were executed fairly inefficiently. Neither the query planner nor the executor knew that once rn made it to 11 that nothing further would match the outer query's WHERE clause. It would blindly continue until all tuples were exhausted from the subquery. Here we implement means to make the above execute more efficiently. This is done by way of adding a pg_proc.prosupport function to various of the built-in window functions and adding supporting code to allow the support function to inform the planner if the window function is monotonically increasing, monotonically decreasing, both or neither. The planner is then able to make use of that information and possibly allow the executor to short-circuit execution by way of adding a "run condition" to the WindowAgg to allow it to determine if some of its execution work can be skipped. This "run condition" is not like a normal filter. These run conditions are only built using quals comparing values to monotonic window functions. For monotonic increasing functions, quals making use of the btree operators for <, <= and = can be used (assuming the window function column is on the left). You can see here that once such a condition becomes false that a monotonic increasing function could never make it subsequently true again. For monotonically decreasing functions the >, >= and = btree operators for the given type can be used for run conditions. The best-case situation for this is when there is a single WindowAgg node without a PARTITION BY clause. Here when the run condition becomes false the WindowAgg node can simply return NULL. No more tuples will ever match the run condition. It's a little more complex when there is a PARTITION BY clause. In this case, we cannot return NULL as we must still process other partitions. To speed this case up we pull tuples from the outer plan to check if they're from the same partition and simply discard them if they are. When we find a tuple belonging to another partition we start processing as normal again until the run condition becomes false or we run out of tuples to process. When there are multiple WindowAgg nodes to evaluate then this complicates the situation. For intermediate WindowAggs we must ensure we always return all tuples to the calling node. Any filtering done could lead to incorrect results in WindowAgg nodes above. For all intermediate nodes, we can still save some work when the run condition becomes false. We've no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot reference the value of these and these tuples will not appear in the final result anyway. The savings here are small in comparison to what can be saved in the top-level WingowAgg, but still worthwhile. Intermediate WindowAgg nodes never filter out tuples, but here we change WindowAgg so that the top-level WindowAgg filters out tuples that don't match the intermediate WindowAgg node's run condition. Such filters appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node. Here we add prosupport functions to allow the above to work for; row_number(), rank(), dense_rank(), count(*) and count(expr). It appears technically possible to do the same for min() and max(), however, it seems unlikely to be useful enough, so that's not done here. Bump catversion Author: David Rowley Reviewed-by: Andy Fan, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
* Optimize order of GROUP BY keysTomas Vondra2022-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When evaluating a query with a multi-column GROUP BY clause using sort, the cost may be heavily dependent on the order in which the keys are compared when building the groups. Grouping does not imply any ordering, so we're allowed to compare the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg, we simply compared keys in the order as specified in the query. This commit explores alternative ordering of the keys, trying to find a cheaper one. In principle, we might generate grouping paths for all permutations of the keys, and leave the rest to the optimizer. But that might get very expensive, so we try to pick only a couple interesting orderings based on both local and global information. When planning the grouping path, we explore statistics (number of distinct values, cost of the comparison function) for the keys and reorder them to minimize comparison costs. Intuitively, it may be better to perform more expensive comparisons (for complex data types etc.) last, because maybe the cheaper comparisons will be enough. Similarly, the higher the cardinality of a key, the lower the probability we’ll need to compare more keys. The patch generates and costs various orderings, picking the cheapest ones. The ordering of group keys may interact with other parts of the query, some of which may not be known while planning the grouping. E.g. there may be an explicit ORDER BY clause, or some other ordering-dependent operation, higher up in the query, and using the same ordering may allow using either incremental sort or even eliminate the sort entirely. The patch generates orderings and picks those minimizing the comparison cost (for various pathkeys), and then adds orderings that might be useful for operations higher up in the plan (ORDER BY, etc.). Finally, it always keeps the ordering specified in the query, on the assumption the user might have additional insights. This introduces a new GUC enable_group_by_reordering, so that the optimization may be disabled if needed. The original patch was proposed by Teodor Sigaev, and later improved and reworked by Dmitry Dolgov. Reviews by a number of people, including me, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu. Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
* Add support for MERGE SQL commandAlvaro Herrera2022-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Allow Memoize to operate in binary comparison modeDavid Rowley2021-11-24
| | | | | | | | | | | | | | | | | | | | | | Memoize would always use the hash equality operator for the cache key types to determine if the current set of parameters were the same as some previously cached set. Certain types such as floating points where -0.0 and +0.0 differ in their binary representation but are classed as equal by the hash equality operator may cause problems as unless the join uses the same operator it's possible that whichever join operator is being used would be able to distinguish the two values. In which case we may accidentally return in the incorrect rows out of the cache. To fix this here we add a binary mode to Memoize to allow it to the current set of parameters to previously cached values by comparing bit-by-bit rather than logically using the hash equality operator. This binary mode is always used for LATERAL joins and it's used for normal joins when any of the join operators are not hashable. Reported-by: Tom Lane Author: David Rowley Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us Backpatch-through: 14, where Memoize was added
* Fix typos and grammar in code commentsMichael Paquier2021-09-27
| | | | | | | | | | | Several mistakes have piled in the code comments over the time, including incorrect grammar, function names and simple typos. This commit takes care of a portion of these. No backpatch is done as this is only cosmetic. Author: Justin Pryzby Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
* Clean up some code using "(expr) ? true : false"Michael Paquier2021-09-08
| | | | | | | | | | All the code paths simplified here were already using a boolean or used an expression that led to zero or one, making the extra bits unnecessary. Author: Justin Pryzby Reviewed-by: Tom Lane, Michael Paquier, Peter Smith Discussion: https://postgr.es/m/20210428182936.GE27406@telsasoft.com
* Change NestPath node to contain JoinPath nodePeter Eisentraut2021-08-08
| | | | | | | This makes the structure of all JoinPath-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
* Get rid of artificial restriction on hash table sizes on Windows.Tom Lane2021-07-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The point of introducing the hash_mem_multiplier GUC was to let users reproduce the old behavior of hash aggregation, i.e. that it could use more than work_mem at need. However, the implementation failed to get the job done on Win64, where work_mem is clamped to 2GB to protect various places that calculate memory sizes using "long int". As written, the same clamp was applied to hash_mem. This resulted in severe performance regressions for queries requiring a bit more than 2GB for hash aggregation, as they now spill to disk and there's no way to stop that. Getting rid of the work_mem restriction seems like a good idea, but it's a big job and could not conceivably be back-patched. However, there's only a fairly small number of places that are concerned with the hash_mem value, and it turns out to be possible to remove the restriction there without too much code churn or any ABI breaks. So, let's do that for now to fix the regression, and leave the larger task for another day. This patch does introduce a bit more infrastructure that should help with the larger task, namely pg_bitutils.h support for working with size_t values. Per gripe from Laurent Hasson. Back-patch to v13 where the behavior change came in. Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com
* Change the name of the Result Cache node to MemoizeDavid Rowley2021-07-14
| | | | | | | | | | | "Result Cache" was never a great name for this node, but nobody managed to come up with another name that anyone liked enough. That was until David Johnston mentioned "Node Memoization", which Tom Lane revised to just "Memoize". People seem to like "Memoize", so let's do the rename. Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us Backpatch-through: 14, where Result Cache was introduced