When creating an explicit Sort node for the outer path of a mergejoin,
we need to determine the number of presorted keys of the outer path to
decide whether explicit incremental sort can be applied. Currently,
this is done by repeatedly calling pathkeys_count_contained_in.
This patch caches the number of presorted outer pathkeys in MergePath,
allowing us to save several calls to pathkeys_count_contained_in. It
can be considered a complement to the changes in commit 828e94c9d.
Reported-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqvBireB_w6x8BN5txdvBEHxVgZBt=rUnpf5ww5P_E_ww@mail.gmail.com
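For illustration, a minimal sketch of the case this optimizes, using
hypothetical tables and settings (not from the commit): if the outer
path is presorted on (a) but the mergejoin wants (a, b), the cached
count of presorted keys (here 1) is what tells the planner an explicit
Incremental Sort suffices.
    -- hypothetical schema; the index presorts the outer path on (a)
    CREATE TABLE t1 (a int, b int);
    CREATE TABLE t2 (a int, b int);
    CREATE INDEX ON t1 (a);
    SET enable_hashjoin = off;  -- coax a mergejoin, for demonstration only
    SET enable_nestloop = off;
    EXPLAIN SELECT * FROM t1 JOIN t2 USING (a, b);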
|
Backpatch-through: 13
|
Remove the code for inserting flag columns in the inputs of a SetOp.
That was the only reason why there would be resjunk columns in a
set-operations plan tree, so we can get rid of some code that
supported that, too.
Get rid of choose_hashed_setop() in favor of building Paths for
the hashed and sorted alternatives, and letting them fight it out
within add_path().
Remove set_operation_ordered_results_useful(), which was giving wrong
answers due to examining the wrong ancestor node: we need to examine
the immediate SetOperationStmt parent not the topmost node. Instead
make each caller of recurse_set_operations() pass down the relevant
parent node. (This thinko seems to have led only to wasted planning
cycles and possibly-inferior plans, not wrong query answers. Perhaps
we should back-patch it, but I'm not doing so right now.)
Teach generate_nonunion_paths() to consider pre-sorted inputs for
sorted SetOps, rather than always generating a Sort node.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
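As a hedged illustration of that last point, with hypothetical tables
each indexed on the output column, the sorted-SetOp alternative can now
win on pre-sorted inputs without explicit Sort nodes:
    -- both hashed and sorted SetOp Paths are built and compared by add_path()
    EXPLAIN (COSTS OFF)
    SELECT a FROM t1 INTERSECT SELECT a FROM t2;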
|
The original design for set operations involved appending the two
input relations into one and adding a flag column that allows
distinguishing which side each row came from. Then the SetOp node
pries them apart again based on the flag. This is bizarre. The
only apparent reason to do it is that when sorting, we'd only need
one Sort node not two. But since sorting is at least O(N log N),
sorting all the data is actually worse than sorting each side
separately --- plus, we have no chance of taking advantage of
presorted input. On top of that, adding the flag column frequently
requires an additional projection step that adds cycles, and then
the Append node isn't free either. Let's get rid of all of that
and make the SetOp node have two separate children, using the
existing outerPlan/innerPlan infrastructure.
This initial patch re-implements nodeSetop.c and does a bare minimum
of work on the planner side to generate correctly-shaped plans.
In particular, I've tried not to change the cost estimates here,
so that the visible changes in the regression test results will only
involve removal of useless projection steps and not any changes in
whether to use sorted vs hashed mode.
For SORTED mode, we combine successive identical tuples from each
input into groups, and then merge-join the groups. The tuple
comparisons now use SortSupport instead of simple equality, but
the group-formation part should involve roughly the same number of
tuple comparisons as before. The cross-comparisons between left and
right groups probably add to that, but I'm not sure how to quantify
how many more comparisons we might need.
For HASHED mode, nodeSetop's logic is almost the same as before,
just refactored into two separate loops instead of one loop that
has an assumption that it will see all the left-hand inputs first.
In both modes, I added early-exit logic to not bother reading the
right-hand relation if the left-hand input is empty, since neither
INTERSECT nor EXCEPT modes can produce any output if the left input
is empty. This could have been done before in the hashed mode, but
not in sorted mode. Sorted mode can also stop as soon as it exhausts
the left input; any remaining right-hand tuples cannot have matches.
Also, this patch adds some infrastructure for detecting whether
child plan nodes all output the same type of tuple table slot.
If they do, the hash table logic can use slightly more efficient
code based on assuming that that's the input slot type it will see.
We'll make use of that infrastructure in other plan node types later.
Patch by me; thanks to Richard Guo and David Rowley for review.
Discussion: https://postgr.es/m/1850138.1731549611@sss.pgh.pa.us
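A small sketch of the early-exit rule, using hypothetical tables: once
the left input is exhausted without producing rows, the right input
never needs to be read.
    -- INTERSECT/EXCEPT yield nothing when the left input is empty
    SELECT a FROM empty_tab
    EXCEPT
    SELECT a FROM huge_tab;  -- huge_tab is no longer scanned in this case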
|
Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishable in cost seem to have fuzzily the same cost.
To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup cost and the
total cost. Hence, the path list is now sorted by disabled_nodes and
then by total_cost, instead of just by the latter, and likewise for
the partial path list. It is important that this number is a count
and not simply a Boolean; else, as soon as we're unable to respect
disabled path types in all portions of the path, we stop trying to
avoid them where we can.
Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.
Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since we do not postpone
path selection across subquery boundaries (see make_subplan).
Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley.
Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
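A minimal sketch of the new behavior, assuming a hypothetical table t
with an index on a: disabled-node counts now dominate the comparison.
    SET enable_seqscan = off;
    -- a seqscan path carries disabled_nodes = 1 and loses outright to
    -- the index path (disabled_nodes = 0), regardless of their raw costs
    EXPLAIN SELECT * FROM t WHERE a < 10;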
|
We don't currently support parameterized MergeAppend paths: there's
little use for an ordered path on the inside of a nestloop. Given
this, we can simplify create_merge_append_path by directly setting
param_info to NULL instead of calling get_appendrel_parampathinfo. We
can also simplify the Assert for child paths a little bit.
This change won't make any measurable difference in performance; it's
just for clarity's sake.
Author: Richard Guo
Reviewed-by: Alena Rybakina, Paul A Jungwirth
Discussion: https://postgr.es/m/CAMbWs4_n1bgH2nACMuGsXZct3KH6PBFS0tPdQsXdstRfyxTunQ@mail.gmail.com
|
In create_gather_merge_path, we should always guarantee that the
subpath is adequately ordered, and we do not add a Sort node in
createplan.c for a Gather Merge node. Therefore, the 'else' branch in
create_gather_merge_path, which computes the cost for a Sort node, is
redundant.
This patch removes the redundant code and emits an error if the
subpath is not sufficiently ordered. Meanwhile, this patch changes
the check for the subpath's pathkeys in create_gather_merge_plan to an
Assert.
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs48u=0bWf3epVtULjJ-=M9Hbkz+ieZQAOS=BfbXZFqbDCg@mail.gmail.com
|
In the case of a parallel plan, when computing the number of tuples
processed per worker, we divide the total number of tuples by the
parallel_divisor obtained from get_parallel_divisor(), which accounts
for the leader's contribution in addition to the number of workers.
Accordingly, when estimating the number of tuples for gather (merge)
nodes, we should multiply the number of tuples per worker by the same
parallel_divisor to reverse the division. However, currently we use
parallel_workers rather than parallel_divisor for the multiplication.
This could result in an underestimation of the number of tuples for
gather (merge) nodes, especially when there are fewer than four
workers.
This patch fixes this issue by using the same parallel_divisor for the
multiplication. There is one ensuing plan change in the regression
tests, but it looks reasonable and does not compromise its original
purpose of testing parallel-aware hash join.
In passing, this patch removes an unnecessary assignment for path.rows
in create_gather_merge_path, and fixes an uninitialized-variable issue
in generate_useful_gather_paths.
No backpatch as this could result in plan changes.
Author: Anthonin Bonnefoy
Reviewed-by: Rafia Sabih, Richard Guo
Discussion: https://postgr.es/m/CAO6_Xqr9+51NxgO=XospEkUeAg-p=EjAWmtpdcZwjRgGKJ53iA@mail.gmail.com
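A rough worked example, assuming get_parallel_divisor()'s conventional
leader contribution of 0.3 per worker (so the leader stops contributing
at four workers):
    total rows          = 1000
    parallel_workers    = 2
    parallel_divisor    = 2 + (1 - 0.3 * 2) = 2.4
    rows per worker     = 1000 / 2.4 ≈ 417
    old Gather estimate = 417 * 2   ≈ 833   (underestimate)
    new Gather estimate = 417 * 2.4 ≈ 1000  (division exactly reversed)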
|
In cost_memoize_rescan(), when calculating the hit_ratio using the calls
and ndistinct estimations, if the value that was set in
MemoizePath.calls had not been processed through clamp_row_est(), then it
was possible that it was set to some non-integer value which could result
in ndistinct being 1 higher than calls due to estimate_num_groups()
performing clamp_row_est() on its input_rows. This could result in
hit_ratio values slightly below 0.0, which would cause an Assert failure.
The value of MemoizePath.calls comes from the final parameter in the
create_memoize_path() function, of which we have only one true caller.
That caller passes outer_path->rows. All the core code I looked at
always seems to call clamp_row_est() on the Path.rows, so there might
have been no issues with any core Paths causing troubles here. The bug
report was about a CustomPath with a non-clamped row estimate.
The misbehavior as a result of this seems to be mostly limited to the
Assert() failing. Aside from that, it seems the Memoize costs would
just come out slightly higher than they should have, which is likely
fairly harmless.
Reported-by: Kohei KaiGai <kaigai@heterodb.com>
Discussion: https://postgr.es/m/CAOP8fzZnTU+N64UYJYogb1hN-5hFP+PwTb3m_cnGAD7EsQwrKw@mail.gmail.com
Reviewed-by: Richard Guo
Backpatch-through: 14, where Memoize was introduced
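A sketch of how the Assert could trip (the hit-ratio formula is
paraphrased here, not quoted from the code):
    calls     = 4.6   -- un-clamped value supplied by a CustomPath
    ndistinct = estimate_num_groups(..., input_rows = clamp_row_est(4.6) = 5, ...) = 5
    hit_ratio ≈ (calls - ndistinct) / calls = (4.6 - 5) / 4.6 < 0.0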
|
94985c210 added code to detect when WindowFuncs were monotonic and
allowed additional quals to be "pushed down" into the subquery to be
used as WindowClause runConditions in order to short-circuit execution
in nodeWindowAgg.c.
The Node representation of runConditions wasn't well selected and
because we do qual pushdown before planning the subquery, the planning
of the subquery could perform subquery pull-up of nested subqueries.
For WindowFuncs with args, the arguments could be changed after pushing
the qual down to the subquery.
This was made more difficult by the fact that the code duplicated the
WindowFunc inside an OpExpr to include in the WindowClause's runCondition
field. This could result in duplication of subqueries and a pull-up of
such a subquery could result in another initplan parameter being issued
for the 2nd version of the subplan. This could result in errors such as:
ERROR: WindowFunc not found in subplan target lists
To fix this, we change the node representation of these run conditions
and instead of storing an OpExpr containing the WindowFunc in a list
inside WindowClause, we now store a new node type named
WindowFuncRunCondition within a new field in the WindowFunc. These get
transformed into OpExprs later in planning once subquery pull-up has been
performed.
This problem did exist in v15 and v16, but that was fixed by 9d36b883b
and e5d20bbd.
Cat version bump due to new node type and modifying WindowFunc struct.
Bug: #18305
Reported-by: Zuming Jiang
Discussion: https://postgr.es/m/18305-33c49b4c830b37b3%40postgresql.org
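The class of query involved, with a hypothetical table t: the rn <= 10
qual becomes a run condition referencing the row_number() WindowFunc,
and that reference must survive any pull-up of the subquery.
    SELECT * FROM (
        SELECT a, row_number() OVER (ORDER BY a) rn FROM t
    ) s WHERE rn <= 10;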
|
This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE
actions, which operate on rows that exist in the target relation, but
not in the data source. These actions can execute UPDATE, DELETE, or
DO NOTHING sub-commands.
This is in contrast to already-supported WHEN NOT MATCHED actions,
which operate on rows that exist in the data source, but not in the
target relation. To make this distinction clearer, such actions may
now be written as WHEN NOT MATCHED BY TARGET.
Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is
equivalent to writing WHEN NOT MATCHED BY TARGET.
Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing.
Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com
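A short sketch of the new syntax (table and column names are
illustrative): rows present in the target but absent from the source
can now be deleted within the same MERGE.
    MERGE INTO target t
    USING source s ON t.id = s.id
    WHEN MATCHED THEN
        UPDATE SET val = s.val
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (id, val) VALUES (s.id, s.val);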
|
If we know the sort order of a CTE's output, and it is relevant
to the outer query, label the CTE's outer-query access path using
those pathkeys. This may enable optimizations such as avoiding
a sort in the outer query.
The code for hoisting pathkeys into the outer query already exists
for regular RTE_SUBQUERY subqueries, but it wasn't getting used for
CTEs, possibly out of concern for maintaining an optimization fence
between the CTE and the outer query. However, on the same arguments
used for commit f7816aec2, there seems no harm in letting the outer
query know what the inner query decided to do.
In support of this, we now remember the best Path as well as Plan
for each subquery for the rest of the planner run. There may be
future applications for having that at hand, and it surely costs
little to build one more List.
Richard Guo (minor mods by me)
Discussion: https://postgr.es/m/CAMbWs49xYd3f8CrE8-WW3--dV1zH_sDSDn-vs2DzHj81Wcnsew@mail.gmail.com
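For illustration, with a hypothetical table t: the CTE's internal
ORDER BY now labels the outer CTE-scan path with pathkeys, so the outer
ORDER BY may need no additional Sort.
    WITH c AS MATERIALIZED (SELECT * FROM t ORDER BY a)
    SELECT * FROM c ORDER BY a;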
|
Specifically, this commit reduces the memory consumed by the
SpecialJoinInfos that are allocated for child joins in
try_partitionwise_join() by freeing them at the end of creating paths
for each child join.
A SpecialJoinInfo allocated for a given child join is a copy of the
parent join's SpecialJoinInfo, which contains the translated copies
of the various Relids bitmapsets and semi_rhs_exprs, which is a List
of Nodes. The newly added freeing step frees the struct itself and
the various bitmapsets, but not semi_rhs_exprs, because there's no
handy function to free the memory of Node trees.
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/CAExHW5tHqEf3ASVqvFFcghYGPfpy7o3xnvhHwBGbJFMRH8KjNw@mail.gmail.com
|
When considering nestloop paths for individual partitions within
a partitionwise join, if the inner path is parameterized, it is
parameterized by the topmost parent of the outer rel, not the
corresponding outer rel itself. Therefore, we need to translate the
parameterization so that the inner path is parameterized by the
corresponding outer rel.
Up to now, we did this while generating join paths. However, that's
problematic because we must also translate some expressions that are
shared across all paths for a relation, such as restriction clauses
(kept in the RelOptInfo and/or IndexOptInfo) and TableSampleClauses
(kept in the RangeTblEntry). The existing code fails to translate
these at all, leading to wrong answers, odd failures such as
"variable not found in subplan target list", or executor crashes.
But we can't modify them during path generation, because that would
break things if we end up choosing some non-partitioned-join path.
So this patch postpones reparameterization of the inner path until
createplan.c, where it is safe to modify the referenced RangeTblEntry,
RelOptInfo or IndexOptInfo, because we have made a final choice of which
Path to use. We do still have to check during path generation that
the reparameterization will be possible. So we introduce a new
function path_is_reparameterizable_by_child() to detect that.
The duplication between path_is_reparameterizable_by_child() and
reparameterize_path_by_child() is a bit annoying, but there seems
no other good answer. A small benefit is that we can avoid building
useless reparameterized trees in cases where a non-partitioned join
is ultimately chosen. Also, reparameterize_path_by_child() can now
be allowed to scribble on the input paths, saving a few cycles.
This fix repairs the same problems previously addressed in the
back branches by commits 62f120203 et al.
Richard Guo, reviewed at various times by Ashutosh Bapat, Andrei
Lepikhov, Alena Rybakina, Robert Haas, and myself
Discussion: https://postgr.es/m/CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com
|
as determined by include-what-you-use (IWYU)
While IWYU also suggests *adding* a bunch of #include's (which is its
main purpose), this patch does not do that. In some cases, a more
specific #include replaces another less specific one.
Some manual adjustments of the automatic result:
- IWYU currently doesn't know about includes that provide global
variable declarations (like -Wmissing-variable-declarations), so
those includes are being kept manually.
- All includes for port(ability) headers are being kept for now, to
play it safe.
- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the
patch from exploding in size.
Note that this patch touches just *.c files, so nothing declared in
header files changes in hidden ways.
As a small example, in src/backend/access/transam/rmgr.c, some IWYU
pragma annotations are added to handle a special case there.
Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
|
build_child_join_sjinfo creates a derived SpecialJoinInfo in
the short-lived GEQO context, but afterwards the semi_rhs_exprs
from that may be used in a UniquePath for a child base relation.
This breaks the expectation that all base-relation-level structures
are in the planning-lifespan context, leading to use of a dangling
pointer with probable ensuing crash later on in create_unique_plan.
To fix, copy the expression trees when making a UniquePath.
Per bug #18360 from Alexander Lakhin. This has been broken since
partitionwise joins were added, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/18360-a23caf3157f34e62@postgresql.org
|
This is extracted from a larger patch to improve the UNION planner.
While working on that, I found myself having to check what the 'rows'
parameter is for. It's not obvious that passing a negative number is the
way to have the rows estimate calculated, and to find that out you need
to read code in create_append_path() and in cost_append().
Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
|
When building a MergeAppendPath which has child paths that are not
sorted correctly for the MergeAppend's sort order, we apply the cost of
sorting those paths to the MergeAppendPath costs.
Here we fix a bug where the number of tuples to be sorted was
effectively pg_class.reltuples rather than the expected number of rows
in the subpath. This penalized MergeAppend plans any time a filter was
present on the MergeAppend subpath, as the sort cost added was that of
sorting all tuples in the table rather than just the ones the path was
expected to return.
This did not affect UNION ALL type queries, as the RelOptInfo's tuples
is set from the subquery path's rows. It does affect MergeAppends used
for inheritance and partitioned tables.
This is a long-standing bug introduced when MergeAppend was first added
in 11cad29c9. No backpatch as this could result in plan changes.
Author: Alexander Kuzmenkov
Reviewed-by: Ashutosh Bapat, Aleksander Alekseev, David Rowley
Discussion: https://postgr.es/m/CALzhyqyhoXQDR-Usd_0HeWk%3DuqNLzoVeT8KhRoo%3DpV_KzgO3QQ%40mail.gmail.com
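Illustrative orders of magnitude only: with a 1M-row partition and a
filter expected to pass 1000 rows,
    old sort cost charged ~ sorting 1,000,000 tuples (pg_class.reltuples)
    corrected sort cost   ~ sorting 1,000 tuples (the subpath's row estimate)
which is why filtered MergeAppend plans were unfairly penalized before.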
|
These were not testing the same thing as the comparable Assert
in calc_nestloop_required_outer(), because we neglected to map
the given Paths' relids to top-level relids. When considering
a partition child join the latter is the correct thing to do.
This oversight is old, but since it's only an overly-weak Assert
check there doesn't seem to be much value in back-patching.
Richard Guo (with cosmetic changes and comment updates by me)
Discussion: https://postgr.es/m/CAMbWs49sqbe9GBZ8sy8dSfKRNURgicR85HX8vgzcgQsPF0XY1w@mail.gmail.com
|
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
Since C99, there can be a trailing comma after the last value in an
enum definition. A lot of new code has been introducing this style on
the fly. Some new patches are now taking an inconsistent approach to
this. Some add the last comma on the fly if they add a new last
value, some are trying to preserve the existing style in each place,
some are even dropping the last comma if there was one. We could
nudge this all in a consistent direction if we just add the trailing
commas everywhere once.
I omitted a few places where there was a fixed "last" value that will
always stay last. I also skipped the header files of libpq and ecpg,
in case people want to use those with older compilers. There were
also a small number of cases where the enum type wasn't used anywhere
(but the enum values were), which ended up confusing pgindent a bit,
so I left those alone.
Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
|
When an UPDATE/DELETE/MERGE's target table is an old-style
inheritance tree, it's possible for the parent to get excluded
from the plan while some children are not. (I believe this is
only possible if we can prove that a CHECK ... NO INHERIT
constraint on the parent contradicts the query WHERE clause,
so it's a very unusual case.) In such a case, ExecInitModifyTable
mistakenly concluded that the first surviving child is the target
table, leading to at least two bugs:
1. The wrong table's statement-level triggers would get fired.
2. In v16 and up, it was possible to fail with "invalid perminfoindex
0 in RTE with relid nnnn" due to the child RTE not having permissions
data included in the query plan. This was hard to reproduce reliably
because it did not occur unless the update triggered some non-HOT
index updates.
In v14 and up, this is easy to fix by defining ModifyTable.rootRelation
to be the parent RTE in plain inheritance as well as partitioned cases.
While the wrong-triggers bug also appears in older branches, the
relevant code in both the planner and executor is quite a bit
different, so it would take a good deal of effort to develop and
test a suitable patch. Given the lack of field complaints about the
trigger issue, I'll desist for now. (Patching v11 for this seems
unwise anyway, given that it will have no more releases after next
month.)
Per bug #18147 from Hans Buschmann.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/18147-6fc796538913ee88@postgresql.org
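A sketch of the unusual case (hypothetical schema): the NO INHERIT
constraint contradicts the WHERE clause, so the parent is provably
empty and gets excluded while the child survives.
    CREATE TABLE parent (a int CHECK (a < 0) NO INHERIT);
    CREATE TABLE child () INHERITS (parent);
    -- parent is pruned; rootRelation must still identify parent, not child
    UPDATE parent SET a = a + 1 WHERE a > 0;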
|
1349d2790 added code to adjust the PlannerInfo.group_pathkeys so that
ORDER BY / DISTINCT aggregate functions could obtain pre-sorted inputs
to allow faster execution. That commit forgot to adjust the pathkeys in
create_agg_path(). Some code in there assumed that it was always fine
to make the AggPath's pathkeys the same as its subpath's. That seems to
have been ok up until 1349d2790, but since that commit adds pathkeys for
columns which are within the aggregate function, those columns won't be
available above the aggregate node. This can result in "could not find
pathkey item to sort" during create_plan().
The fix here is to strip off the additional pathkeys added by
adjust_group_pathkeys_for_groupagg(). It seems that the pathkeys here
will only ever be group_pathkeys, so all we need to do is check if the
length of the pathkey list is longer than the num_groupby_pathkeys and
get rid of the additional ones only if we see extras.
Reported-by: Justin Pryzby
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/ZQhYYRhUxpW3PSf9%40telsasoft.com
Backpatch-through: 16, where 1349d2790 was introduced
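The class of query affected, with a hypothetical table t: a pathkey on
b is added so the aggregate sees presorted input, but b is not
available above the Agg node and must be stripped from the AggPath's
pathkeys.
    SELECT a, array_agg(b ORDER BY b) FROM t GROUP BY a;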
|
quals.
This was disabled in commit 6f80a8d9c due to the lack of support for
handling of pseudoconstant quals assigned to replaced joins in
createplan.c. To re-allow it, this patch adds the support by 1)
modifying the ForeignPath and CustomPath structs so that if they
represent foreign and custom scans replacing a join with a scan, they
store the list of RestrictInfo nodes to apply to the join, as in
JoinPaths, and by 2) modifying create_scan_plan() in createplan.c so
that it uses that list in that case, instead of the baserestrictinfo
list, to get pseudoconstant quals assigned to the join, as mentioned in
the commit message for that commit.
Important item for the release notes: this is non-backwards-compatible
since it modifies the ForeignPath and CustomPath structs, as mentioned
above, and changes the argument lists for FDW helper functions
create_foreignscan_path(), create_foreign_join_path(), and
create_foreign_upper_path().
Richard Guo, with some additional changes by me, reviewed by Nishant
Sharma, Suraj Kharage, and Richard Guo.
Discussion: https://postgr.es/m/CADrsxdbcN1vejBaf8a%2BQhrZY5PXL-04mCd4GDu6qm6FigDZd6Q%40mail.gmail.com
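For illustration, a pseudoconstant qual (it contains no Vars and no
volatile functions, so it is evaluated once per scan) attached to a
join of two hypothetical foreign tables:
    SELECT * FROM ft1 JOIN ft2 ON ft1.a = ft2.a
    WHERE now() > '2000-01-01'::timestamptz;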
|
Here we adjust the costs for WindowAggs so that they properly take into
account how much of their subnode they must read before outputting the
first row. Without this, we always assumed that the startup cost for the
WindowAgg was not much more expensive than the startup cost of its
subnode, however, that's going to be completely wrong in many cases. The
WindowAgg may have to read *all* of its subnode to output a single row
with certain window bound options.
Here we estimate how many rows we'll need to read from the WindowAgg's
subnode and proportionally add more of the subnode's run costs onto the
WindowAgg's startup costs according to how much of it we expect to have to
read in order to produce the first WindowAgg row.
The reason this is more important than we might have initially thought is
that we may end up making use of a path from the lower planner that works
well as a cheap startup plan when the query has a LIMIT clause, however,
the WindowAgg might mean we need to read far more rows than what the LIMIT
specifies.
No backpatch on this so as not to cause plan changes in released
versions.
Bug: #17862
Reported-by: Tim Palmer
Author: David Rowley
Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/17862-1ab8f74b0f7b0611@postgresql.org
Discussion: https://postgr.es/m/CAApHDvrB0S5BMv+0-wTTqWFE-BJ0noWqTnDu9QQfjZ2VSpLv_g@mail.gmail.com
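A minimal sketch with a hypothetical table t: under this frame,
producing the first row requires reading the whole input, so the
WindowAgg's startup cost now approaches the subnode's total cost even
with LIMIT 1.
    SELECT a, count(*) OVER (ORDER BY a
                             ROWS BETWEEN CURRENT ROW
                                      AND UNBOUNDED FOLLOWING)
    FROM t
    LIMIT 1;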
|
If the plan itself is parallel-safe, and the initPlans are too,
there's no reason anymore to prevent the plan from being marked
parallel-safe. That restriction (dating to commit ab77a5a45) was
really a special case of the fact that we couldn't transmit subplans
to parallel workers at all. We fixed that in commit 5e6d8d2bb and
follow-ons, but this case never got addressed.
We still forbid attaching initPlans to a Gather node that's
inserted pursuant to debug_parallel_query = regress. That's because,
when we hide the Gather from EXPLAIN output, we'd hide the initPlans
too, causing cosmetic regression diffs. It seems inadvisable to
kluge EXPLAIN to the extent required to make the output look the
same, so just don't do it in that case.
Along the way, this also takes care of some sloppiness about updating
path costs to match when we move initplans from one place to another
during createplan.c and setrefs.c. Since all the planning decisions
are already made by that point, this is just cosmetic; but it seems
good to keep EXPLAIN output consistent with where the initplans are.
The diff in query_planner() might be worth remarking on. I found that
one because after fixing things to allow parallel-safe initplans, one
partition_prune test case changed plans (as shown in the patch) ---
but only when debug_parallel_query was active. The reason proved to
be that we only bothered to mark Result nodes as potentially
parallel-safe when debug_parallel_query is on. This neglects the fact
that parallel-safety may be of interest for a sub-query even though
the Result itself doesn't parallelize.
Discussion: https://postgr.es/m/1129530.1681317832@sss.pgh.pa.us
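A sketch with hypothetical tables: the initPlan computing max(b) no
longer disqualifies the outer scan from parallelism, provided both
parts are parallel-safe.
    SELECT * FROM big WHERE a = (SELECT max(b) FROM small);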
|
Our policy since commit ab77a5a45 has been that a plan node having
any initplans is automatically not parallel-safe. (This could be
relaxed, but not today.) clean_up_removed_plan_level neglected
this, and could attach initplans to a parallel-safe child plan
node without clearing the plan's parallel-safe flag. That could
lead to "subplan was not initialized" errors at runtime, in case
an initplan referenced another one and only the referencing one
got transmitted to parallel workers.
The fix in clean_up_removed_plan_level is trivial enough.
materialize_finished_plan also moves initplans from one node
to another, but it's okay because it already copies the source
node's parallel_safe flag. The other place that does this kind
of thing is standard_planner's hack to inject a top-level Gather
when debug_parallel_query is active. But that's actually dead
code given that we're correctly enforcing the "initplans aren't
parallel safe" rule, so just replace it with an Assert that
there are no initplans.
Also improve some related comments.
Normally we'd add a regression test case for this sort of bug.
The mistake itself is already reached by existing tests, but there
is accidentally no visible problem. The only known test case that
creates an actual failure seems too indirect and fragile to justify
keeping it as a regression test (not least because it fails to fail
in v11, though the bug is clearly present there too).
Per report from Justin Pryzby. Back-patch to all supported branches.
Discussion: https://postgr.es/m/ZDVt6MaNWkRDO1LQ@telsasoft.com
|
create_append_path() would only apply get_baserel_parampathinfo
when the path is for a partitioned table, but it's also potentially
useful for paths for UNION ALL appendrels. Specifically, that
supports building a Memoize path atop this one.
While we're in the vicinity, delete some dead code in
create_merge_append_plan(): there's no need for it to support
parameterized MergeAppend paths, and it doesn't look like that
is going to change anytime soon. It'll be easy enough to undo
this when/if it becomes useful.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_ABSu4PWG2rE1q10tJugEXHWgru3U8dAgkoFvgrb6aEA@mail.gmail.com
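Illustrative, with hypothetical tables: the parameterized Append built
for the UNION ALL appendrel can now carry parampathinfo, which is what
permits a Memoize node above it on the inside of a nestloop.
    SELECT * FROM t,
        LATERAL (SELECT * FROM u1 WHERE u1.a = t.a
                 UNION ALL
                 SELECT * FROM u2 WHERE u2.a = t.a) u;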
|
These are all dead code now that it's done centrally.
Patch by me; thanks to Nathan Bossart and Richard Guo for review.
Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
|
Here remove some dead code from heapgettup() and heapgettup_pagemode()
which was trying to support NoMovementScanDirection scans. This code can
never be reached as standard_ExecutorRun() never calls ExecutePlan with
NoMovementScanDirection.
Additionally, plans which were scanning an unordered index would use
NoMovementScanDirection rather than ForwardScanDirection. There was no
real need for this, so here we adjust this so we use ForwardScanDirection
for unordered index scans. A comment in pathnodes.h claimed that
NoMovementScanDirection was used for PathKey reasons, but if that was
true, it no longer is, per code in build_index_paths().
This does change the non-text format of the EXPLAIN output so that
unordered index scans now have a "Forward" scan direction rather than
"NoMovement". The text format of EXPLAIN has not changed.
Author: Melanie Plageman
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
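A sketch of why equal() Vars may not be the same value (hypothetical
tables): t2.b in the join clause is evaluated below the join and is
never NULL there, while t2.b in the select list can be nulled by the
left join; the two Vars now carry different outer-join bitmapsets.
    SELECT t2.b FROM t1 LEFT JOIN t2 ON t1.a = t2.a AND t2.b > 0;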
|
Backpatch-through: 11
|
A couple of places weren't up to speed for this. By sheer good
luck, we didn't fail but just selected a non-memoized join plan,
at least in the test case we have. Nonetheless, it's a bug,
and I'm not quite sure that it couldn't have worse consequences
in other examples. So back-patch to v14 where Memoize came in.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs48GkNom272sfp0-WeD6_0HSR19BJ4H1c9ZKSfbVnJsvRg@mail.gmail.com
|
It neglected to recurse to the subpath, meaning you'd get back
a path identical to the input. This could produce wrong query
results if the omission meant that the subpath fails to enforce
some join clause it should be enforcing. We don't have a test
case for this at the moment, but the code is obviously broken
and the fix is equally obvious. Back-patch to v14 where
Memoize was introduced.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_R=ORpz=Lkn2q3ebPC5EuWyfZF+tmfCPVLBVK5W39mHA@mail.gmail.com
|
These two functions failed to cover MaterialPath. That's not a
fatal problem, but we can generate better plans in some cases
if we support it.
Tom Lane and Richard Guo
Discussion: https://postgr.es/m/1854233.1669949723@sss.pgh.pa.us
|
Oversight in 7103ebb7aae8. Backpatch to 15.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48gnDjZXq3-b56dVpQCNUJ5hD9kdtWN4QFwKCEapspNsA@mail.gmail.com
|
This reverts commit db0d67db2401eb6238ccc04c6407a4fd4f985832 and
several follow-on fixes. The idea of making a cost-based choice
of the order of the sorting columns is not fundamentally unsound,
but it requires cost information and data statistics that we don't
really have. For example, relying on procost to distinguish the
relative costs of different sort comparators is pretty pointless
so long as most such comparator functions are labeled with cost 1.0.
Moreover, estimating the number of comparisons done by Quicksort
requires more than just an estimate of the number of distinct values
in the input: you also need some idea of the sizes of the larger
groups, if you want an estimate that's good to better than a factor of
three or so. That's data that's often unknown or not very reliable.
Worse, to arrive at estimates of the number of calls made to the
lower-order-column comparison functions, the code needs to make
estimates of the numbers of distinct values of multiple columns,
which are necessarily even less trustworthy than per-column stats.
Even if all the inputs are perfectly reliable, the cost algorithm
as-implemented cannot offer useful information about how to order
sorting columns beyond the point at which the average group size
is estimated to drop to 1.
Close inspection of the code added by db0d67db2 shows that there
are also multiple small bugs. These could have been fixed, but
there's not much point if we don't trust the estimates to be
accurate in principle.
Finally, the changes in cost_sort's behavior made for very large
changes (often a factor of 2 or so) in the cost estimates for all
sorting operations, not only those for multi-column GROUP BY.
That naturally changes plan choices in many situations, and there's
precious little evidence to show that the changes are for the better.
Given the above doubts about whether the new estimates are really
trustworthy, it's hard to summon much confidence that these changes
are better on the average.
Since we're hard up against the release deadline for v15, let's
revert these changes for now. We can always try again later.
Note: in v15, I left T_PathKeyInfo in place in nodes.h even though
it's unreferenced. Removing it would be an ABI break, and it seems
a bit late in the release cycle for that.
Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
|
The present implementations of adjust_appendrel_attrs_multilevel and
its sibling adjust_child_relids_multilevel are very messy, because
they work by reconstructing the relids of the child's immediate
parent and then seeing if that's bms_equal to the relids of the
target parent. Aside from being quite inefficient, this will not
work with planned future changes to make joinrels' relid sets
contain outer-join relids in addition to baserels.
The whole thing can be solved at a stroke by adding explicit parent
and top_parent links to child RelOptInfos, and making these functions
work with RelOptInfo pointers instead of relids. Doing that is
simpler for most callers, too.
In my original version of this patch, I got rid of
RelOptInfo.top_parent_relids on the grounds that it was now redundant.
However, that adds a lot of code churn in places that otherwise would
not need changing, and arguably the extra indirection needed to fetch
top_parent->relids in those places costs something. So this version
leaves that field in place.
Discussion: https://postgr.es/m/553080.1657481916@sss.pgh.pa.us
|
setrefs.c contains logic to discard no-op SubqueryScan nodes, that is,
ones that have no qual to check and copy the input targetlist unchanged.
(Formally it's not very nice to be applying such optimizations so late
in the planner, but there are practical reasons for it; mostly that we
can't unify relids between the subquery and the parent query until we
flatten the rangetable during setrefs.c.) This behavior falsifies our
previous cost estimates, since we would've charged cpu_tuple_cost per
row just to pass data through the node. Most of the time that's little
enough to not matter, but there are cases where this effect visibly
changes the plan compared to what you would've gotten with no
sub-select.
To improve the situation, make the callers of cost_subqueryscan tell
it whether they think the targetlist is trivial. cost_subqueryscan
already has the qual list, so it can check the other half of the
condition easily. It could make its own determination of tlist
triviality too, but doing so would be repetitive (for callers that
may call it several times) or unnecessarily expensive (for callers
that can determine this more cheaply than a general test would do).
This isn't a 100% solution, because createplan.c also does things
that can falsify any earlier estimate of whether the tlist is
trivial. However, it fixes nearly all cases in practice, if results
for the regression tests are anything to go by.
setrefs.c also contains logic to discard no-op Append and MergeAppend
nodes. We did have knowledge of that behavior at costing time, but
somebody failed to update it when a check on parallel-awareness was
added to the setrefs.c logic. Fix that while we're here.
These changes result in two minor changes in query plans shown in
our regression tests. Neither is relevant to the purposes of its
test case AFAICT.
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/2581077.1651703520@sss.pgh.pa.us
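A minimal example of a no-op SubqueryScan (hypothetical table t): no
qual, targetlist passed through unchanged. setrefs.c discards the
node, and costing now anticipates that instead of charging
cpu_tuple_cost per row.
    SELECT * FROM (SELECT * FROM t) AS s;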
|
numGroups is unused since commit b5635948a; let's get rid of it.
XueJing Zhao, reviewed by Richard Guo
Discussion: https://postgr.es/m/DM6PR05MB64923CC8B63A2CAF3B2E5D47B7AD9@DM6PR05MB6492.namprd05.prod.outlook.com
|
Window functions such as row_number() always return a value higher than
the previously returned value for tuples in any given window partition.
Traditionally queries such as:
    SELECT * FROM (
        SELECT *, row_number() over (order by c) rn
        FROM t
    ) t WHERE rn <= 10;
were executed fairly inefficiently. Neither the query planner nor the
executor knew that once rn made it to 11 that nothing further would match
the outer query's WHERE clause. It would blindly continue until all
tuples were exhausted from the subquery.
Here we implement means to make the above execute more efficiently.
This is done by way of adding a pg_proc.prosupport function to various of
the built-in window functions and adding supporting code to allow the
support function to inform the planner if the window function is
monotonically increasing, monotonically decreasing, both or neither. The
planner is then able to make use of that information and possibly allow
the executor to short-circuit execution by way of adding a "run condition"
to the WindowAgg to allow it to determine if some of its execution work
can be skipped.
This "run condition" is not like a normal filter. These run conditions
are only built using quals comparing values to monotonic window functions.
For monotonic increasing functions, quals making use of the btree
operators for <, <= and = can be used (assuming the window function column
is on the left). You can see here that once such a condition becomes
false, a monotonically increasing function could never make it
subsequently true again. For monotonically decreasing functions the >,
>= and = btree
operators for the given type can be used for run conditions.
The best-case situation for this is when there is a single WindowAgg node
without a PARTITION BY clause. Here when the run condition becomes false
the WindowAgg node can simply return NULL. No more tuples will ever match
the run condition. It's a little more complex when there is a PARTITION
BY clause. In this case, we cannot return NULL as we must still process
other partitions. To speed this case up we pull tuples from the outer
plan to check if they're from the same partition and simply discard them
if they are. When we find a tuple belonging to another partition we start
processing as normal again until the run condition becomes false or we run
out of tuples to process.
When there are multiple WindowAgg nodes to evaluate then this complicates
the situation. For intermediate WindowAggs we must ensure we always
return all tuples to the calling node. Any filtering done could lead to
incorrect results in WindowAgg nodes above. For all intermediate nodes,
we can still save some work when the run condition becomes false. We've
no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot
reference the value of these and these tuples will not appear in the final
result anyway. The savings here are small in comparison to what can be
saved in the top-level WindowAgg, but still worthwhile.
Intermediate WindowAgg nodes never filter out tuples, but here we change
WindowAgg so that the top-level WindowAgg filters out tuples that don't
match the intermediate WindowAgg node's run condition. Such filters
appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node.
Here we add prosupport functions to allow the above to work for:
row_number(), rank(), dense_rank(), count(*) and count(expr). It appears
technically possible to do the same for min() and max(), however, it seems
unlikely to be useful enough, so that's not done here.
Bump catversion
Author: David Rowley
Reviewed-by: Andy Fan, Zhihong Yu
Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
|
When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may be heavily dependent on the order in which the keys are
compared when building the groups. Grouping does not imply any ordering,
so we're allowed to compare the keys in arbitrary order, and a Hash Agg
leverages this. But for Group Agg, we simply compared keys in the order
as specified in the query. This commit explores alternative ordering of
the keys, trying to find a cheaper one.
In principle, we might generate grouping paths for all permutations of
the keys, and leave the rest to the optimizer. But that might get very
expensive, so we try to pick only a couple interesting orderings based
on both local and global information.
When planning the grouping path, we explore statistics (number of
distinct values, cost of the comparison function) for the keys and
reorder them to minimize comparison costs. Intuitively, it may be better
to perform more expensive comparisons (for complex data types etc.)
last, because maybe the cheaper comparisons will be enough. Similarly,
the higher the cardinality of a key, the lower the probability we'll
need to compare more keys. The patch generates and costs various
orderings, picking the cheapest ones.
The ordering of group keys may interact with other parts of the query,
some of which may not be known while planning the grouping. E.g. there
may be an explicit ORDER BY clause, or some other ordering-dependent
operation, higher up in the query, and using the same ordering may allow
using either incremental sort or even eliminate the sort entirely.
The patch generates orderings and picks those minimizing the comparison
cost (for various pathkeys), and then adds orderings that might be
useful for operations higher up in the plan (ORDER BY, etc.). Finally,
it always keeps the ordering specified in the query, on the assumption
the user might have additional insights.
This introduces a new GUC enable_group_by_reordering, so that the
optimization may be disabled if needed.
The original patch was proposed by Teodor Sigaev, and later improved and
reworked by Dmitry Dolgov. Reviews by a number of people, including me,
Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.
Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
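Illustrative, with a hypothetical table t: since grouping implies no
output ordering, the planner may now compare the keys as (b, a)
internally if that ordering is estimated to be cheaper than the one
written in the query.
    SET enable_group_by_reordering = on;  -- the new GUC
    SELECT a, b, count(*) FROM t GROUP BY a, b;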
|
MERGE performs actions that modify rows in the target table using a
source table or query. MERGE provides a single SQL statement that can
conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise
require multiple PL statements. For example,
    MERGE INTO target AS t
    USING source AS s
    ON t.tid = s.sid
    WHEN MATCHED AND t.balance > s.delta THEN
        UPDATE SET balance = t.balance - s.delta
    WHEN MATCHED THEN
        DELETE
    WHEN NOT MATCHED AND s.delta > 0 THEN
        INSERT VALUES (s.sid, s.delta)
    WHEN NOT MATCHED THEN
        DO NOTHING;
MERGE works with regular tables, partitioned tables and inheritance
hierarchies, including column and row security enforcement, as well as
support for row and statement triggers and transition tables therein.
MERGE is optimized for OLTP and is parameterizable, though also useful
for large scale ETL/ELT. MERGE is not intended to be used in preference
to existing single SQL commands for INSERT, UPDATE or DELETE since there
is some overhead. MERGE can be used from PL/pgSQL.
MERGE does not support targeting updatable views or foreign tables, and
RETURNING clauses are not allowed either. These limitations are likely
fixable with sufficient effort. Rewrite rules are also not supported,
but it's not clear that we'd want to support them.
Author: Pavan Deolasee <pavan.deolasee@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Simon Riggs <simon.riggs@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
|
Backpatch-through: 10
|
Memoize would always use the hash equality operator for the cache key
types to determine if the current set of parameters were the same as some
previously cached set. Certain types, such as floating point, where
-0.0 and +0.0 differ in their binary representation but are classed as
equal by the hash equality operator, may cause problems: unless the
join uses that same operator, it's possible that the join operator
being used could distinguish the two values, in which case we may
accidentally return the incorrect rows out of the cache.
To fix this, here we add a binary mode to Memoize that allows it to
compare the current set of parameters to previously cached values
bit-by-bit, rather than logically using the hash equality operator.
binary mode is always used for LATERAL joins and it's used for normal
joins when any of the join operators are not hashable.
Reported-by: Tom Lane
Author: David Rowley
Discussion: https://postgr.es/m/3004308.1632952496@sss.pgh.pa.us
Backpatch-through: 14, where Memoize was added
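The hazard in miniature: the two float8 zeros hash and compare as
equal, yet their binary representations (visible in their text output)
differ.
    SELECT '-0.0'::float8 = '0.0'::float8;             -- true
    SELECT '-0.0'::float8::text, '0.0'::float8::text;  -- '-0' vs '0'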
|
Several mistakes have piled in the code comments over the time,
including incorrect grammar, function names and simple typos. This
commit takes care of a portion of these.
No backpatch is done as this is only cosmetic.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
|
All the code paths simplified here were already using a boolean or used
an expression that led to zero or one, making the extra bits
unnecessary.
Author: Justin Pryzby
Reviewed-by: Tom Lane, Michael Paquier, Peter Smith
Discussion: https://postgr.es/m/20210428182936.GE27406@telsasoft.com
|
This makes the structure of all JoinPath-derived nodes the same,
independent of whether they have additional fields.
Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
|
The point of introducing the hash_mem_multiplier GUC was to let users
reproduce the old behavior of hash aggregation, i.e. that it could use
more than work_mem at need. However, the implementation failed to get
the job done on Win64, where work_mem is clamped to 2GB to protect
various places that calculate memory sizes using "long int". As
written, the same clamp was applied to hash_mem. This resulted in
severe performance regressions for queries requiring a bit more than
2GB for hash aggregation, as they now spill to disk and there's no
way to stop that.
Getting rid of the work_mem restriction seems like a good idea, but
it's a big job and could not conceivably be back-patched. However,
there's only a fairly small number of places that are concerned with
the hash_mem value, and it turns out to be possible to remove the
restriction there without too much code churn or any ABI breaks.
So, let's do that for now to fix the regression, and leave the
larger task for another day.
This patch does introduce a bit more infrastructure that should help
with the larger task, namely pg_bitutils.h support for working with
size_t values.
Per gripe from Laurent Hasson. Back-patch to v13 where the
behavior change came in.
Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us
Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com
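A rough sketch of the regression on Win64, where work_mem is clamped to
2GB: previously hash_mem = work_mem * hash_mem_multiplier was clamped
as well, so hash aggregation needing slightly more than 2GB always
spilled; after this fix the product may exceed the clamp.
    SET work_mem = '2GB';
    SET hash_mem_multiplier = 2.0;  -- effective hash ceiling ~4GB after the fix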
|
"Result Cache" was never a great name for this node, but nobody managed
to come up with another name that anyone liked enough. That was until
David Johnston mentioned "Node Memoization", which Tom Lane revised to
just "Memoize". People seem to like "Memoize", so let's do the rename.
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us
Backpatch-through: 14, where Result Cache was introduced
|