| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pathkeys_useful_for_ordering() contained some needless checks to return
0 when either root->query_pathkeys or pathkeys lists were empty. This is
already handled by pathkeys_count_contained_in(), so let's have it do the
work instead of having redundant checks.
Similarly, in pathkeys_useful_for_grouping(), checking pathkeys is an
empty list just before looping over it isn't required. Technically,
neither is the list empty check for group_pathkeys, but I felt a bit
more work would have to be done to get the equivalent behavior if we'd
left it up to the foreach loop to call list_member_ptr().
This was noticed by Andy while he was reviewing a patch to improve the
UNION planner. Since that patch adds another function similar to
pathkeys_useful_for_ordering() and since I wasn't planning to copy these
redundant checks over to the new function, let's adjust the existing
code so that both functions will be consistent.
Author: Andy Fan
Discussion: https://postgr.es/m/87o7cti48f.fsf@163.com
|
|
|
|
|
|
|
|
|
|
| |
This is extracted from a larger patch to improve the UNION planner.
While working on that, I found myself having to check what the 'rows'
parameter is for. It's not obvious that passing a negative number is the
way to have the rows estimate calculated and to find that out you need
to read code in create_append_path() and in cost_append().
Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
group_keys_reorder_by_pathkeys() function searched for matching pathkeys
within root->group_pathkeys. That could lead to picking an aggregate pathkey
and using its pathkey->pk_eclass->ec_sortref as an argument of
get_sortgroupref_clause_noerr(). Given that ec_sortref of an aggregate pathkey
references aggregate targetlist not query targetlist, this leads to incorrect
query optimization.
Fix this by looking for matching pathkeys only within the first
num_groupby_pathkeys pathkeys.
Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwY3Ek%3DcLThgd8FdaSc5JRDVt0FaV00gMcWra%2BTAR4gGUw%40mail.gmail.com
Author: Andrei Lepikhov, Alexander Korotkov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A comment in grouping_planner() claimed that the PlannerInfo
upper_targets array was not used in core code. However, the code that
generated the paths for the UPPERREL_PARTIAL_DISTINCT rel made that
comment untrue.
Here we adjust the create_distinct_paths() function signature to pass
down the PathTarget the same as is done for create_grouping_paths(),
thus making the aforementioned comment true again.
In passing adjust the order of the upper_targets[] assignments. These
seem to be following the reverse enum order apart from
UPPERREL_PARTIAL_DISTINCT.
Also, update the header comment for generate_gather_paths() to mention
the function is also used to create gather paths for partial distinct
paths.
Author: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAMbWs48u9VoVOouJsys1qOaC9WVGVmBa+wT1dx8KvxF5GPzezA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we adjust the partial path generation for parallel DISTINCT queries
to add Sort nodes on top of any unsorted partial distinct paths.
This increases the likelihood of the planner pushing a Sort below a Gather
Merge which enables the final phase of the parallel distinct to be
implemented using a Unique node in more cases.
Sorting the partial distinct paths is particularly useful when the
DISTINCT query has an ORDER BY and LIMIT clause as this can allow cheaper
plans by having the workers Hash Aggregate then Sort before feeding the
results into the Gather Merge. The non-parallel portion of the plan then
becomes very cheap as it leaves only Unique and Limit to do in the leader
process.
Author: Richard Guo
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAMbWs48u9VoVOouJsys1qOaC9WVGVmBa+wT1dx8KvxF5GPzezA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When building a MergeAppendPath which has child paths that are not
sorted correctly for the MergeAppend's sort order, we apply the cost of
sorting those paths to the MergeAppendPath costs.
Here we fix a bug where the number of tuples specified that needed to be
sorted was effectively pg_class.reltuples rather than the number of
expected row in the subpath. This effectively penalizes MergeAppend
plans any time any filter is present on the MergeAppend subpath as the
sort cost added is to sort all tuples in the table rather than just the
ones expected the path to return.
This did not affect UNION ALL type queries as the RelOptInfo tuples is
set from the subquery's path rows. It does affect MergeAppends uses for
inheritance and partitioned tables.
This is a long-standing bug introduced when MergeAppend was first added
in 11cad29c9. No backpatch as this could result in plan changes.
Author: Alexander Kuzmenkov
Reviewed-by: Ashutosh Bapat, Aleksander Alekseev, David Rowley
Discussion: https://postgr.es/m/CALzhyqyhoXQDR-Usd_0HeWk%3DuqNLzoVeT8KhRoo%3DpV_KzgO3QQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to what was done in 5543677ec for non-parallel DISTINCT, apply
the same optimization when the distinct_pathkeys are empty for the
partial paths too.
This can be faster than the non-parallel version when the first row
matching the WHERE clause of the query takes a while to find. Parallel
workers could speed that process up considerably.
Author: Richard Guo
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAMbWs49JC0qvfUbzs-TVzgMpSSBiMJ_6sN=BaA9iohBgYkr=LA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we consolidate the generation of partial sort and partial incremental
sort paths in a similar way to what was done in 4a29eabd1. Since the cost
penalty for incremental sort was removed by that commit, there's no
point in creating a sort path on the cheapest partial path if an
incremental sort could be done instead.
This has the added benefit of reducing the amount of code required to
build these paths.
Author: Richard Guo
Reviewed-by: Etsuro Fujita, Shubham Khanna, David Rowley
Discussion: https://postgr.es/m/CAMbWs49PaKxBZU9cN7k3DKB7id+YfGfOfS9H_Fo5tkqPMt=fDg@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Oversight in 2489d76c4. Preliminary analysis suggests that the
problem may be unreachable --- but if we did have instances of
the same column with different varnullingrels, we'd surely need
to treat them as different Params.
Discussion: https://postgr.es/m/412552.1706203379@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was possible when determining the cache keys for a Memoize path that
if the same expr appeared twice in the parameterized path's ppi_clauses
and/or in the Nested Loop's inner relation's lateral_vars. If this
happened the Memoize node's cache keys would contain duplicates. This
isn't a problem for correctness, all it means is that the cache lookups
will be suboptimal due to having redundant work to do on every hash table
lookup and insert.
Here we adjust paraminfo_get_equal_hashops() to look for duplicates and
ignore them when we find them.
Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/422277.1706207562%40sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was possible in cases where we had a LATERAL joined subquery that
when the same Var is mentioned in both the lateral references and in the
outer Vars of the scan clauses that the given Var wouldn't be assigned
to the same NestLoopParam.
This could cause issues in Memoize as the cache key would reference the
Var for the scan clauses but when the parameter for the lateral references
changed some code in Memoize would see that some other parameter had
changed that's not part of the cache key and end up purging the entire
cache as a result, thinking the cache had become stale. This could
result in a Nested Loop -> Memoize plan being quite inefficient as, in
the worst case, the cache purging could result in never getting a cache
hit. In no cases could this problem lead to incorrect query results.
Here we switch the order of operations so that we create NestLoopParam
for the lateral references first before doing replace_nestloop_params().
replace_nestloop_params() will find and reuse the existing NestLoopParam
in cases where the Var exists in both locations.
Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAMbWs48XHJEK1Q1CzAQ7L9sTANTs9W1cepXu8%3DKc0quUL%2Btg4Q%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL. The evaluation of
these needless quals adds overhead. Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required. However, a recent bug
report indicates this might not always be possible.
Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice. That particular case
demonstrated that other quals, especially ones where no statistics are
available to allow the planner a chance at estimating an approximate
selectivity for can result in poor index choice due to cheap startup paths
being prefered with LIMIT 1.
Here we take generic approach to fixing this by having the planner check
for NOT NULL columns and just have the planner remove these quals (when
they're not needed) for all queries, not just when optimizing Min/Max
aggregates.
Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all. This also works for join relations when the Var is not
nullable by any outer join.
This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.
Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was previously fixed in 9e215378d but got broken again as a result
of 2489d76c4. It seems that commit causes ppi_clauses to contain
duplicate clauses and it's no longer safe to check the list_length of
that list to determine if there are join conditions other than what's
mentioned in ppi_clauses.
Here we adjust the check to count the distinct rinfo_serial mentioned in
ppi_clauses. We expect that extra->restrictlist won't have duplicate
rinfo_serials.
Reported-by: Amadeo Gallardo
Author: Richard Guo
Discussion: https://postgr.es/m/CADFREbW-BLJd7-a5J%2B5wjVumeFG1ByXiSOFzMtkmY_SDWckTxw%40mail.gmail.com
Backpatch-through: 16, where 2489d76c4 was introduced.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When evaluating a query with a multi-column GROUP BY clause, we can minimize
sort operations or avoid them if we synchronize the order of GROUP BY clauses
with the ORDER BY sort clause or sort order, which comes from the underlying
query tree. Grouping does not imply any ordering, so we can compare
the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg,
we simply compared keys in the order specified in the query. This commit
explores alternative ordering of the keys, trying to find a cheaper one.
The ordering of group keys may interact with other parts of the query, some of
which may not be known while planning the grouping. For example, there may be
an explicit ORDER BY clause or some other ordering-dependent operation higher up
in the query, and using the same ordering may allow using either incremental
sort or even eliminating the sort entirely.
The patch always keeps the ordering specified in the query, assuming the user
might have additional insights.
This introduces a new GUC enable_group_by_reordering so that the optimization
may be disabled if needed.
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Author: Andrei Lepikhov, Teodor Sigaev
Reviewed-by: Tomas Vondra, Claudio Freire, Gavin Flower, Dmitry Dolgov
Reviewed-by: Robert Haas, Pavel Borisov, David Rowley, Zhihong Yu
Reviewed-by: Tom Lane, Alexander Korotkov, Richard Guo, Alena Rybakina
|
|
|
|
|
|
|
| |
Extract the repetitive code pattern into a new function make_ordered_path().
Discussion: https://postgr.es/m/CAPpHfdtzaVa7S4onKy3YvttF2rrH5hQNHx9HtcSTLbpjx%2BMJ%2Bw%40mail.gmail.com
Author: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When removing a useless join, we'd remove PHVs that are not used at join
partner rels or above the join. A PHV that references the join's relid
in ph_eval_at is logically "above" the join and thus should not be
removed. We have the following check for that:
!bms_is_member(ojrelid, phinfo->ph_eval_at)
However, in the case of SJE removing a useless inner join, 'ojrelid' is
set to -1, which would trigger the "negative bitmapset member not
allowed" error in bms_is_member().
Fix it by skipping examining ojrelid for inner joins in this check.
Reported-by: Zuming Jiang
Bug: #18260
Discussion: https://postgr.es/m/18260-1b6a0c4ae311b837%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code for wrapping subquery output expressions in PlaceHolderVars
believed that if the expression already was a PlaceHolderVar, it was
never necessary to wrap that in another one. That's wrong if the
expression is underneath an outer join and involves a lateral
reference to outside that scope: failing to add an additional PHV
risks evaluating the expression at the wrong place and hence not
forcing it to null when the outer join should do so. This is an
oversight in commit 9e7e29c75, which added logic to forcibly wrap
lateral-reference Vars in PlaceHolderVars, but didn't see that the
adjacent case for PlaceHolderVars needed the same treatment.
The test case we have for this doesn't fail before 4be058fe9, but now
that I see the problem I wonder if it is possible to demonstrate
related errors before that. That's moot though, since all such
branches are out of support.
Per bug #18284 from Holger Reise. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18284-47505a20c23647f8@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were not testing the same thing as the comparable Assert
in calc_nestloop_required_outer(), because we neglected to map
the given Paths' relids to top-level relids. When considering
a partition child join the latter is the correct thing to do.
This oversight is old, but since it's only an overly-weak Assert
check there doesn't seem to be much value in back-patching.
Richard Guo (with cosmetic changes and comment updates by me)
Discussion: https://postgr.es/m/CAMbWs49sqbe9GBZ8sy8dSfKRNURgicR85HX8vgzcgQsPF0XY1w@mail.gmail.com
|
|
|
|
|
| |
Given that now SJE doesn't work with result relation, turn a code dealing with
that into an assert that it shouldn't happen.
|
|
|
|
|
|
|
|
|
|
| |
The target relation for INSERT/UPDATE/DELETE/MERGE has a different behavior
than other relations in EvalPlanQual() and RETURNING clause. This is why we
forbid target relation to be either source or target relation in SJE.
It's not clear if we could ever support this.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/b9e8f460-f9a6-0e9b-e8ba-60d59f0bc22c%40gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
When SJE uses RelOptInfo.unique_for_rels cache, it passes filtered quals to
innerrel_is_unique_ext(). That might lead to an invalid match to cache entries
made by previous non self-join checking calls. Add UniqueRelInfo.self_join
flag to prevent such cases. Also, fix that SJE should require a strict match
of outerrelids to make sure UniqueRelInfo.extra_clauses are valid.
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/4788f781-31bd-9796-d7d6-588a751c8787%40gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 599b33b94, this function assumed that every RTE_RELATION
RangeTblEntry would have an associated RelOptInfo. But that's not so:
we only build RelOptInfos for relations that are scanned by the query.
In particular the target of an INSERT won't have one, so that Vars
appearing in an INSERT ... RETURNING list will not have an associated
RelOptInfo. This apparently wasn't a problem before commit f7816aec2
taught examine_simple_variable() to drill down into CTEs containing
INSERT RETURNING, but it is now.
To fix, add a fallback code path that gets the userid to use directly
from the RTEPermissionInfo associated with the RTE. (Sadly, we must
have two code paths, because not every RTE has a RTEPermissionInfo
either.)
Per report from Alexander Lakhin. No back-patch, since the case is
apparently unreachable before f7816aec2.
Discussion: https://postgr.es/m/608a4886-6c60-0f9e-97d5-591256bd4150@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the SJE code handles the transfer of qual clauses from the removed
relation to the remaining one, it replaces the Vars of the removed
relation with the Vars of the remaining relation for each clause, and
then reintegrates these clauses into the appropriate restriction or join
clause lists, while attempting to avoid duplicates.
However, the code compares RestrictInfo->clause to determine if two
clauses are duplicates. This is just flat wrong. Two RestrictInfos
with the same clause can have different required_relids,
incompatible_relids, is_pushed_down, and so on. This can cause qual
clauses to be mistakenly omitted, leading to wrong results.
This patch fixes it by comparing the entire RestrictInfos not just their
clauses ignoring 'rinfo_serial' field (otherwise almost all RestrictInfos will
be unique). Making 'rinfo_serial' equal_ignore would break other code. This
is why this commit implements our own comparison function for checking the
equality of RestrictInfos.
Reported-by: Zuming Jiang
Bug: #18261
Discussion: https://postgr.es/m/18261-2a75d748c928609b%40postgresql.org
Author: Richard Guo
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have DECHIST statistics about the argument expression, use
the average number of distinct elements as the array length estimate.
(It'd be better to use the average total number of elements, but
that is not currently calculated by compute_array_stats(), and
it's unclear that it'd be worth extra effort to get.)
To do this, we have to change the signature of estimate_array_length
to pass the "root" pointer. While at it, also change its result
type to "double". That's probably not really necessary, but it
avoids any risk of overflow of the value extracted from DECHIST.
All existing callers are going to use the result in a "double"
calculation anyway.
Paul Jungwirth, reviewed by Jian He and myself
Discussion: https://postgr.es/m/CA+renyUnM2d+SmrxKpDuAdpiq6FOM=FByvi6aS6yi__qyf6j9A@mail.gmail.com
|
|
|
|
|
|
|
|
| |
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
|
|
|
|
|
|
| |
Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions
by me.
Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
|
|
|
|
|
|
| |
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/a89f480f-8143-0965-f22d-0a892777f501%40gmail.com
Author: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
| |
There are a lot of situations when we share the same pointer to a Bitmapset
structure across different places. In order to evade undesirable side effects
replace_relid() function should always return a copy.
Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_wJthNtYBL%2BSsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw%40mail.gmail.com
Reviewed-by: Richard Guo, Andres Freund, Ashutosh Bapat, Andrei Lepikhov
|
|
|
|
|
|
| |
Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
| |
Self-join removal appears to be safe to apply with placeholder variables
as long as we handle PlaceHolderVar in replace_varno_walker() and replace
relid in phinfo->ph_lateral.
Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
| |
This commit also retires sje_walker. This increases the generalty of replacing
varno in the parse tree and simplifies the code.
Discussion: https://postgr.es/m/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's at least theoretically possible to overflow int32 when adding up
column width estimates to make a row width estimate. (The bug example
isn't terribly convincing as a real use-case, but perhaps wide joins
would provide a more plausible route to trouble.) This'd lead to
assertion failures or silly planner behavior. To forestall it, make
the relevant functions compute their running sums in int64 arithmetic
and then clamp to int32 range at the end. We can reasonably assume
that MaxAllocSize is a hard limit on actual tuple width, so clamping
to that is simply a correction for dubious input values, and there's
no need to go as far as widening width variables to int64 everywhere.
Per bug #18247 from RekGRpth. There've been no reports of this issue
arising in practical cases, so I feel no need to back-patch.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/18247-11ac477f02954422@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value was double in the original implementation of this logic.
Commit da08a6598 pulled it out into a subroutine, but carelessly
declared the parameter as int when it should have been double.
On most platforms, the only ill effect would be to clamp the value
to be not more than INT_MAX, which would seldom be exceeded and
probably wouldn't change the estimates too much anyway. Nonetheless,
it's wrong and can cause complaints from ubsan.
While here, improve the comments and parameter names.
This is an ABI change in a globally exposed subroutine, so
back-patching would create some risk of breaking extensions.
The value of the fix doesn't seem high enough to warrant taking
that risk, so fix in HEAD only.
Per report from Alexander Lakhin.
Discussion: https://postgr.es/m/f5e15fe1-202d-1936-f47c-f0c69a936b72@gmail.com
|
|
|
|
|
| |
Author: Richard Guo, Tom Lane
Discussion: https://www.postgresql.org/message-id/CAMbWs49rAfFS-yd7=QxtDUrZDFfRBGy4rGBJNyGDH7=CLipFPg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
00b41463c adjusted Bitmapset so that an empty set is always represented
as NULL. This makes checking for empty sets far cheaper than it used
to be.
There were various places in the code where we'd call bms_membership()
to handle the 3 possible BMS_Membership values. For the BMS_SINGLETON
case, we'd also call bms_singleton_member() to find the single set member.
This can now be done in a more optimal way by first checking if the set is
NULL and then not bothering with bms_membership() and simply call
bms_get_singleton_member() instead to find the single member. This
function will return false if there are multiple members in the set.
Here we also tidy up some logic in examine_variable() for the single
member case. There's now no need to call bms_is_member() as we've
already established that we're working with a singleton Bitmapset, so we
can just check if varRelid matches the singleton member.
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions). Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.
Two actual bugs of this ilk are fixed here. We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest. Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism. A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.
In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after. The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.
Per bug #18097 from Jim Keener. Back-patch to all supported versions.
Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
|
|
|
|
|
|
|
|
|
|
| |
It seems that a PHV evaluated/needed at or below the self join should not have
a problem if we remove the self join. But this requires further investigation.
For now, we just do not remove self joins if the rel to be removed is laterally
referenced by PHVs.
Discussion: https://postgr.es/m/CAMbWs4-ns73VF9gi37q61G3dS6Xuos+HtryMaBh37WQn=BsaJw@mail.gmail.com
Author: Richard Guo
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can simplify FieldSelect on a whole-row Var into a plain Var
for the selected field. However, we should copy the whole-row Var's
varnullingrels when we do so, because the new Var is clearly nullable
by exactly the same rels as the original. Failure to do this led to
errors like "wrong varnullingrels (b) (expected (b 3)) for Var 2/2".
Richard Guo, per bug #18184 from Marian Krucina. Back-patch to
v16 where varnullingrels was introduced.
Discussion: https://postgr.es/m/18184-5868dd258782058e@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
| |
Add missing replacement of relids in phv->phexpr. Also, remove extra
replace_relid() over phv->phrels.
Reported-by: Zuming Jiang
Bug: #18187
Discussion: https://postgr.es/m/flat/18187-831da249cbd2ff8e%40postgresql.org
Author: Richard Guo
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
| |
Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_STsG1PKQBuvQC8W4sPo3KvML3=jOTjKLUYQuK3g8cpQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
recurse_set_operations() uses the parse tree for the group number estimation,
because of the "varno 0" hack. At the same time 2489d76c49 made root->parse
and corresponding parent_root->simple_rte_array[]->subquery distinct copies
of the parse tree, while d3d55ce571 introduced self-join removal replacing
relid of removed relation only in one of the copies.
The present commit fixes this bug by making recurse_set_operations() call
estimate_num_groups() with the copy of the parse tree processed by self-join
removal.
In future, we may think about maintaining just one copy of the parse tree
and/or keeping removed relids as aliases.
Reported-by: Zuming Jiang
Bug: #18170
Discussion: https://postgr.es/m/flat/18170-f1d17bf9a0d58b24%40postgresql.org
Author: Richard Guo, Alexander Korotkov
Reviewed-by: Andrei Lepikhov
|
|
|
|
|
|
|
|
|
|
| |
d3d55ce571 changed RelOptInfo.unique_for_rels from the list of Relid sets to
the list of UniqueRelInfo's. But it didn't make UniqueRelInfo a node.
This commit makes UniqueRelInfo a node. Also this commit revises some
comments related to RelOptInfo.unique_for_rels.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/flat/1189851.1698340331%40sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
| |
After 01575ad788e3, expand_single_inheritance_child()'s parentOID
variable is read only in an Assert, provoking a compiler warning in
non-assert builds. Fix that by marking the variable with
PG_USED_FOR_ASSERTS_ONLY.
Per report and suggestion from David Rowley
Discussion: https://postgr.es/m/CAApHDvpjA_8Wxu4DCTRVAvPxC9atwMe6N%2ByvrcGsgb7mrfdpJA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since C99, there can be a trailing comma after the last value in an
enum definition. A lot of new code has been introducing this style on
the fly. Some new patches are now taking an inconsistent approach to
this. Some add the last comma on the fly if they add a new last
value, some are trying to preserve the existing style in each place,
some are even dropping the last comma if there was one. We could
nudge this all in a consistent direction if we just add the trailing
commas everywhere once.
I omitted a few places where there was a fixed "last" value that will
always stay last. I also skipped the header files of libpq and ecpg,
in case people want to use those with older compilers. There were
also a small number of cases where the enum type wasn't used anywhere
(but the enum values were), which ended up confusing pgindent a bit,
so I left those alone.
Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, expand_single_inheritance_child() doesn't reset
perminfoindex in a plain-inheritance parent's child RTE, because
prior to 387f9ed0a0, the executor would use the first child RTE to
locate the parent's RTEPermissionInfo. That in turn causes
add_rte_to_flat_rtable() to create an extra RTEPermissionInfo
belonging to the parent's child RTE with the same content as the one
belonging to the parent's original ("root") RTE.
In 387f9ed0a0, we changed things so that the executor can now use the
parent's "root" RTE for locating its RTEPermissionInfo instead of the
child RTE, so the latter's perminfoindex need not be set anymore, so
make it so.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/839708.1698174464@sss.pgh.pa.us
Backpatch-through: 16
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Self Join Elimination (SJE) feature removes an inner join of a plain table
to itself in the query tree if is proved that the join can be replaced with
a scan without impacting the query result. Self join and inner relation are
replaced with the outer in query, equivalence classes, and planner info
structures. Also, inner restrictlist moves to the outer one with removing
duplicated clauses. Thus, this optimization reduces the length of the range
table list (this especially makes sense for partitioned relations), reduces
the number of restriction clauses === selectivity estimations, and potentially
can improve total planner prediction for the query.
The SJE proof is based on innerrel_is_unique machinery.
We can remove a self-join when for each outer row:
1. At most one inner row matches the join clause.
2. Each matched inner row must be (physically) the same row as the outer one.
In this patch we use the next approach to identify a self-join:
1. Collect all merge-joinable join quals which look like a.x = b.x
2. Add to the list above the baseretrictinfo of the inner table.
3. Check innerrel_is_unique() for the qual list. If it returns false, skip
this pair of joining tables.
4. Check uniqueness, proved by the baserestrictinfo clauses. To prove
the possibility of self-join elimination inner and outer clauses must have
an exact match.
The relation replacement procedure is not trivial and it is partly combined
with the one, used to remove useless left joins. Tests, covering this feature,
were added to join.sql. Some regression tests changed due to self-join removal
logic.
Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru
Author: Andrey Lepikhov, Alexander Kuzmenkov
Reviewed-by: Tom Lane, Robert Haas, Andres Freund, Simon Riggs, Jonathan S. Katz
Reviewed-by: David Rowley, Thomas Munro, Konstantin Knizhnik, Heikki Linnakangas
Reviewed-by: Hywel Carver, Laurenz Albe, Ronan Dunklau, vignesh C, Zhihong Yu
Reviewed-by: Greg Stark, Jaime Casanova, Michał Kłeczek, Alena Rybakina
Reviewed-by: Alexander Korotkov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an UPDATE/DELETE/MERGE's target table is an old-style
inheritance tree, it's possible for the parent to get excluded
from the plan while some children are not. (I believe this is
only possible if we can prove that a CHECK ... NO INHERIT
constraint on the parent contradicts the query WHERE clause,
so it's a very unusual case.) In such a case, ExecInitModifyTable
mistakenly concluded that the first surviving child is the target
table, leading to at least two bugs:
1. The wrong table's statement-level triggers would get fired.
2. In v16 and up, it was possible to fail with "invalid perminfoindex
0 in RTE with relid nnnn" due to the child RTE not having permissions
data included in the query plan. This was hard to reproduce reliably
because it did not occur unless the update triggered some non-HOT
index updates.
In v14 and up, this is easy to fix by defining ModifyTable.rootRelation
to be the parent RTE in plain inheritance as well as partitioned cases.
While the wrong-triggers bug also appears in older branches, the
relevant code in both the planner and executor is quite a bit
different, so it would take a good deal of effort to develop and
test a suitable patch. Given the lack of field complaints about the
trigger issue, I'll desist for now. (Patching v11 for this seems
unwise anyway, given that it will have no more releases after next
month.)
Per bug #18147 from Hans Buschmann.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/18147-6fc796538913ee88@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 3fc6e2d7f, I (tgl) argued that we only need to check for
a constant-FALSE restriction clause when there's exactly one
restriction clause, on the grounds that const-folding would have
thrown away anything ANDed with a Const FALSE. That's true just after
const-folding has been applied, but subsequent processing such as
equivalence class expansion could result in cases where a Const FALSE
is ANDed with some other stuff. (Compare for instance joinrels.c's
restriction_is_constant_false.) Hence, tweak this logic to check all
the elements of the baserestrictinfo list, not just one; that's cheap
enough to not be worth worrying about.
There is one existing test case where this visibly improves the plan.
There would not be any savings in runtime, but the planner effort and
executor startup effort will be reduced, and anyway it's odd that
we can detect related cases but not this one.
Richard Guo (independently discovered by David Rowley)
Discussion: https://postgr.es/m/CAMbWs4_x3-CnVVrCboS1LkEhB5V+W7sLSCabsRiG+n7+5_kqbg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we added the PlannerInfo.all_baserels set, it's not really
necessary to grovel over the rangetable to count baserels in the
current query. So let's drop has_multiple_baserels() in favor
of a bms_membership() test. This might be microscopically
faster, but the main point is to remove some unnecessary code.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_8RcSbbfs1ASZLrMuL0c0EQgXWcoLTQD8swBRY_pQQiA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While working on a8a968a82, I failed to consider that
cheapest_startup_path can be NULL when there is no non-parameterized
path in the pathlist. This is well documented in set_cheapest(), I just
failed to notice.
Here we adjust the code to just check if the RelOptInfo has a
cheapest_startup_path set before adding it to the startup_subpaths list.
Reported-by: Richard Guo
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs49w3t03V69XhdCuw+GDwivny4uQUxrkVp6Gejaspt0wMQ@mail.gmail.com
|