diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2023-01-30 13:16:20 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2023-01-30 13:16:20 -0500 |
commit | 2489d76c4906f4461a364ca8ad7e0751ead8aa0d (patch) | |
tree | 145ebc28d5ea8f5a5ba340b9e353a11de786adae /src/backend/optimizer/path/joinpath.c | |
parent | ec7e053a98f39a9e3c7e6d35f0d2e83933882399 (diff) | |
download | postgresql-2489d76c4906f4461a364ca8ad7e0751ead8aa0d.tar.gz postgresql-2489d76c4906f4461a364ca8ad7e0751ead8aa0d.zip |
Make Vars be outer-join-aware.
Traditionally we used the same Var struct to represent the value
of a table column everywhere in parse and plan trees. This choice
predates our support for SQL outer joins, and it's really a pretty
bad idea with outer joins, because the Var's value can depend on
where it is in the tree: it might go to NULL above an outer join.
So expression nodes that are equal() per equalfuncs.c might not
represent the same value, which is a huge correctness hazard for
the planner.
To improve this, decorate Var nodes with a bitmapset showing
which outer joins (identified by RTE indexes) may have nulled
them at the point in the parse tree where the Var appears.
This allows us to trust that equal() Vars represent the same value.
A certain amount of klugery is still needed to cope with cases
where we re-order two outer joins, but it's possible to make it
work without sacrificing that core principle. PlaceHolderVars
receive similar decoration for the same reason.
In the planner, we include these outer join bitmapsets into the relids
that an expression is considered to depend on, and in consequence also
add outer-join relids to the relids of join RelOptInfos. This allows
us to correctly perceive whether an expression can be calculated above
or below a particular outer join.
This change affects FDWs that want to plan foreign joins. They *must*
follow suit when labeling foreign joins in order to match with the
core planner, but for many purposes (if postgres_fdw is any guide)
they'd prefer to consider only base relations within the join.
To support both requirements, redefine ForeignScan.fs_relids as
base+OJ relids, and add a new field fs_base_relids that's set up by
the core planner.
Large though it is, this commit just does the minimum necessary to
install the new mechanisms and get check-world passing again.
Follow-up patches will perform some cleanup. (The README additions
and comments mention some stuff that will appear in the follow-up.)
Patch by me; thanks to Richard Guo for review.
Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
Diffstat (limited to 'src/backend/optimizer/path/joinpath.c')
-rw-r--r-- | src/backend/optimizer/path/joinpath.c | 65 |
1 files changed, 61 insertions, 4 deletions
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index d345c0437a4..dfbb839be16 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -234,7 +234,9 @@ add_paths_to_joinrel(PlannerInfo *root, * reduces the number of parameterized paths we have to deal with at * higher join levels, without compromising the quality of the resulting * plan. We express the restriction as a Relids set that must overlap the - * parameterization of any proposed join path. + * parameterization of any proposed join path. Note: param_source_rels + * should contain only baserels, not OJ relids, so starting from + * all_baserels not all_query_rels is correct. */ foreach(lc, root->join_info_list) { @@ -366,6 +368,60 @@ allow_star_schema_join(PlannerInfo *root, } /* + * If the parameterization is only partly satisfied by the outer rel, + * the unsatisfied part can't include any outer-join relids that could + * null rels of the satisfied part. That would imply that we're trying + * to use a clause involving a Var with nonempty varnullingrels at + * a join level where that value isn't yet computable. + * + * In practice, this test never finds a problem because earlier join order + * restrictions prevent us from attempting a join that would cause a problem. + * (That's unsurprising, because the code worked before we ever added + * outer-join relids to expression relids.) It still seems worth checking + * as a backstop, but we don't go to a lot of trouble: just reject if the + * unsatisfied part includes any outer-join relids at all. + */ +static inline bool +have_unsafe_outer_join_ref(PlannerInfo *root, + Relids outerrelids, + Relids inner_paramrels) +{ + bool result = false; + Relids unsatisfied = bms_difference(inner_paramrels, outerrelids); + + if (unlikely(bms_overlap(unsatisfied, root->outer_join_rels))) + { +#ifdef NOT_USED + /* If we ever weaken the join order restrictions, we might need this */ + ListCell *lc; + + foreach(lc, root->join_info_list) + { + SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) lfirst(lc); + + if (!bms_is_member(sjinfo->ojrelid, unsatisfied)) + continue; /* not relevant */ + if (bms_overlap(inner_paramrels, sjinfo->min_righthand) || + (sjinfo->jointype == JOIN_FULL && + bms_overlap(inner_paramrels, sjinfo->min_lefthand))) + { + result = true; /* doesn't work */ + break; + } + } +#else + /* For now, if we do see an overlap, just assume it's trouble */ + result = true; +#endif + } + + /* Waste no memory when we reject a path here */ + bms_free(unsatisfied); + + return result; +} + +/* * paraminfo_get_equal_hashops * Determine if param_info and innerrel's lateral_vars can be hashed. * Returns true the hashing is possible, otherwise return false. @@ -657,15 +713,16 @@ try_nestloop_path(PlannerInfo *root, /* * Check to see if proposed path is still parameterized, and reject if the * parameterization wouldn't be sensible --- unless allow_star_schema_join - * says to allow it anyway. Also, we must reject if have_dangerous_phv - * doesn't like the look of it, which could only happen if the nestloop is - * still parameterized. + * says to allow it anyway. Also, we must reject if either + * have_unsafe_outer_join_ref or have_dangerous_phv don't like the look of + * it, which could only happen if the nestloop is still parameterized. */ required_outer = calc_nestloop_required_outer(outerrelids, outer_paramrels, innerrelids, inner_paramrels); if (required_outer && ((!bms_overlap(required_outer, extra->param_source_rels) && !allow_star_schema_join(root, outerrelids, inner_paramrels)) || + have_unsafe_outer_join_ref(root, outerrelids, inner_paramrels) || have_dangerous_phv(root, outerrelids, inner_paramrels))) { /* Waste no memory when we reject a path here */ |