aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/plan/analyzejoins.c
Commit message (Collapse)AuthorAge
* Allow rel_is_distinct_for() to look through RelabelType below OpExpr.Tom Lane2017-09-17
| | | | | | | | | This lets it do the right thing for, eg, varchar columns. Back-patch to 9.5 where this logic appeared. David Rowley, per report from Kim Rose Carlsen Discussion: https://postgr.es/m/VI1PR05MB17091F9A9876528055D6A827C76D0@VI1PR05MB1709.eurprd05.prod.outlook.com
* Restore foreign-key-aware estimation of join relation sizes.Tom Lane2016-06-18
| | | | | | | | | | | | | | | | | | | | This patch provides a new implementation of the logic added by commit 137805f89 and later removed by 77ba61080. It differs from the original primarily in expending much less effort per joinrel in large queries, which it accomplishes by doing most of the matching work once per query not once per joinrel. Hopefully, it's also less buggy and better commented. The never-documented enable_fkey_estimates GUC remains gone. There remains work to be done to make the selectivity estimates account for nulls in FK referencing columns; but that was true of the original patch as well. We may be able to address this point later in beta. In the meantime, any error should be in the direction of overestimating rather than underestimating joinrel sizes, which seems like the direction we want to err in. Tomas Vondra and Tom Lane Discussion: <31041.1465069446@sss.pgh.pa.us>
* Refactor join_is_removable() to separate out distinctness-proving logic.Tom Lane2016-04-07
| | | | | | | | | | | | | Extracted from pending unique-join patch, since this is a rather large delta but it's simply moving code out into separately-accessible subroutines. I (tgl) did choose to add a bit more logic to rel_supports_distinctness, so that it verifies that there's at least one potentially usable unique index rather than just checking indexlist != NIL. Otherwise there's no functional change here. David Rowley
* Refactor pull_var_clause's API to make it less tedious to extend.Tom Lane2016-03-10
| | | | | | | | | | | | | | | | | | | | In commit 1d97c19a0f748e94 and later c1d9579dd8bf3c92, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Get rid of the planner's LateralJoinInfo data structure.Tom Lane2015-12-11
| | | | | | | | | | | | | | | | | | | | I originally modeled this data structure on SpecialJoinInfo, but after commit acfcd45cacb6df23 that looks like a pretty poor decision. All we really need is relid sets identifying laterally-referenced rels; and most of the time, what we want to know about includes indirect lateral references, a case the LateralJoinInfo data was unsuited to compute with any efficiency. The previous commit redefined RelOptInfo.lateral_relids as the transitive closure of lateral references, so that it easily supports checking indirect references. For the places where we really do want just direct references, add a new RelOptInfo field direct_lateral_relids, which is easily set up as a copy of lateral_relids before we perform the transitive closure calculation. Then we can just drop lateral_info_list and LateralJoinInfo and the supporting code. This makes the planner's handling of lateral references noticeably more efficient, and shorter too. Such a change can't be back-patched into stable branches for fear of breaking extensions that might be looking at the planner's data structures; but it seems not too late to push it into 9.5, so I've done so.
* Further adjustments to PlaceHolderVar removal.Tom Lane2015-08-07
| | | | | | | | | | | | | | | | | | | | | | | | A new test case from Andreas Seltenreich showed that we were still a bit confused about removing PlaceHolderVars during join removal. Specifically, remove_rel_from_query would remove a PHV that was used only underneath the removable join, even if the place where it's used was the join partner relation and not the join clause being deleted. This would lead to a "too late to create a new PlaceHolderInfo" error later on. We can defend against that by checking ph_eval_at to see if the PHV could possibly be getting used at some partner rel. Also improve some nearby LATERAL-related logic. I decided that the check on ph_lateral needed to take precedence over the check on ph_needed, in case there's a lateral reference underneath the join being considered. (That may be impossible, but I'm not convinced of it, and it's easy enough to defend against the case.) Also, I realized that remove_rel_from_query's logic for updating LateralJoinInfos is dead code, because we don't build those at all until after join removal. Back-patch to 9.3. Previous versions didn't have the LATERAL issues, of course, and they also didn't attempt to remove PlaceHolderInfos during join removal. (I'm starting to wonder if changing that was really such a great idea.)
* Fix old oversight in join removal logic.Tom Lane2015-08-06
| | | | | | | | | | | | | Commit 9e7e29c75ad441450f9b8287bd51c13521641e3b introduced an Assert that join removal didn't reduce the eval_at set of any PlaceHolderVar to empty. At first glance it looks like join_is_removable ensures that's true --- but actually, the loop in join_is_removable skips PlaceHolderVars that are not referenced above the join due to be removed. So, if we don't want any empty eval_at sets, the right thing to do is to delete any now-unreferenced PlaceHolderVars from the data structure entirely. Per fuzz testing by Andreas Seltenreich. Back-patch to 9.3 where the aforesaid Assert was added.
* pgindent run for 9.5Bruce Momjian2015-05-23
|
* Support GROUPING SETS, CUBE and ROLLUP.Andres Freund2015-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* Add bms_get_singleton_member(), and use it where appropriate.Tom Lane2014-11-28
| | | | | | | | | | This patch adds a function that replaces a bms_membership() test followed by a bms_singleton_member() call, performing both the test and the extraction of a singleton set's member in one scan of the bitmapset. The performance advantage over the old way is probably minimal in current usage, but it seems worthwhile on notational grounds anyway. David Rowley
* Allow join removal in some cases involving a left join to a subquery.Tom Lane2014-07-15
| | | | | | | | | | | | | | | | | | | We can remove a left join to a relation if the relation's output is provably distinct for the columns involved in the join clause (considering only equijoin clauses) and the relation supplies no variables needed above the join. Previously, the join removal logic could only prove distinctness by reference to unique indexes of a table. This patch extends the logic to consider subquery relations, wherein distinctness might be proven by reference to GROUP BY, DISTINCT, etc. We actually already had some code to check that a subquery's output was provably distinct, but it was hidden inside pathnode.c; which was a pretty bad place for it really, since that file is mostly boilerplate Path construction and comparison. Move that code to analyzejoins.c, which is arguably a more appropriate location, and is certainly the site of the new usage for it. David Rowley, reviewed by Simon Riggs
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Update copyright for 2014Bruce Momjian2014-01-07
| | | | | Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
* Fix planner problems with LATERAL references in PlaceHolderVars.Tom Lane2013-08-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The planner largely failed to consider the possibility that a PlaceHolderVar's expression might contain a lateral reference to a Var coming from somewhere outside the PHV's syntactic scope. We had a previous report of a problem in this area, which I tried to fix in a quick-hack way in commit 4da6439bd8553059766011e2a42c6e39df08717f, but Antonin Houska pointed out that there were still some problems, and investigation turned up other issues. This patch largely reverts that commit in favor of a more thoroughly thought-through solution. The new theory is that a PHV's ph_eval_at level cannot be higher than its original syntactic level. If it contains lateral references, those don't change the ph_eval_at level, but rather they create a lateral-reference requirement for the ph_eval_at join relation. The code in joinpath.c needs to handle that. Another issue is that createplan.c wasn't handling nested PlaceHolderVars properly. In passing, push knowledge of lateral-reference checks for join clauses into join_clause_is_movable_to. This is mainly so that FDWs don't need to deal with it. This patch doesn't fix the original join-qual-placement problem reported by Jeremy Evans (and indeed, one of the new regression test cases shows the wrong answer because of that). But the PlaceHolderVar problems need to be fixed before that issue can be addressed, so committing this separately seems reasonable.
* Remove ph_may_need from PlaceHolderInfo, with attendant simplifications.Tom Lane2013-08-14
| | | | | | | | | | | | | | | | | | | | | The planner logic that attempted to make a preliminary estimate of the ph_needed levels for PlaceHolderVars seems to be completely broken by lateral references. Fortunately, the potential join order optimization that this code supported seems to be of relatively little value in practice; so let's just get rid of it rather than trying to fix it. Getting rid of this allows fairly substantial simplifications in placeholder.c, too, so planning in such cases should be a bit faster. Issue noted while pursuing bugs reported by Jeremy Evans and Antonin Houska, though this doesn't in itself fix either of their reported cases. What this does do is prevent an Assert crash in the kind of query illustrated by the added regression test. (I'm not sure that the plan for that query is stable enough across platforms to be usable as a regression test output ... but we'll soon find out from the buildfarm.) Back-patch to 9.3. The problem case can't arise without LATERAL, so no need to touch older branches.
* pgindent run for release 9.3Bruce Momjian2013-05-29
| | | | | This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Fix up planner infrastructure to support LATERAL properly.Tom Lane2012-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch takes care of a number of problems having to do with failure to choose valid join orders and incorrect handling of lateral references pulled up from subqueries. Notable changes: * Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to represent join ordering constraints created by lateral references. (I first considered extending the SpecialJoinInfo structure, but the semantics are different enough that a separate data structure seems better.) Extend join_is_legal() and related functions to prevent trying to form unworkable joins, and to ensure that we will consider joins that satisfy lateral references even if the joins would be clauseless. * Fill in the infrastructure needed for the last few types of relation scan paths to support parameterization. We'd have wanted this eventually anyway, but it is necessary now because a relation that gets pulled up out of a UNION ALL subquery may acquire a reltargetlist containing lateral references, meaning that its paths *have* to be parameterized whether or not we have any code that can push join quals down into the scan. * Compute data about lateral references early in query_planner(), and save in RelOptInfo nodes, to avoid repetitive calculations later. * Assorted corner-case bug fixes. There's probably still some bugs left, but this is a lot closer to being real than it was before.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Improve planner's ability to recognize cases where an IN's RHS is unique.Tom Lane2011-10-26
| | | | | | | | | | | | If the right-hand side of a semijoin is unique, then we can treat it like a normal join (or another way to say that is: we don't need to explicitly unique-ify the data before doing it as a normal join). We were recognizing such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP BY decoration, but there's another way: if the RHS is a plain relation with unique indexes, we can check if any of the indexes prove the output is unique. Most of the infrastructure for that was there already in the join removal code, though I had to rearrange it a bit. Per reflection about a recent example in pgsql-performance.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Fix PlaceHolderVar mechanism's interaction with outer joins.Tom Lane2010-09-28
| | | | | | | | | | | | | | The point of a PlaceHolderVar is to allow a non-strict expression to be evaluated below an outer join, after which its value bubbles up like a Var and can be forced to NULL when the outer join's semantics require that. However, there was a serious design oversight in that, namely that we didn't ensure that there was actually a correct place in the plan tree to evaluate the placeholder :-(. It may be necessary to delay evaluation of an outer join to ensure that a placeholder that should be evaluated below the join can be evaluated there. Per recent bug report from Kirill Simonov. Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.
* Fix another join removal bug: the check on PlaceHolderVars was wrong.Tom Lane2010-09-25
| | | | | | | | The previous coding would decide that join removal was unsafe upon finding a PlaceHolderVar that needed to be evaluated at the inner rel and then used above the join. However, this fails to cover the case of PlaceHolderVars that refer to both the inner rel and some other rels. Per bug report from Andrus.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Fix join-removal logic for pseudoconstant and outerjoin-delayed quals.Tom Lane2010-09-14
| | | | | | | | | | | | | | | | | | In these cases a qual can get marked with the removable rel in its required_relids, but this is just to schedule its evaluation correctly, not because it really depends on the rel. We were assuming that, in effect, we could throw away *all* quals so marked, which is nonsense. Tighten up the logic to be a little more paranoid about which quals belong to the outer join being considered for removal, and arrange for all quals that don't belong to be updated so they will still get evaluated correctly. Also fix another problem that happened to be exposed by this test case, which was that make_join_rel() was failing to notice some cases where a constant-false qual could be used to prove a join relation empty. If it's a pushed-down constant false, then the relation is empty even if it's an outer join, because the qual applies after the outer join expansion. Per report from Nathan Grange. Back-patch into 9.0.
* pgindent run for 9.0, second runBruce Momjian2010-07-06
|
* Fix oversight in join removal patch: we have to delete the removed relationTom Lane2010-05-23
| | | | from SpecialJoinInfo relid sets as well. Per example from Vaclav Novotny.
* Rework join-removal logic as per recent discussion. In particular thisTom Lane2010-03-28
fixes things so that it works for cases where nested removals are possible. The overhead of the optimization should be significantly less, as well.