aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/path/pathkeys.c
Commit message (Collapse)AuthorAge
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Clean up a few failures to set collation fields in expression nodes.Tom Lane2011-03-26
| | | | | | | | | I'm not sure these have any non-cosmetic implications, but I'm not sure they don't, either. In particular, ensure the CaseTestExpr generated by transformAssignmentIndirection to represent the base target column carries the correct collation, because parse_collate.c won't fix that. Tweak lsyscache.c API so that we can get the appropriate collation without an extra syscache lookup.
* Reimplement planner's handling of MIN/MAX aggregate optimization (again).Tom Lane2011-03-22
| | | | | | | | | | | | | | Instead of playing cute games with pathkeys, just build a direct representation of the intended sub-select, and feed it through query_planner to get a Path for the index access. This is a bit slower than 9.1's previous method, since we'll duplicate most of the overhead of query_planner; but since the whole optimization only applies to rather simple single-table queries, that probably won't be much of a problem in practice. The advantage is that we get to do the right thing when there's a partial index that needs the implicit IS NOT NULL clause to be usable. Also, although this makes planagg.c be a bit more closely tied to the ordering of operations in grouping_planner, we can get rid of some coupling to lower-level parts of the planner. Per complaint from Marti Raudsepp.
* Revise collation derivation method and expression-tree representation.Tom Lane2011-03-19
| | | | | | | | | | | | | | | | | | | All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.
* Per-column collation supportPeter Eisentraut2011-02-08
| | | | | | | | This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Simplify and speed up mapping of index opfamilies to pathkeys.Tom Lane2010-11-29
| | | | | | | | | | | | | | | | | | | | | | Formerly we looked up the operators associated with each index (caching them in relcache) and then the planner looked up the btree opfamily containing such operators in order to build the btree-centric pathkey representation that describes the index's sort order. This is quite pointless for btree indexes: we might as well just use the index's opfamily information directly. That saves syscache lookup cycles during planning, and furthermore allows us to eliminate the relcache's caching of operators altogether, which may help in reducing backend startup time. I added code to plancat.c to perform the same type of double lookup on-the-fly if it's ever faced with a non-btree amcanorder index AM. If such a thing actually becomes interesting for production, we should replace that logic with some more-direct method for identifying the corresponding btree opfamily; but it's not worth spending effort on now. There is considerably more to do pursuant to my recent proposal to get rid of sort-operator-based representations of sort orderings, but this patch grabs some of the low-hanging fruit. I'll look at the remainder of that work after the current commitfest.
* Reimplement planner's handling of MIN/MAX aggregate optimization.Tom Lane2010-11-04
| | | | | | | | | | | | | | | | | | | | | | Per my recent proposal, get rid of all the direct inspection of indexes and manual generation of paths in planagg.c. Instead, set up EquivalenceClasses for the aggregate argument expressions, and let the regular path generation logic deal with creating paths that can satisfy those sort orders. This makes planagg.c a bit more visible to the rest of the planner than it was originally, but the approach is basically a lot cleaner than before. A major advantage of doing it this way is that we get MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans) practically for free, whereas in the old way we'd have had to add a whole lot more duplicative logic. One small disadvantage of this approach is that MIN/MAX aggregates can no longer exploit partial indexes having an "x IS NOT NULL" predicate, unless that restriction or something that implies it is specified in the query. The previous implementation was able to use the added "x IS NOT NULL" condition as an extra predicate proof condition, but in this version we rely entirely on indexes that are considered usable by the main planning process. That seems a fair tradeoff for the simplicity and functionality gained.
* Avoid creation of useless EquivalenceClasses during planning.Tom Lane2010-10-29
| | | | | | | | | | | | | | | | | | | | | | Zoltan Boszormenyi exhibited a test case in which planning time was dominated by construction of EquivalenceClasses and PathKeys that had no actual relevance to the query (and in fact got discarded immediately). This happened because we generated PathKeys describing the sort ordering of every index on every table in the query, and only after that checked to see if the sort ordering was relevant. The EC/PK construction code is O(N^2) in the number of ECs, which is all right for the intended number of such objects, but it gets out of hand if there are ECs for lots of irrelevant indexes. To fix, twiddle the handling of mergeclauses a little bit to ensure that every interesting EC is created before we begin path generation. (This doesn't cost anything --- in fact I think it's a bit cheaper than before --- since we always eventually created those ECs anyway.) Then, if an index column can't be found in any pre-existing EC, we know that that sort ordering is irrelevant for the query. Instead of creating a useless EC, we can just not build a pathkey for the index column in the first place. The index will still be considered if it's useful for non-order-related reasons, but we will think of its output as unsorted.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Small refactoring of makeVar() from a TargetEntryPeter Eisentraut2010-08-27
|
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Fix assertion failure when a SELECT DISTINCT ON expression is volatile.Tom Lane2009-09-12
| | | | | | | | | | | | | | In this case we generate two PathKey references to the expression (one for DISTINCT and one for ORDER BY) and they really need to refer to the same EquivalenceClass. However get_eclass_for_sort_expr was being overly paranoid and creating two different EC's. Correct behavior is to use the SortGroupRef index to decide whether two references to volatile expressions that are equal() (ie textually equivalent) should be considered the same. Backpatch to 8.4. Possibly this should be changed in 8.3 as well, but I'll refrain in the absence of evidence of a visible failure in that branch. Per bug #5049.
* Repair bug #4926 "too few pathkeys for mergeclauses". This example showsTom Lane2009-07-17
| | | | | | | | | that the sanity checking I added to create_mergejoin_plan() in 8.3 was a few bricks shy of a load: the mergeclauses could reference pathkeys in a noncanonical order such as x,y,x, not only cases like x,x,y which is all that the code had allowed for. The odd cases only turn up when using redundant clauses in an outer join condition, which is why no one had noticed before.
* Shave a few cycles in compare_pathkeys() by checking for pointer-identicalTom Lane2009-02-28
| | | | | | input lists before we grovel through the lists. This doesn't save much, but testing shows that the case of both inputs NIL is common enough that it saves something. And this is used enough to be a hotspot.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Move exprType(), exprTypmod(), expression_tree_walker(), and related routinesTom Lane2008-08-25
| | | | | | into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside the backend. There's probably more that should be done along this line, but this is a start anyway.
* Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT itemsTom Lane2008-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | as per my recent proposal: 1. Fold SortClause and GroupClause into a single node type SortGroupClause. We were already relying on them to be struct-equivalent, so using two node tags wasn't accomplishing much except to get in the way of comparing items with equal(). 2. Add an "eqop" field to SortGroupClause to carry the associated equality operator. This is cheap for the parser to get at the same time it's looking up the sort operator, and storing it eliminates the need for repeated not-so-cheap lookups during planning. In future this will also let us represent GROUP/DISTINCT operations on datatypes that have hash opclasses but no btree opclasses (ie, they have equality but no natural sort order). The previous representation simply didn't work for that, since its only indicator of comparison semantics was a sort operator. 3. Add a hasDistinctOn boolean to struct Query to explicitly record whether the distinctClause came from DISTINCT or DISTINCT ON. This allows removing some complicated and not 100% bulletproof code that attempted to figure that out from the distinctClause alone. This patch doesn't in itself create any new capability, but it's necessary infrastructure for future attempts to use hash-based grouping for DISTINCT and UNION/INTERSECT/EXCEPT.
* Fix some planner issues found while investigating Kevin Grittner's reportTom Lane2008-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of poorer planning in 8.3 than 8.2: 1. After pushing a constant across an outer join --- ie, given "a LEFT JOIN b ON (a.x = b.y) WHERE a.x = 42", we can deduce that b.y is sort of equal to 42, in the sense that we needn't fetch any b rows where it isn't 42 --- loop to see if any additional deductions can be made. Previous releases did that by recursing, but I had mistakenly thought that this was no longer necessary given the EquivalenceClass machinery. 2. Allow pushing constants across outer join conditions even if the condition is outerjoin_delayed due to a lower outer join. This is safe as long as the condition is strict and we re-test it at the upper join. 3. Keep the outer-join clause even if we successfully push a constant across it. This is *necessary* in the outerjoin_delayed case, but even in the simple case, it seems better to do this to ensure that the join search order heuristics will consider the join as reasonable to make. Mark such a clause as having selectivity 1.0, though, since it's not going to eliminate very many rows after application of the constant condition. 4. Tweak have_relevant_eclass_joinclause to report that two relations are joinable when they have vars that are equated to the same constant. We won't actually generate any joinclause from such an EquivalenceClass, but again it seems that in such a case it's a good idea to consider the join as worth costing out. 5. Fix a bug in select_mergejoin_clauses that was exposed by these changes: we have to reject candidate mergejoin clauses if either side was equated to a constant, because we can't construct a canonical pathkey list for such a clause. This is an implementation restriction that might be worth fixing someday, but it doesn't seem critical to get it done for 8.3.
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* Re-run pgindent with updated list of typedefs. (Updated README shouldBruce Momjian2007-11-15
| | | | avoid this problem in the future.)
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Fix EquivalenceClass code to handle volatile sort expressions in a moreTom Lane2007-11-08
| | | | | | | | | | | | | | predictable manner; in particular that if you say ORDER BY output-column-ref, it will in fact sort by that specific column even if there are multiple syntactic matches. An example is SELECT random() AS a, random() AS b FROM ... ORDER BY b, a; While the use-case for this might be a bit debatable, it worked as expected in earlier releases, so we should preserve the behavior for 8.3. Per my recent proposal. While at it, fix convert_subquery_pathkeys() to handle RelabelType stripping in both directions; it needs this for the same reasons make_sort_from_pathkeys does.
* Last week's patch for make_sort_from_pathkeys wasn't good enough: it hasTom Lane2007-11-08
| | | | | | | to be able to discard top-level RelabelType nodes on *both* sides of the equivalence-class-to-target-list comparison, since make_pathkey_from_sortinfo might either add or remove a RelabelType. Also fix the latter to do the removal case cleanly. Per example from Peter.
* Ensure that EquivalenceClasses generated from ORDER BY keys contain properTom Lane2007-11-02
| | | | | | | | | RelabelType nodes when the sort key is binary-compatible with the sort operator rather than having exactly its input type. We did this correctly for index columns but not sort keys, leading to failure to notice that a varchar index matches an ORDER BY request. This requires a bit more work in make_sort_from_pathkeys, but not anyplace else that I can find. Per bug report and subsequent discussion.
* Avoid considering both sort directions as equally useful for merging.Tom Lane2007-10-27
| | | | | | | | | This doubles the planning workload for mergejoins while not actually accomplishing much. The only useful case is where one of the directions matches the query's ORDER BY request; therefore, put a thumb on the scales in that direction, and otherwise arbitrarily consider only the ASC direction. (This is a lot easier now than it would've been before 8.3, since we have more semantic knowledge embedded in PathKeys now.)
* Change build_index_pathkeys() so that the expressions it builds to representTom Lane2007-05-31
| | | | | | | | | | | | | index key columns always have the type expected by the index's associated operators, ie, we add RelabelType nodes when dealing with binary-compatible index opclasses. This is needed to get varchar indexes to play nicely with the new EquivalenceClass machinery, as per recent gripe from Josh Berkus that CVS HEAD was failing to match a varchar index column to a constant restriction in the query. It seems likely that this change will allow removal of a lot of ugly ad-hoc RelabelType-stripping that the planner has traditionally done while matching expressions to other expressions, but I'll worry about that some other day.
* Avoid running build_index_pathkeys() in situations where there cannotTom Lane2007-04-15
| | | | | | | possibly be any useful pathkeys --- to wit, queries with neither any join clauses nor any ORDER BY request. It's nearly free to check for this case and it saves a useful fraction of the planning time for simple queries.
* Refactor some lsyscache routines to eliminate duplicate code and saveTom Lane2007-01-21
| | | | a couple of syscache lookups in make_pathkey_from_sortinfo().
* Refactor planner's pathkeys data structure to create a separate, explicitTom Lane2007-01-20
| | | | | | | | | | | | | | representation of equivalence classes of variables. This is an extensive rewrite, but it brings a number of benefits: * planner no longer fails in the presence of "incomplete" operator families that don't offer operators for every possible combination of datatypes. * avoid generating and then discarding redundant equality clauses. * remove bogus assumption that derived equalities always use operators named "=". * mergejoins can work with a variety of sort orders (e.g., descending) now, instead of tying each mergejoinable operator to exactly one sort order. * better recognition of redundant sort columns. * can make use of equalities appearing underneath an outer join.
* Support ORDER BY ... NULLS FIRST/LAST, and add ASC/DESC/NULLS FIRST/NULLS LASTTom Lane2007-01-09
| | | | | | | | | | | | per-column options for btree indexes. The planner's support for this is still pretty rudimentary; it does not yet know how to plan mergejoins with nondefault ordering options. The documentation is pretty rudimentary, too. I'll work on improving that stuff later. Note incompatible change from prior behavior: ORDER BY ... USING will now be rejected if the operator is not a less-than or greater-than member of some btree opclass. This prevents less-than-sane behavior if an operator that doesn't actually define a proper sort ordering is selected.
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Teach convert_subquery_pathkeys() to handle the case where theTom Lane2006-08-17
| | | | | | | | subquery's pathkey is a RelabelType applied to something that appears in the subquery's output; for example where the subquery returns a varchar Var and the sort order is shown as that Var coerced to text. This comes up because varchar doesn't have its own sort operator. Per example from Peter Hardman.
* Remove 576 references of include files that were not needed.Bruce Momjian2006-07-14
|
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Fix code that checks to see if an index can be considered to match the query'sTom Lane2006-01-29
| | | | | | | | | requested sort order. It was assuming that build_index_pathkeys always generates a pathkey per index column, which was not true if implied equality deduction had determined that two index columns were effectively equated to each other. Simplest fix seems to be to install an option that causes build_index_pathkeys to support this behavior as well as the original one. Per report from Brian Hirt.
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* Change the division of labor between grouping_planner and query_plannerTom Lane2005-08-27
| | | | | | | | | | | | | | so that the latter estimates the number of groups that grouping will produce. This is needed because it is primarily query_planner that makes the decision between fast-start and fast-finish plans, and in the original coding it was unable to make more than a crude rule-of-thumb choice when the query involved grouping. This revision helps us make saner choices for queries like SELECT ... GROUP BY ... LIMIT, as in a recent example from Mark Kirkwood. Also move the responsibility for canonicalizing sort_pathkeys and group_pathkeys into query_planner; this information has to be available anyway to support the first change, and doing it this way lets us get rid of compare_noncanonical_pathkeys entirely.
* Make use of new list primitives list_append_unique and list_concat_uniqueTom Lane2005-07-28
| | | | where applicable.
* Improve outer-join-deduction logic to be able to propagate equalitiesTom Lane2005-07-03
| | | | through multiple join clauses.
* Teach planner about some cases where a restriction clause can beTom Lane2005-07-02
| | | | | | | | | | | | | | | propagated inside an outer join. In particular, given LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that B = constant at the top level (B might be null instead), but we can nonetheless put a restriction B = constant into the quals for B's relation, since no inner-side rows not meeting that condition can contribute to the final result. Similarly, given FULL JOIN USING (J) WHERE J = constant, we can't directly conclude that either input J variable = constant, but it's OK to push such quals into each input rel. Per recent gripe from Kim Bisgaard. Along the way, remove 'valid_everywhere' flag from RestrictInfo, as on closer analysis it was not being used for anything, and was defined backwards anyway.
* Simplify the planner's join clause management by storing join clausesTom Lane2005-06-09
| | | | | | | | | | of a relation in a flat 'joininfo' list. The former arrangement grouped the join clauses according to the set of unjoined relids used in each; however, profiling on test cases involving lots of joins proves that that data structure is a net loss. It takes more time to group the join clauses together than is saved by avoiding duplicate tests later. It doesn't help any that there are usually not more than one or two clauses per group ...
* Remove planner's private fields from Query struct, and put them intoTom Lane2005-06-05
| | | | | | | | a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.
* Merge Resdom nodes into TargetEntry nodes to simplify code and save aTom Lane2005-04-06
| | | | | | | | | few palloc's. I also chose to eliminate the restype and restypmod fields entirely, since they are redundant with information stored in the node's contained expression; re-examining the expression at need seems simpler and more reliable than trying to keep restype/restypmod up to date. initdb forced due to change in contents of stored rules.
* Add a back-link from IndexOptInfo structs to their parent RelOptInfoTom Lane2005-03-27
| | | | | | structs. There are many places in the planner where we were passing both a rel and an index to subroutines, and now need only pass the index struct. Notationally simpler, and perhaps a tad faster.
* The result of a FULL or RIGHT join can't be assumed to be sorted by theTom Lane2005-01-23
| | | | | left input's sorting, because null rows may be inserted at various points. Per report from Ferenc Lutischá¸n.
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...