aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/path/joinpath.c
Commit message (Collapse)AuthorAge
* Fix best_inner_indexscan to return both the cheapest-total-cost andTom Lane2007-05-22
| | | | | | | | | | | | cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider both of these (when different) as the inside of a nestloop join. The original design was based on the assumption that indexscan paths always have negligible startup cost, and so total cost is the only important figure of merit; an assumption that's obviously broken by bitmap indexscans. This oversight could lead to choosing poor plans in cases where fast-start behavior is more important than total cost, such as LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from Chuck D.
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Fix an oversight in mergejoin planning: the planner would reject aTom Lane2006-08-17
| | | | | | | | | mergejoin possibility where the inner rel was less well sorted than the outer (ie, it matches some but not all of the merge clauses that can work with the outer), if the inner path in question is also the overall cheapest path for its rel. This is an old bug, but I'm not sure it's worth back-patching, because it's such a corner case. Noted while investigating a test case from Peter Hardman.
* Remove 576 references of include files that were not needed.Bruce Momjian2006-07-14
|
* Make the planner estimate costs for nestloop inner indexscans on the basisTom Lane2006-06-06
| | | | | | | | | | | | | | | | | | | | | that the Mackert-Lohmann formula applies across all the repetitions of the nestloop, not just each scan independently. We use the M-L formula to estimate the number of pages fetched from the index as well as from the table; that isn't what it was designed for, but it seems reasonably applicable anyway. This makes large numbers of repetitions look much cheaper than before, which accords with many reports we've received of overestimation of the cost of a nestloop. Also, change the index access cost model to charge random_page_cost per index leaf page touched, while explicitly not counting anything for access to metapage or upper tree pages. This may all need tweaking after we get some field experience, but in simple tests it seems to be giving saner results than before. The main thing is to get the infrastructure in place to let cost_index() and amcostestimate functions take repeated scans into account at all. Per my recent proposal. Note: this patch changes pg_proc.h, but I did not force initdb because the changes are basically cosmetic --- the system does not look into pg_proc to decide how to call an index amcostestimate function, and there's no way to call such a function from SQL at all.
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Improve my initial, rather hacky implementation of joins to appendTom Lane2006-02-05
| | | | | | | | relations: fix the executor so that we can have an Append plan on the inside of a nestloop and still pass down outer index keys to index scans within the Append, then generate such plans as if they were regular inner indexscans. This avoids the need to evaluate the outer relation multiple times.
* Fix constraint exclusion to work in inherited UPDATE/DELETE queriesTom Lane2006-02-04
| | | | | | | | | ... in fact, it will be applied now in any query whatsoever. I'm still a bit concerned about the cycles that might be expended in failed proof attempts, but given that CE is turned off by default, it's the user's choice whether to expend those cycles or not. (Possibly we should change the simple bool constraint_exclusion parameter to something more fine-grained?)
* Teach planner to convert simple UNION ALL subqueries into append relations,Tom Lane2006-02-03
| | | | | | | | | thereby sharing code with the inheritance case. This puts the UNION-ALL-view approach to partitioned tables on par with inheritance, so far as constraint exclusion is concerned: it works either way. (Still need to update the docs to say so.) The definition of "simple UNION ALL" is a little simpler than I would like --- basically the union arms can only be SELECT * FROM foo --- but it's good enough for partitioned-table cases.
* Restructure planner's handling of inheritance. Rather than processingTom Lane2006-01-31
| | | | | | | | | | | | | inheritance trees on-the-fly, which pretty well constrained us to considering only one way of planning inheritance, expand inheritance sets during the planner prep phase, and build a side data structure that can be consulted later to find which RTEs are members of which inheritance sets. As proof of concept, use the data structure to plan joins against inheritance sets more efficiently: we can now use indexes on the set members in inner-indexscan joins. (The generated plans could be improved further, but it'll take some executor changes.) This data structure will also support handling UNION ALL subqueries in the same way as inheritance sets, but that aspect of it isn't finished yet.
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Fix longstanding bug that would sometimes let the planner generate a bad planTom Lane2005-10-25
| | | | | | | | | | for an outer join; symptom is bogus error "RIGHT JOIN is only supported with merge-joinable join conditions". Problem was that select_mergejoin_clauses did its tests in the wrong order. We need to force left join not right join for a merge join when there are non-mergeable join clauses; but the test for this only accounted for mergejoinability of the clause operator, and not whether the left and right Vars were of the proper relations. Per report from Jean-Pierre Pelletier.
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* Remove planner's private fields from Query struct, and put them intoTom Lane2005-06-05
| | | | | | | | a new PlannerInfo struct, which is passed around instead of the bare Query in all the planning code. This commit is essentially just a code-beautification exercise, but it does open the door to making larger changes to the planner data structures without having to muck with the widely-known Query struct.
* Previous fix for "x FULL JOIN y ON true" failed to handle the caseTom Lane2005-05-24
| | | | | | | where there was also a WHERE-clause restriction that applied to the join. The check on restrictlist == NIL is really unnecessary anyway, because select_mergejoin_clauses already checked for and complained about any unmergejoinable join clauses. So just take it out.
* Create executor and planner-backend support for decoupled heap and indexTom Lane2005-04-19
| | | | | | | | | scans, using in-memory tuple ID bitmaps as the intermediary. The planner frontend (path creation and cost estimation) is not there yet, so none of this code can be executed. I have tested it using some hacked planner code that is far too ugly to see the light of day, however. Committing now so that the bulk of the infrastructure changes go in before the tree drifts under me.
* The result of a FULL or RIGHT join can't be assumed to be sorted by theTom Lane2005-01-23
| | | | | left input's sorting, because null rows may be inserted at various points. Per report from Ferenc Lutischá¸n.
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
* Pgindent run for 8.0.Bruce Momjian2004-08-29
|
* Update copyright to 2004.Bruce Momjian2004-08-29
|
* Use the new List API function names throughout the backend, and disable theNeil Conway2004-05-30
| | | | | list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.
* Reimplement the linked list data structure used throughout the backend.Neil Conway2004-05-26
| | | | | | | | | | | | | | | | In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.
* Support FULL JOIN with no join clauses, such as X FULL JOIN Y ON TRUE.Tom Lane2004-04-06
| | | | | | | That particular corner case is not exactly compelling, but given 7.4's ability to discard redundant join clauses, it is possible for the situation to arise from queries that are not so obviously silly. Per bug report of 6-Apr-04.
* Add the ability to extract OR indexscan conditions from OR-of-ANDTom Lane2004-01-05
| | | | | | | join conditions in which each OR subclause includes a constraint on the same relation. This implements the other useful side-effect of conversion to CNF format, without its unpleasant side-effects. As per pghackers discussion of a few weeks ago.
* Adjust the definition of RestrictInfo's left_relids and right_relidsTom Lane2003-12-30
| | | | | | | | | | fields: now they are valid whenever the clause is a binary opclause, not only when it is a potential join clause (there is a new boolean field canjoin to signal the latter condition). This lets us avoid recomputing the relid sets over and over while examining indexes. Still more work to do to make this as useful as it could be, because there are places that could use the info but don't have access to the RestrictInfo node.
* $Header: -> $PostgreSQL Changes ...PostgreSQL Daemon2003-11-29
|
* Message editing: remove gratuitous variations in message wording, standardizePeter Eisentraut2003-09-25
| | | | | terms, add some clarifications, fix some untranslatable attempts at dynamic message building.
* Update copyrights to 2003.Bruce Momjian2003-08-04
|
* pgindent run.Bruce Momjian2003-08-04
|
* Error message editing in backend/optimizer, backend/rewrite.Tom Lane2003-07-25
|
* Replace planner's representation of relation sets, per pghackers discussion.Tom Lane2003-02-08
| | | | | Instead of Lists of integers, we now store variable-length bitmap sets. This should be faster as well as less error-prone.
* Upgrade cost estimation for joins, per discussion with Bradley Baetz.Tom Lane2003-01-27
| | | | | | | Try to model the effect of rescanning input tuples in mergejoins; account for JOIN_IN short-circuiting where appropriate. Also, recognize that mergejoin and hashjoin clauses may now be more than single operator calls, so we have to charge appropriate execution costs.
* IN clauses appearing at top level of WHERE can now be handled as joins.Tom Lane2003-01-20
| | | | | | | | | | There are two implementation techniques: the executor understands a new JOIN_IN jointype, which emits at most one matching row per left-hand row, or the result of the IN's sub-select can be fed through a DISTINCT filter and then joined as an ordinary relation. Along the way, some minor code cleanup in the optimizer; notably, break out most of the jointree-rearrangement preprocessing in planner.c and put it in a new file prep/prepjointree.c.
* Allow merge and hash joins to occur on arbitrary expressions (anything notTom Lane2003-01-15
| | | | | | | | | | | | | | | | containing a volatile function), rather than only on 'Var = Var' clauses as before. This makes it practical to do flatten_join_alias_vars at the start of planning, which in turn eliminates a bunch of klugery inside the planner to deal with alias vars. As a free side effect, we now detect implied equality of non-Var expressions; for example in SELECT ... WHERE a.x = b.y and b.y = 42 we will deduce a.x = 42 and use that as a restriction qual on a. Also, we can remove the restriction introduced 12/5/02 to prevent pullup of subqueries whose targetlists contain sublinks. Still TODO: make statistical estimation routines in selfuncs.c and costsize.c smarter about expressions that are more complex than plain Vars. The need for this is considerably greater now that we have to be able to estimate the suitability of merge and hash join techniques on such expressions.
* Be more realistic about plans involving Materialize nodes: take theirTom Lane2002-11-30
| | | | cost into account while planning.
* Upgrade planner and executor to allow multiple hash keys for a hash join,Tom Lane2002-11-30
| | | | | | instead of only one. This should speed up planning (only one hash path to consider for a given pair of relations) as well as allow more effective hashing, when there are multiple hashable joinclauses.
* Restructure planning of nestloop inner indexscans so that the set of usableTom Lane2002-11-24
| | | | | | | | | | | joinclauses is determined accurately for each join. Formerly, the code only considered joinclauses that used all of the rels from the outer side of the join; thus for example FROM (a CROSS JOIN b) JOIN c ON (c.f1 = a.x AND c.f2 = b.y) could not exploit a two-column index on c(f1,f2), since neither of the qual clauses would be in the joininfo list it looked in. The new code does this correctly, and also is able to eliminate redundant clauses, thus fixing the problem noted 24-Oct-02 by Hans-Jürgen Schönig.
* pgindent run.Bruce Momjian2002-09-04
|
* Remove sys/types.h in files that include postgres.h, and hence c.h,Bruce Momjian2002-09-02
| | | | because c.h has sys/types.h.
* Update copyright to 2002.Bruce Momjian2002-06-20
|
* Restructure representation of join alias variables. An explicit JOINTom Lane2002-03-12
| | | | | | | | | | | | | | | now has an RTE of its own, and references to its outputs now are Vars referencing the JOIN RTE, rather than CASE-expressions. This allows reverse-listing in ruleutils.c to use the correct alias easily, rather than painfully reverse-engineering the alias namespace as it used to do. Also, nested FULL JOINs work correctly, because the result of the inner joins are simple Vars that the planner can cope with. This fixes a bug reported a couple times now, notably by Tatsuo on 18-Nov-01. The alias Vars are expanded into COALESCE expressions where needed at the very end of planning, rather than during parsing. Also, beginnings of support for showing plan qualifier expressions in EXPLAIN. There are probably still cases that need work. initdb forced due to change of stored-rule representation.
* sort_inner_and_outer needs a check to ensure that it's consumed all theTom Lane2001-11-11
| | | | | | mergeclauses in RIGHT/FULL join cases, just like the other routines have. I'm not quite sure why I thought it didn't need one --- but Nick Fankhauser's recent bug report proves that it does.
* pgindent run on all C files. Java run to follow. initdb/regressionBruce Momjian2001-10-25
| | | | tests pass.
* Further work on making use of new statistics in planner. Adjust APIsTom Lane2001-06-05
| | | | | | | | | of costsize.c routines to pass Query root, so that costsize can figure more things out by itself and not be so dependent on its callers to tell it everything it needs to know. Use selectivity of hash or merge clause to estimate number of tuples processed internally in these joins (this is more useful than it would've been before, since eqjoinsel is somewhat more accurate than before).
* Rewrite of planner statistics-gathering code. ANALYZE is now available asTom Lane2001-05-07
| | | | | | | | | | | | | | | | | a separate statement (though it can still be invoked as part of VACUUM, too). pg_statistic redesigned to be more flexible about what statistics are stored. ANALYZE now collects a list of several of the most common values, not just one, plus a histogram (not just the min and max values). Random sampling is used to make the process reasonably fast even on very large tables. The number of values and histogram bins collected is now user-settable via an ALTER TABLE command. There is more still to do; the new stats are not being used everywhere they could be in the planner. But the remaining changes for this project should be localized, and the behavior is already better than before. A not-very-related change is that sorting now makes use of btree comparison routines if it can find one, rather than invoking '<' twice.
* Prevent generation of invalid plans for RIGHT or FULL joins with multipleTom Lane2001-04-15
| | | | | | | | join clauses. The mergejoin executor wants all the join clauses to appear as merge quals, not as extra joinquals, for these kinds of joins. But the planner would consider plans in which partially-sorted input paths were used, leading to only some of the join clauses becoming merge quals. This is fine for inner/left joins, not fine for right/full joins.
* pgindent run. Make it all clean.Bruce Momjian2001-03-22
|
* Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group.Bruce Momjian2001-01-24
|
* Planner speedup hacking. Avoid saving useless pathkeys, so that pathTom Lane2000-12-14
| | | | | | | | | | | | comparison does not consider paths different when they differ only in uninteresting aspects of sort order. (We had a special case of this consideration for indexscans already, but generalize it to apply to ordered join paths too.) Be stricter about what is a canonical pathkey to allow faster pathkey comparison. Cache canonical pathkeys and dispersion stats for left and right sides of a RestrictInfo's clause, to avoid repeated computation. Total speedup will depend on number of tables in a query, but I see about 4x speedup of planning phase for a sample seven-table query.
* Ensure that mergejoin plan will be considered for FULL OUTER JOIN evenTom Lane2000-11-23
| | | | | if enable_mergejoin = OFF. Must do this, because we have no other implementation method for full joins.