aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/prep/prepunion.c
Commit message (Collapse)AuthorAge
* Rethink the GetForeignUpperPaths API (again).Tom Lane2016-07-01
| | | | | | | | | | | | | | In the previous design, the GetForeignUpperPaths FDW callback hook was called before we got around to labeling upper relations with the proper consider_parallel flag; this meant that any upper paths created by an FDW would be marked not-parallel-safe. While that's probably just as well right now, we aren't going to want it to be true forever. Hence, abandon the idea that FDWs should be allowed to inject upper paths before the core code has gotten around to creating the relevant upper relation. (Well, actually they still can, but it's on their own heads how well it works.) Instead, adopt the same API already designed for create_upper_paths_hook: we call GetForeignUpperPaths after each upperrel has been created and populated with the paths the core planner knows how to make.
* Rethink node-level representation of partial-aggregation modes.Tom Lane2016-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | The original coding had three separate booleans representing partial aggregation behavior, which was confusing, unreadable, and error-prone, not least because the booleans weren't always listed in the same order. It was also inadequate for the allegedly-desirable future extension to support intermediate partial aggregation, because we'd need separate markers for serialization and deserialization in such a case. Merge these bools into an enum "AggSplit" to provide symbolic names for the supported operating modes (and document what those are). By assigning the values of the enum constants carefully, we can treat AggSplit values as options bitmasks so that tests of what to do aren't noticeably more expensive than before. While at it, get rid of Aggref.aggoutputtype. That's not needed since commit 59a3795c2 got rid of setrefs.c's special-purpose Aggref comparison code, and it likewise seemed more confusing than helpful. Assorted comment cleanup as well (there's still more that I want to do in that line). catversion bump for change in Aggref node contents. Should be the last one for partial-aggregation changes. Discussion: <29309.1466699160@sss.pgh.pa.us>
* Refactor planning of projection steps that don't need a Result plan node.Tom Lane2016-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original upper-planner-pathification design (commit 3fc6e2d7f5b652b4) assumed that we could always determine during Path formation whether or not we would need a Result plan node to perform projection of a targetlist. That turns out not to work very well, though, because createplan.c still has some responsibilities for choosing the specific target list associated with sorting/grouping nodes (in particular it might choose to add resjunk columns for sorting). We might not ever refactor that --- doing so would push more work into Path formation, which isn't attractive --- and we certainly won't do so for 9.6. So, while create_projection_path and apply_projection_to_path can tell for sure what will happen if the subpath is projection-capable, they can't tell for sure when it isn't. This is at least a latent bug in apply_projection_to_path, which might think it can apply a target to a non-projecting node when the node will end up computing something different. Also, I'd tied the creation of a ProjectionPath node to whether or not a Result is needed, but it turns out that we sometimes need a ProjectionPath node anyway to avoid modifying a possibly-shared subpath node. Callers had to use create_projection_path for such cases, and we added code to them that knew about the potential omission of a Result node and attempted to adjust the cost estimates for that. That was uncertainly correct and definitely ugly/unmaintainable. To fix, have create_projection_path explicitly check whether a Result is needed and adjust its cost estimate accordingly, though it creates a ProjectionPath in either case. apply_projection_to_path is now mostly just an optimized version that can avoid creating an extra Path node when the input is known to not be shared with any other live path. (There is one case that create_projection_path doesn't handle, which is pushing parallel-safe expressions below a Gather node. We could make it do that by duplicating the GatherPath, but there seems no need as yet.) create_projection_plan still has to recheck the tlist-match condition, which means that if the matching situation does get changed by createplan.c then we'll have made a slightly incorrect cost estimate. But there seems no help for that in the near term, and I doubt it occurs often enough, let alone would change planning decisions often enough, to be worth stressing about. I added a "dummypp" field to ProjectionPath to track whether create_projection_path thinks a Result is needed. This is not really necessary as-committed because create_projection_plan doesn't look at the flag; but it seems like a good idea to remember what we thought when forming the cost estimate, if only for debugging purposes. In passing, get rid of the target_parallel parameter added to apply_projection_to_path by commit 54f5c5150. I don't think that's a good idea because it involves callers in what should be an internal decision, and opens us up to missing optimization opportunities if callers think they don't need to provide a valid flag, as most don't. For the moment, this just costs us an extra has_parallel_hazard call when planning a Gather. If that starts to look expensive, I think a better solution would be to teach PathTarget to carry/cache knowledge of parallel-safety of its contents.
* Try again to fix the way the scanjoin_target is used with partial paths.Robert Haas2016-06-17
| | | | | | | | | | | | | | | | Commit 04ae11f62e643e07c411c4935ea6af46cb112aa9 removed some broken code to apply the scan/join target to partial paths, but its theory that this processing step is totally unnecessary turns out to be wrong. Put similar code back again, but this time, check for parallel-safety and avoid in-place modifications to paths that may already have been used as part of some other path. (This is not an entirely elegant solution to this problem; it might be better, for example, to postpone generate_gather_paths for the topmost scan/join rel until after the scan/join target has been applied. But this is not the time for such redesign work.) Amit Kapila and Robert Haas
* Redefine create_upper_paths_hook as being invoked once per upper relation.Tom Lane2016-04-12
| | | | | | Per discussion, this gives potential users of the hook more flexibility, because they can build custom Paths that implement only one stage of upper processing atop core-provided Paths for earlier stages.
* Allow aggregate transition states to be serialized and deserialized.Robert Haas2016-03-29
| | | | | | | | | This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.
* Support parallel aggregation.Robert Haas2016-03-21
| | | | | | | | | Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer. David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.
* Make the upper part of the planner work by generating and comparing Paths.Tom Lane2016-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've been saying we needed to do this for more than five years, and here it finally is. This patch removes the ever-growing tangle of spaghetti logic that grouping_planner() used to use to try to identify the best plan for post-scan/join query steps. Now, there is (nearly) independent consideration of each execution step, and entirely separate construction of Paths to represent each of the possible ways to do that step. We choose the best Path or set of Paths using the same add_path() logic that's been used inside query_planner() for years. In addition, this patch removes the old restriction that subquery_planner() could return only a single Plan. It now returns a RelOptInfo containing a set of Paths, just as query_planner() does, and the parent query level can use each of those Paths as the basis of a SubqueryScanPath at its level. This allows finding some optimizations that we missed before, wherein a subquery was capable of returning presorted data and thereby avoiding a sort in the parent level, making the overall cost cheaper even though delivering sorted output was not the cheapest plan for the subquery in isolation. (A couple of regression test outputs change in consequence of that. However, there is very little change in visible planner behavior overall, because the point of this patch is not to get immediate planning benefits but to create the infrastructure for future improvements.) There is a great deal left to do here. This patch unblocks a lot of planner work that was basically impractical in the old code structure, such as allowing FDWs to implement remote aggregation, or rewriting plan_set_operations() to allow consideration of multiple implementation orders for set operations. (The latter will likely require a full rewrite of plan_set_operations(); what I've done here is only to fix it to return Paths not Plans.) I have also left unfinished some localized refactoring in createplan.c and planner.c, because it was not necessary to get this patch to a working state. Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
* Support multi-stage aggregation.Robert Haas2016-01-20
| | | | | | | | | | | | | | | | | | | Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Get rid of the planner's LateralJoinInfo data structure.Tom Lane2015-12-11
| | | | | | | | | | | | | | | | | | | | I originally modeled this data structure on SpecialJoinInfo, but after commit acfcd45cacb6df23 that looks like a pretty poor decision. All we really need is relid sets identifying laterally-referenced rels; and most of the time, what we want to know about includes indirect lateral references, a case the LateralJoinInfo data was unsuited to compute with any efficiency. The previous commit redefined RelOptInfo.lateral_relids as the transitive closure of lateral references, so that it easily supports checking indirect references. For the places where we really do want just direct references, add a new RelOptInfo field direct_lateral_relids, which is easily set up as a copy of lateral_relids before we perform the transitive closure calculation. Then we can just drop lateral_info_list and LateralJoinInfo and the supporting code. This makes the planner's handling of lateral references noticeably more efficient, and shorter too. Such a change can't be back-patched into stable branches for fear of breaking extensions that might be looking at the planner's data structures; but it seems not too late to push it into 9.5, so I've done so.
* Support GROUPING SETS, CUBE and ROLLUP.Andres Freund2015-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
* Represent columns requiring insert and update privileges indentently.Andres Freund2015-05-08
| | | | | | | | | | | | | | | | | | | Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund
* Allow foreign tables to participate in inheritance.Tom Lane2015-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Foreign tables can now be inheritance children, or parents. Much of the system was already ready for this, but we had to fix a few things of course, mostly in the area of planner and executor handling of row locks. As side effects of this, allow foreign tables to have NOT VALID CHECK constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS. Continuing to disallow these things would've required bizarre and inconsistent special cases in inheritance behavior. Since foreign tables don't enforce CHECK constraints anyway, a NOT VALID one is a complete no-op, but that doesn't mean we shouldn't allow it. And it's possible that some FDWs might have use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops for most. An additional change in support of this is that when a ModifyTable node has multiple target tables, they will all now be explicitly identified in EXPLAIN output, for example: Update on pt1 (cost=0.00..321.05 rows=3541 width=46) Update on pt1 Foreign Update on ft1 Foreign Update on ft2 Update on child3 -> Seq Scan on pt1 (cost=0.00..0.00 rows=1 width=46) -> Foreign Scan on ft1 (cost=100.00..148.03 rows=1170 width=46) -> Foreign Scan on ft2 (cost=100.00..148.03 rows=1170 width=46) -> Seq Scan on child3 (cost=0.00..25.00 rows=1200 width=46) This was done mainly to provide an unambiguous place to attach "Remote SQL" fields, but it is useful for inherited updates even when no foreign tables are involved. Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro Horiguchi, some additional hacking by me
* Improve representation of PlanRowMark.Tom Lane2015-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes two inadequacies of the PlanRowMark representation. First, that the original LockingClauseStrength isn't stored (and cannot be inferred for foreign tables, which always get ROW_MARK_COPY). Since some PlanRowMarks are created out of whole cloth and don't actually have an ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to enum LockingClauseStrength, which is fairly annoying but the alternatives seem worse. This fix allows getting rid of the use of get_parse_rowmark() in FDWs (as per the discussion around commits 462bd95705a0c23b and 8ec8760fc87ecde0), and it simplifies some things elsewhere. Second, that the representation assumed that all child tables in an inheritance hierarchy would use the same RowMarkType. That's true today but will soon not be true. We add an "allMarkTypes" field that identifies the union of mark types used in all a parent table's children, and use that where appropriate (currently, only in preprocess_targetlist()). In passing fix a couple of minor infelicities left over from the SKIP LOCKED patch, notably that _outPlanRowMark still thought waitPolicy is a bool. Catversion bump is required because the numeric values of enum LockingClauseStrength can appear in on-disk rules. Extracted from a much larger patch to support foreign table inheritance; it seemed worth breaking this out, since it's a separable concern. Shigeru Hanada and Etsuro Fujita, somewhat modified by me
* Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[].Tom Lane2015-02-21
| | | | | | | This requires changing quite a few places that were depending on sizeof(HeapTupleHeaderData), but it seems for the best. Michael Paquier, some adjustments by me
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* Implement SKIP LOCKED for row-level locksAlvaro Herrera2014-10-07
| | | | | | | | | | | | | | | | | | This clause changes the behavior of SELECT locking clauses in the presence of locked rows: instead of causing a process to block waiting for the locks held by other processes (or raise an error, with NOWAIT), SKIP LOCKED makes the new reader skip over such rows. While this is not appropriate behavior for general purposes, there are some cases in which it is useful, such as queue-like tables. Catalog version bumped because this patch changes the representation of stored rules. Reviewed by Craig Ringer (based on a previous attempt at an implementation by Simon Riggs, who also provided input on the syntax used in the current patch), David Rowley, and Álvaro Herrera. Author: Thomas Munro
* Fix some more problems with nested append relations.Tom Lane2014-10-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of commit a87c72915 (which later got backpatched as far as 9.1), we're explicitly supporting the notion that append relations can be nested; this can occur when UNION ALL constructs are nested, or when a UNION ALL contains a table with inheritance children. Bug #11457 from Nelson Page, as well as an earlier report from Elvis Pranskevichus, showed that there were still nasty bugs associated with such cases: in particular the EquivalenceClass mechanism could try to generate "join" clauses connecting an appendrel child to some grandparent appendrel, which would result in assertion failures or bogus plans. Upon investigation I concluded that all current callers of find_childrel_appendrelinfo() need to be fixed to explicitly consider multiple levels of parent appendrels. The most complex fix was in processing of "broken" EquivalenceClasses, which are ECs for which we have been unable to generate all the derived equality clauses we would like to because of missing cross-type equality operators in the underlying btree operator family. That code path is more or less entirely untested by the regression tests to date, because no standard opfamilies have such holes in them. So I wrote a new regression test script to try to exercise it a bit, which turned out to be quite a worthwhile activity as it exposed existing bugs in all supported branches. The present patch is essentially the same as far back as 9.2, which is where parameterized paths were introduced. In 9.0 and 9.1, we only need to back-patch a small fragment of commit 5b7b5518d, which fixes failure to propagate out the original WHERE clauses when a broken EC contains constant members. (The regression test case results show that these older branches are noticeably stupider than 9.2+ in terms of the quality of the plans generated; but we don't really care about plan quality in such cases, only that the plan not be outright wrong. A more invasive fix in the older branches would not be a good idea anyway from a plan-stability standpoint.)
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Make security barrier views automatically updatableStephen Frost2014-04-12
| | | | | | | | | | | | | | | | | Views which are marked as security_barrier must have their quals applied before any user-defined quals are called, to prevent user-defined functions from being able to see rows which the security barrier view is intended to prevent them from seeing. Remove the restriction on security barrier views being automatically updatable by adding a new securityQuals list to the RTE structure which keeps track of the quals from security barrier views at each level, independently of the user-supplied quals. When RTEs are later discovered which have securityQuals populated, they are turned into subquery RTEs which are marked as security_barrier to prevent any user-supplied quals being pushed down (modulo LEAKPROOF quals). Dean Rasheed, reviewed by Craig Ringer, Simon Riggs, KaiGai Kohei
* Update copyright for 2014Bruce Momjian2014-01-07
| | | | | Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Get rid of COERCE_DONTCARE.Tom Lane2012-10-12
| | | | We don't need this hack any more.
* Fix PARAM_EXEC assignment mechanism to be safe in the presence of WITH.Tom Lane2012-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The planner previously assumed that parameter Vars having the same absolute query level, varno, and varattno could safely be assigned the same runtime PARAM_EXEC slot, even though they might be different Vars appearing in different subqueries. This was (probably) safe before the introduction of CTEs, but the lazy-evalution mechanism used for CTEs means that a CTE can be executed during execution of some other subquery, causing the lifespan of Params at the same syntactic nesting level as the CTE to overlap with use of the same slots inside the CTE. In 9.1 we created additional hazards by using the same parameter-assignment technology for nestloop inner scan parameters, but it was broken before that, as illustrated by the added regression test. To fix, restructure the planner's management of PlannerParamItems so that items having different semantic lifespans are kept rigorously separated. This will probably result in complex queries using more runtime PARAM_EXEC slots than before, but the slots are cheap enough that this hardly matters. Also, stop generating PlannerParamItems containing Params for subquery outputs: all we really need to do is reserve the PARAM_EXEC slot number, and that now only takes incrementing a counter. The planning code is simpler and probably faster than before, as well as being more correct. Per report from Vik Reykja. These changes will mostly also need to be made in the back branches, but I'm going to hold off on that until after 9.2.0 wraps.
* Split tuple struct defs from htup.h to htup_details.hAlvaro Herrera2012-08-30
| | | | | | | | | | | | This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.
* Split heapam_xlog.h from heapam.hAlvaro Herrera2012-08-28
| | | | | | | | | | | | The heapam XLog functions are used by other modules, not all of which are interested in the rest of the heapam API. With this, we let them get just the XLog stuff in which they are interested and not pollute them with unrelated includes. Also, since heapam.h no longer requires xlog.h, many files that do include heapam.h no longer get xlog.h automatically, including a few headers. This is useful because heapam.h is getting pulled in by execnodes.h, which is in turn included by a lot of files.
* Fix up planner infrastructure to support LATERAL properly.Tom Lane2012-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch takes care of a number of problems having to do with failure to choose valid join orders and incorrect handling of lateral references pulled up from subqueries. Notable changes: * Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to represent join ordering constraints created by lateral references. (I first considered extending the SpecialJoinInfo structure, but the semantics are different enough that a separate data structure seems better.) Extend join_is_legal() and related functions to prevent trying to form unworkable joins, and to ensure that we will consider joins that satisfy lateral references even if the joins would be clauseless. * Fill in the infrastructure needed for the last few types of relation scan paths to support parameterization. We'd have wanted this eventually anyway, but it is necessary now because a relation that gets pulled up out of a UNION ALL subquery may acquire a reltargetlist containing lateral references, meaning that its paths *have* to be parameterized whether or not we have any code that can push join quals down into the scan. * Compute data about lateral references early in query_planner(), and save in RelOptInfo nodes, to avoid repetitive calculations later. * Assorted corner-case bug fixes. There's probably still some bugs left, but this is a lot closer to being real than it was before.
* Replace int2/int4 in C code with int16/int32Peter Eisentraut2012-06-25
| | | | | | | | | | The latter was already the dominant use, and it's preferable because in C the convention is that intXX means XX bits. Therefore, allowing mixed use of int2, int4, int8, int16, int32 is obviously confusing. Remove the typedefs for int2 and int4 for now. They don't seem to be widely used outside of the PostgreSQL source tree, and the few uses can probably be cleaned up by the time this ships.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Revise parameterized-path mechanism to fix assorted issues.Tom Lane2012-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adjusts the treatment of parameterized paths so that all paths with the same parameterization (same set of required outer rels) for the same relation will have the same rowcount estimate. We cache the rowcount estimates to ensure that property, and hopefully save a few cycles too. Doing this makes it practical for add_path_precheck to operate without a rowcount estimate: it need only assume that paths with different parameterizations never dominate each other, which is close enough to true anyway for coarse filtering, because normally a more-parameterized path should yield fewer rows thanks to having more join clauses to apply. In add_path, we do the full nine yards of comparing rowcount estimates along with everything else, so that we can discard parameterized paths that don't actually have an advantage. This fixes some issues I'd found with add_path rejecting parameterized paths on the grounds that they were more expensive than not-parameterized ones, even though they yielded many fewer rows and hence would be cheaper once subsequent joining was considered. To make the same-rowcounts assumption valid, we have to require that any parameterized path enforce *all* join clauses that could be obtained from the particular set of outer rels, even if not all of them are useful for indexing. This is required at both base scans and joins. It's a good thing anyway since the net impact is that join quals are checked at the lowest practical level in the join tree. Hence, discard the original rather ad-hoc mechanism for choosing parameterization joinquals, and build a better one that has a more principled rule for when clauses can be moved. The original rule was actually buggy anyway for lack of knowledge about which relations are part of an outer join's outer side; getting this right requires adding an outer_relids field to RestrictInfo.
* Preserve column names in the execution-time tupledesc for a RowExpr.Tom Lane2012-02-14
| | | | | | | | | | | | | | | The hstore and json datatypes both have record-conversion functions that pay attention to column names in the composite values they're handed. We used to not worry about inserting correct field names into tuple descriptors generated at runtime, but given these examples it seems useful to do so. Observe the nicer-looking results in the regression tests whose results changed. catversion bump because there is a subtle change in requirements for stored rule parsetrees: RowExprs from ROW() constructs now have to include field names. Andrew Dunstan and Tom Lane
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Rearrange planner to save the whole PlannerInfo (subroot) for a subquery.Tom Lane2011-09-03
| | | | | | | | | | | | | | | | | | | | | | | | | Formerly, set_subquery_pathlist and other creators of plans for subqueries saved only the rangetable and rowMarks lists from the lower-level PlannerInfo. But there's no reason not to remember the whole PlannerInfo, and indeed this turns out to simplify matters in a number of places. The immediate reason for doing this was so that the subroot will still be accessible when we're trying to extract column statistics out of an already-planned subquery. But now that I've done it, it seems like a good code-beautification effort in its own right. I also chose to get rid of the transient subrtable and subrowmark fields in SubqueryScan nodes, in favor of having setrefs.c look up the subquery's RelOptInfo. That required changing all the APIs in setrefs.c to pass PlannerInfo not PlannerGlobal, which was a large but quite mechanical transformation. One side-effect not foreseen at the beginning is that this finally broke inheritance_planner's assumption that replanning the same subquery RTE N times would necessarily give interchangeable results each time. That assumption was always pretty risky, but now we really have to make a separate RTE for each instance so that there's a place to carry the separate subroots.
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* Pgindent run before 9.1 beta2.Bruce Momjian2011-06-09
|
* Improve cost estimation for aggregates and window functions.Tom Lane2011-04-24
| | | | | | | | | | | | | | | | The previous coding failed to account properly for the costs of evaluating the input expressions of aggregates and window functions, as seen in a recent gripe from Claudio Freire. (I said at the time that it wasn't counting these costs at all; but on closer inspection, it was effectively charging these costs once per output tuple. That is completely wrong for aggregates, and not exactly right for window functions either.) There was also a hard-wired assumption that aggregates and window functions had procost 1.0, which is now fixed to respect the actual cataloged costs. The costing of WindowAgg is still pretty bogus, since it doesn't try to estimate the effects of spilling data to disk, but that seems like a separate issue.
* Clean up collation processing in prepunion.c.Tom Lane2011-04-16
| | | | | | | | | | | | This area was a few bricks shy of a load, and badly under-commented too. We have to ensure that the generated targetlist entries for a set-operation node expose the correct collation for each entry, since higher-level processing expects the tlist to reflect the true ordering of the plan's output. This hackery wouldn't be necessary if SortGroupClause carried collation info ... but making it do so would inject more pain in the parser than would be saved here. Still, we might want to rethink that sometime.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Pass collation to makeConst() instead of looking it up internally.Tom Lane2011-03-25
| | | | | | | | | In nearly all cases, the caller already knows the correct collation, and in a number of places, the value the caller has handy is more correct than the default for the type would be. (In particular, this patch makes it significantly less likely that eval_const_expressions will result in changing the exposed collation of an expression.) So an internal lookup is both expensive and wrong.
* Fix improper matching of resjunk column names for FOR UPDATE in subselect.Tom Lane2011-02-09
| | | | | | | | | | | | | | | | Flattening of subquery range tables during setrefs.c could lead to the rangetable indexes in PlanRowMark nodes not matching up with the column names previously assigned to the corresponding resjunk ctid (resp. tableoid or wholerow) columns. Typical symptom would be either a "cannot extract system attribute from virtual tuple" error or an Assert failure. This wasn't a problem before 9.0 because we didn't support FOR UPDATE below the top query level, and so the final flattening could never renumber an RTE that was relevant to FOR UPDATE. Fix by using a plan-tree-wide unique number for each PlanRowMark to label the associated resjunk columns, so that the number need not change during flattening. Per report from David Johnston (though I'm darned if I can see how this got past initial testing of the relevant code). Back-patch to 9.0.
* Per-column collation supportPeter Eisentraut2011-02-08
| | | | | | | | This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
* Fix PlanRowMark/ExecRowMark structures to handle inheritance correctly.Tom Lane2011-01-12
| | | | | | | | | | | | | | | | | | In an inherited UPDATE/DELETE, each target table has its own subplan, because it might have a column set different from other targets. This means that the resjunk columns we add to support EvalPlanQual might be at different physical column numbers in each subplan. The EvalPlanQual rewrite I did for 9.0 failed to account for this, resulting in possible misbehavior or even crashes during concurrent updates to the same row, as seen in a recent report from Gordon Shannon. Revise the data structure so that we track resjunk column numbers separately for each subplan. I also chose to move responsibility for identifying the physical column numbers back to executor startup, instead of assuming that numbers derived during preprocess_targetlist would stay valid throughout subsequent massaging of the plan. That's a bit slower, so we might want to consider undoing it someday; but it would complicate the patch considerably and didn't seem justifiable in a bug fix that has to be back-patched to 9.0.
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Reimplement planner's handling of MIN/MAX aggregate optimization.Tom Lane2010-11-04
| | | | | | | | | | | | | | | | | | | | | | Per my recent proposal, get rid of all the direct inspection of indexes and manual generation of paths in planagg.c. Instead, set up EquivalenceClasses for the aggregate argument expressions, and let the regular path generation logic deal with creating paths that can satisfy those sort orders. This makes planagg.c a bit more visible to the rest of the planner than it was originally, but the approach is basically a lot cleaner than before. A major advantage of doing it this way is that we get MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans) practically for free, whereas in the old way we'd have had to add a whole lot more duplicative logic. One small disadvantage of this approach is that MIN/MAX aggregates can no longer exploit partial indexes having an "x IS NOT NULL" predicate, unless that restriction or something that implies it is specified in the query. The previous implementation was able to use the added "x IS NOT NULL" condition as an extra predicate proof condition, but in this version we rely entirely on indexes that are considered usable by the main planning process. That seems a fair tradeoff for the simplicity and functionality gained.
* Oops, missed one fix for EquivalenceClass rearrangement.Tom Lane2010-10-29
| | | | | | | | | | Now that we're expecting a mergeclause's left_ec/right_ec to persist from the initial assignments, we can't just blithely zero these out when transforming such a clause in adjust_appendrel_attrs. But really it should be okay to keep the parent's values, since a child table's derived Var ought to be equivalent to the parent Var for all EquivalenceClass purposes. (Indeed, I'm wondering whether we couldn't find a way to dispense with add_child_rel_equivalences altogether. But this is wrong in any case.)
* Teach CLUSTER to use seqscan-and-sort when it's faster than indexscan.Tom Lane2010-10-07
| | | | | | ... or at least, when the planner's cost estimates say it will be faster. Leonardo Francalanci, reviewed by Itagaki Takahiro and Tom Lane
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* pgindent run for 9.0, second runBruce Momjian2010-07-06
|
* Fix incorrect patch that removed permission checks on inheritance childTom Lane2010-05-11
| | | | | tables --- the parent table no longer got checked, either. Per bug #5458 from Takahiro Itagaki.