| Commit message (Collapse) | Author | Age |
... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.
This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.
The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.
Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage. The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting. The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.
A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
Needs a catversion bump because stored rules may change.
Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically the tlist and rti of the pseudo "excluded" relation weren't
properly treated by expression_tree_walker, which lead to errors when
excluded was referenced inside a rule because the varnos where not
properly adjusted. Similar omissions in OffsetVarNodes and
expression_tree_mutator had less impact, but should obviously be fixed
nonetheless.
A couple tests of for ON CONFLICT UPDATE into INSERT rule bearing
relations have been added.
In passing I updated a couple comments.
|
|
|
|
|
|
|
|
| |
We did not need a location tag on NullTest or BooleanTest before, because
no error messages referred directly to their locations. That's planned
to change though, so add these fields in a separate housekeeping commit.
Catversion bump because stored rules may change.
|
|
|
|
| |
Backpatch certain files through 9.0
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a way of iterating through the members of a bitmapset
nondestructively, unlike the old way with bms_first_member(). While
bms_next_member() is very slightly slower than bms_first_member()
(at least for typical-size bitmapsets), eliminating the need to palloc
and pfree a temporary copy of the target bitmapset is a significant win.
So this method should be preferred in all cases where a temporary copy
would be necessary.
Tom Lane, with suggestions from Dean Rasheed and David Rowley
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
|
|
|
|
|
| |
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
|
|
|
|
|
| |
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
|
|
|
|
|
|
|
|
| |
Commit 95ef6a344821655ce4d0a74999ac49dd6af6d342 removed the
ability to create rules on an individual column as of 7.3, but
left some residual code which has since been useless. This cleans
up that dead code without any change in behavior other than
dropping the useless column from the catalog.
|
|
|
|
|
| |
This is the first run of the Perl-based pgindent script. Also update
pgindent instructions.
|
|
|
|
|
| |
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function currently lacks the option to throw error if the provided
targetlist doesn't have any matching entry for a Var to be replaced.
Two of the four existing call sites would be better off with an error,
as would the usage in the pending auto-updatable-views patch, so it seems
past time to extend the API to support that. To do so, replace the "event"
parameter (historically of type CmdType, though it was declared plain int)
with a special-purpose enum type.
It's unclear whether this function might be called by third-party code.
Since many C compilers wouldn't warn about a call site continuing to use
the old calling convention, rename the function to forcibly break any
such code that hasn't been updated. The old name was none too well chosen
anyhow.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch takes care of a number of problems having to do with failure
to choose valid join orders and incorrect handling of lateral references
pulled up from subqueries. Notable changes:
* Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to
represent join ordering constraints created by lateral references.
(I first considered extending the SpecialJoinInfo structure, but the
semantics are different enough that a separate data structure seems
better.) Extend join_is_legal() and related functions to prevent trying
to form unworkable joins, and to ensure that we will consider joins that
satisfy lateral references even if the joins would be clauseless.
* Fill in the infrastructure needed for the last few types of relation scan
paths to support parameterization. We'd have wanted this eventually
anyway, but it is necessary now because a relation that gets pulled up out
of a UNION ALL subquery may acquire a reltargetlist containing lateral
references, meaning that its paths *have* to be parameterized whether or
not we have any code that can push join quals down into the scan.
* Compute data about lateral references early in query_planner(), and save
in RelOptInfo nodes, to avoid repetitive calculations later.
* Assorted corner-case bug fixes.
There's probably still some bugs left, but this is a lot closer to being
real than it was before.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Formerly we relied on checking after-the-fact to see if an expression
contained aggregates, window functions, or sub-selects when it shouldn't.
This is grotty, easily forgotten (indeed, we had forgotten to teach
DefineIndex about rejecting window functions), and none too efficient
since it requires extra traversals of the parse tree. To improve matters,
define an enum type that classifies all SQL sub-expressions, store it in
ParseState to show what kind of expression we are currently parsing, and
make transformAggregateCall, transformWindowFuncCall, and transformSubLink
check the expression type and throw error if the type indicates the
construct is disallowed. This allows removal of a large number of ad-hoc
checks scattered around the code base. The enum type is sufficiently
fine-grained that we can still produce error messages of at least the
same specificity as before.
Bringing these error checks together revealed that we'd been none too
consistent about phrasing of the error messages, so standardize the wording
a bit.
Also, rewrite checking of aggregate arguments so that it requires only one
traversal of the arguments, rather than up to three as before.
In passing, clean up some more comments left over from add_missing_from
support, and annotate some tests that I think are dead code now that that's
gone. (I didn't risk actually removing said dead code, though.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a CHECK constraint or index definition contained a whole-row Var (that
is, "table.*"), an attempt to copy that definition via CREATE TABLE LIKE or
table inheritance produced incorrect results: the copied Var still claimed
to have the rowtype of the source table, rather than the created table.
For the LIKE case, it seems reasonable to just throw error for this
situation, since the point of LIKE is that the new table is not permanently
coupled to the old, so there's no reason to assume its rowtype will stay
compatible. In the inheritance case, we should ideally allow such
constraints, but doing so will require nontrivial refactoring of CREATE
TABLE processing (because we'd need to know the OID of the new table's
rowtype before we adjust inherited CHECK constraints). In view of the lack
of previous complaints, that doesn't seem worth the risk in a back-patched
bug fix, so just make it throw error for the inheritance case as well.
Along the way, replace change_varattnos_of_a_node() with a more robust
function map_variable_attnos(), which is capable of being extended to
handle insertion of ConvertRowtypeExpr whenever we get around to fixing
the inheritance case nicely, and in the meantime it returns a failure
indication to the caller so that a helpful message with some context can be
thrown. Also, this code will do the right thing with subselects (if we
ever allow them in CHECK or indexes), and it range-checks varattnos before
using them to index into the map array.
Per report from Sergey Konoplev. Back-patch to all supported branches.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Formerly, set_subquery_pathlist and other creators of plans for subqueries
saved only the rangetable and rowMarks lists from the lower-level
PlannerInfo. But there's no reason not to remember the whole PlannerInfo,
and indeed this turns out to simplify matters in a number of places.
The immediate reason for doing this was so that the subroot will still be
accessible when we're trying to extract column statistics out of an
already-planned subquery. But now that I've done it, it seems like a good
code-beautification effort in its own right.
I also chose to get rid of the transient subrtable and subrowmark fields in
SubqueryScan nodes, in favor of having setrefs.c look up the subquery's
RelOptInfo. That required changing all the APIs in setrefs.c to pass
PlannerInfo not PlannerGlobal, which was a large but quite mechanical
transformation.
One side-effect not foreseen at the beginning is that this finally broke
inheritance_planner's assumption that replanning the same subquery RTE N
times would necessarily give interchangeable results each time. That
assumption was always pretty risky, but now we really have to make a
separate RTE for each instance so that there's a place to carry the
separate subroots.
|
|
|
|
|
|
|
|
|
| |
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be. (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.) So an internal lookup
is both expensive and wrong.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per my recent proposal, get rid of all the direct inspection of indexes
and manual generation of paths in planagg.c. Instead, set up
EquivalenceClasses for the aggregate argument expressions, and let the
regular path generation logic deal with creating paths that can satisfy
those sort orders. This makes planagg.c a bit more visible to the rest
of the planner than it was originally, but the approach is basically a lot
cleaner than before. A major advantage of doing it this way is that we get
MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans)
practically for free, whereas in the old way we'd have had to add a whole
lot more duplicative logic.
One small disadvantage of this approach is that MIN/MAX aggregates can no
longer exploit partial indexes having an "x IS NOT NULL" predicate, unless
that restriction or something that implies it is specified in the query.
The previous implementation was able to use the added "x IS NOT NULL"
condition as an extra predicate proof condition, but in this version we
rely entirely on indexes that are considered usable by the main planning
process. That seems a fair tradeoff for the simplicity and functionality
gained.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete. The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update. So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.
In passing, this patch corrects the names of some columns in the
information_schema.triggers view. It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.
Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
it works just as well to have them be ordinary identifiers, and this gets rid
of a number of ugly special cases. Plus we aren't interfering with non-rule
usage of these names.
catversion bump because the names change internally in stored rules.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a lot of strange behaviors that occurred in join cases. We now identify the
"current" row for every joined relation in UPDATE, DELETE, and SELECT FOR
UPDATE/SHARE queries. If an EvalPlanQual recheck is necessary, we jam the
appropriate row into each scan node in the rechecking plan, forcing it to emit
only that one row. The former behavior could rescan the whole of each joined
relation for each recheck, which was terrible for performance, and what's much
worse could result in duplicated output tuples.
Also, the original implementation of EvalPlanQual could not re-use the recheck
execution tree --- it had to go through a full executor init and shutdown for
every row to be tested. To avoid this overhead, I've associated a special
runtime Param with each LockRows or ModifyTable plan node, and arranged to
make every scan node below such a node depend on that Param. Thus, by
signaling a change in that Param, the EPQ machinery can just rescan the
already-built test plan.
This patch also adds a prohibition on set-returning functions in the
targetlist of SELECT FOR UPDATE/SHARE. This is needed to avoid the
duplicate-output-tuple problem. It seems fairly reasonable since the
other restrictions on SELECT FOR UPDATE are meant to ensure that there
is a unique correspondence between source tuples and result tuples,
which an output SRF destroys as much as anything else does.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that's generated for a whole-row Var referencing the subquery, when the
subquery is in the nullable side of an outer join. The previous coding
instead put PlaceHolderVars around the elements of the RowExpr. The effect
was that when the outer join made the subquery outputs go to null, the
whole-row Var produced ROW(NULL,NULL,...) rather than just NULL. There
are arguments afoot about whether those things ought to be semantically
indistinguishable, but for the moment they are not entirely so, and the
planner needs to take care that its machinations preserve the difference.
Per bug #5025.
Making this feasible required refactoring ResolveNew() to allow more caller
control over what is substituted for a Var. I chose to make ResolveNew()
a wrapper around a new general-purpose function replace_rte_variables().
I also fixed the ancient bogosity that ResolveNew might fail to set
a query's hasSubLinks field after inserting a SubLink in it. Although
all current callers make sure that happens anyway, we've had bugs of that
sort before, and it seemed like a good time to install a proper solution.
Back-patch to 8.4. The problem can be demonstrated clear back to 8.0,
but the fix would be too invasive in earlier branches; not to mention
that people may be depending on the subtly-incorrect behavior. The
8.4 series is new enough that fixing this probably won't cause complaints,
but it might in older branches. Also, 8.4 shows the incorrect behavior
in more cases than older branches do, because it is able to flatten
subqueries in more cases.
|
|
|
|
| |
provided by Andrew.
|
|
|
|
|
|
|
|
|
| |
making pull_up_sublinks() construct a full-blown JoinExpr tree representation
of IN/EXISTS SubLinks that it is able to convert to semi or anti joins.
This makes pull_up_sublinks() a shade more complex, but the gain in semantic
clarity is worth it. I still have more to do in this area to address the
previously-discussed problems, but this commit in itself fixes at least one
bug in HEAD, as shown by added regression test case.
|
| |
|
|
|
|
| |
Hitoshi Harada, with some kibitzing from Heikki and Tom.
|
|
|
|
|
|
|
|
|
| |
return the tableoid as well as the ctid for any FOR UPDATE targets that
have child tables. All child tables are listed in the ExecRowMark list,
but the executor just skips the ones that didn't produce the current row.
Curiously, this longstanding restriction doesn't seem to have been documented
anywhere; so no doc changes.
|
|
|
|
|
|
|
|
|
|
| |
until vars are distributed to rels during query_planner() startup. We don't
really need it before that, and not building it early has some advantages.
First, we don't need to put it through the various preprocessing steps, which
saves some cycles and eliminates the need for a number of routines to support
PlaceHolderInfo nodes at all. Second, this means one less unused plan for any
sub-SELECT appearing in a placeholder's expression, since we don't build
placeholder_list until after sublink expansion is complete.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that represent some expression that we desire to compute below the top level
of the plan, and then let that value "bubble up" as though it were a plain
Var (ie, a column value).
The immediate application is to allow sub-selects to be flattened even when
they are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't properly
go to NULL when evaluated above the outer join. Now, we wrap it in a
PlaceHolderVar and arrange for the actual evaluation to occur below the outer
join. When the resulting Var bubbles up through the join, it will be set to
NULL if necessary, yielding the correct results. This fixes a planner
limitation that's existed since 7.1.
In future we might want to use this mechanism to re-introduce some form of
Hellerstein's "expensive functions" optimization, ie place the evaluation of
an expensive function at the most suitable point in the plan tree.
|
|
|
|
|
|
|
|
|
|
|
|
| |
the column alias names of the RTE referenced by the Var to the RowExpr.
This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes
referencing such a construct. Per my recent bug report.
Adding a field to RowExpr forces initdb (because of stored rules changes)
so this solution is not back-patchable; which is unfortunate because 8.2
and 8.3 have this issue. But it only affects EXPLAIN for some pretty odd
corner cases, so we can probably live without a solution for the back
branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.
There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly. But let's land
the patch now so we can get on with other development.
Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
|
|
|
|
|
| |
There are still some weak spots around JOIN USING and relation alias lists,
but most errors reported within backend/parser/ now have locations.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
most node types used in expression trees (both before and after parse
analysis). This allows us to place an error cursor in many situations
where we formerly could not, because the information wasn't available
beyond the very first level of parse analysis. There's a fair amount
of work still to be done to persuade individual ereport() calls to actually
include an error location, but this gets the initdb-forcing part of the
work out of the way; and the situation is already markedly better than
before for complaints about unimplementable implicit casts, such as
CASE and UNION constructs with incompatible alternative data types.
Per my proposal of a few days ago.
|
|
|
|
|
|
| |
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend. There's probably more that should be done along this line,
but this is a start anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
subqueries into the same thing you'd have gotten from IN (except always with
unknownEqFalse = true, so as to get the proper semantics for an EXISTS).
I believe this fixes the last case within CVS HEAD in which an EXISTS could
give worse performance than an equivalent IN subquery.
The tricky part of this is that if the upper query probes the EXISTS for only
a few rows, the hashing implementation can actually be worse than the default,
and therefore we need to make a cost-based decision about which way to use.
But at the time when the planner generates plans for subqueries, it doesn't
really know how many times the subquery will be executed. The least invasive
solution seems to be to generate both plans and postpone the choice until
execution. Therefore, in a query that has been optimized this way, EXPLAIN
will show two subplans for the EXISTS, of which only one will actually get
executed.
There is a lot more that could be done based on this infrastructure: in
particular it's interesting to consider switching to the hash plan if we start
out using the non-hashed plan but find a lot more upper rows going by than we
expected. I have therefore left some minor inefficiencies in place, such as
initializing both subplans even though we will currently only use one.
|
|
|
|
|
|
|
| |
parent, not only those with RangeTblRefs. We need them in ExecCheckRTPerms.
Report by Brendan O'Shea. Back-patch to 8.2, where pull_up_simple_union_all
was introduced.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the old JOIN_IN code, but antijoins are new functionality.) Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins. Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation. That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch. So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
null::char(3) to a simple Const node. (It already worked for non-null values,
but not when we skipped evaluation of a strict coercion function.) This
prevents loss of typmod knowledge in situations such as exhibited in bug
#3598. Unfortunately there seems no good way to fix that bug in 8.1 and 8.2,
because they simply don't carry a typmod for a plain Const node.
In passing I made all the other callers of makeNullConst supply "real" typmod
values too, though I think it probably doesn't matter anywhere else.
|
|
|
|
|
|
|
|
|
| |
Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once
have been a reason to disallow that, but it seems to work now, and it's
really rather necessary if you want to select a row via a cursor and then
update it in a concurrent-safe fashion.
Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.
|
|
|
|
| |
back-stamped for this.
|
| |
|
| |
|