| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we've nailed down the principle that NullTest with !argisrow
is fully equivalent to SQL's IS [NOT] DISTINCT FROM NULL, let's teach
the parser about it. This produces a slightly more compact parse tree
and is much more amenable to optimization than a DistinctExpr, since
the planner knows a good deal about NullTest and next to nothing about
DistinctExpr.
I'm not sure that there are all that many queries in the wild that could
be improved by this, but at least one source of such cases is the patch
just made to postgres_fdw to emit IS [NOT] DISTINCT FROM NULL when
IS [NOT] NULL isn't semantically correct.
No back-patch, since to the extent that this does affect planning results,
it might be considered undesirable plan destabilization.
|
|
|
|
|
|
|
| |
There's no point in consulting retval->paramMask; it's always NULL.
Instead, we should consult from->paramMask.
Reported by Andrew Gierth.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
|
|
|
|
| |
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f introduced new "upper"
RelOptInfo structures but didn't set consider_parallel for them
correctly, a point I completely missed when reviewing it. Later,
commit e06a38965b3bcdaa881e7e06892d4d8ab6c2c980 made the situation
worse by doing it incorrectly for the grouping relation. Try to
straighten all of that out. Along the way, get rid of the annoying
wholePlanParallelSafe flag, which was only necessarily because of
the fact that upper planning stages didn't use paths at the time
that code was written.
The most important immediate impact of these changes is that
force_parallel_mode will provide useful test coverage in quite a few
more scenarios than it did previously, but it's also necessary
preparation for fixing some problems related to subqueries.
Patch by me, reviewed by Tom Lane.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, these commands always planned the given query and went through
executor startup before deciding not to actually run the query if WITH NO
DATA is specified. This behavior is problematic for pg_dump because it
may cause errors to be raised that we would rather not see before a
REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907
from Marian Krucina. This change is not sufficient to fix that particular
bug, because we also need to tweak pg_dump to issue the REFRESH later,
but it's a necessary step on the way.
A user-visible side effect of doing things this way is that the returned
command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW"
or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior
but it would take more code, and arguably that was just an implementation
artifact not intended behavior anyhow.
In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which
was trouble waiting to happen; there is not any prohibition on nested
CREATE commands.
Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced.
Michael Paquier and Tom Lane
Report: <20160202161407.2778.24659@wrigleys.postgresql.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original coding had three separate booleans representing partial
aggregation behavior, which was confusing, unreadable, and error-prone,
not least because the booleans weren't always listed in the same order.
It was also inadequate for the allegedly-desirable future extension to
support intermediate partial aggregation, because we'd need separate
markers for serialization and deserialization in such a case.
Merge these bools into an enum "AggSplit" to provide symbolic names for
the supported operating modes (and document what those are). By assigning
the values of the enum constants carefully, we can treat AggSplit values
as options bitmasks so that tests of what to do aren't noticeably more
expensive than before.
While at it, get rid of Aggref.aggoutputtype. That's not needed since
commit 59a3795c2 got rid of setrefs.c's special-purpose Aggref comparison
code, and it likewise seemed more confusing than helpful.
Assorted comment cleanup as well (there's still more that I want to do
in that line).
catversion bump for change in Aggref node contents. Should be the last
one for partial-aggregation changes.
Discussion: <29309.1466699160@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original upper-planner-pathification design (commit 3fc6e2d7f5b652b4)
assumed that we could always determine during Path formation whether or not
we would need a Result plan node to perform projection of a targetlist.
That turns out not to work very well, though, because createplan.c still
has some responsibilities for choosing the specific target list associated
with sorting/grouping nodes (in particular it might choose to add resjunk
columns for sorting). We might not ever refactor that --- doing so would
push more work into Path formation, which isn't attractive --- and we
certainly won't do so for 9.6. So, while create_projection_path and
apply_projection_to_path can tell for sure what will happen if the subpath
is projection-capable, they can't tell for sure when it isn't. This is at
least a latent bug in apply_projection_to_path, which might think it can
apply a target to a non-projecting node when the node will end up computing
something different.
Also, I'd tied the creation of a ProjectionPath node to whether or not a
Result is needed, but it turns out that we sometimes need a ProjectionPath
node anyway to avoid modifying a possibly-shared subpath node. Callers had
to use create_projection_path for such cases, and we added code to them
that knew about the potential omission of a Result node and attempted to
adjust the cost estimates for that. That was uncertainly correct and
definitely ugly/unmaintainable.
To fix, have create_projection_path explicitly check whether a Result
is needed and adjust its cost estimate accordingly, though it creates
a ProjectionPath in either case. apply_projection_to_path is now mostly
just an optimized version that can avoid creating an extra Path node when
the input is known to not be shared with any other live path. (There
is one case that create_projection_path doesn't handle, which is pushing
parallel-safe expressions below a Gather node. We could make it do that
by duplicating the GatherPath, but there seems no need as yet.)
create_projection_plan still has to recheck the tlist-match condition,
which means that if the matching situation does get changed by createplan.c
then we'll have made a slightly incorrect cost estimate. But there seems
no help for that in the near term, and I doubt it occurs often enough,
let alone would change planning decisions often enough, to be worth
stressing about.
I added a "dummypp" field to ProjectionPath to track whether
create_projection_path thinks a Result is needed. This is not really
necessary as-committed because create_projection_plan doesn't look at the
flag; but it seems like a good idea to remember what we thought when
forming the cost estimate, if only for debugging purposes.
In passing, get rid of the target_parallel parameter added to
apply_projection_to_path by commit 54f5c5150. I don't think that's a good
idea because it involves callers in what should be an internal decision,
and opens us up to missing optimization opportunities if callers think they
don't need to provide a valid flag, as most don't. For the moment, this
just costs us an extra has_parallel_hazard call when planning a Gather.
If that starts to look expensive, I think a better solution would be to
teach PathTarget to carry/cache knowledge of parallel-safety of its
contents.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides a new implementation of the logic added by commit
137805f89 and later removed by 77ba61080. It differs from the original
primarily in expending much less effort per joinrel in large queries,
which it accomplishes by doing most of the matching work once per query not
once per joinrel. Hopefully, it's also less buggy and better commented.
The never-documented enable_fkey_estimates GUC remains gone.
There remains work to be done to make the selectivity estimates account
for nulls in FK referencing columns; but that was true of the original
patch as well. We may be able to address this point later in beta.
In the meantime, any error should be in the direction of overestimating
rather than underestimating joinrel sizes, which seems like the direction
we want to err in.
Tomas Vondra and Tom Lane
Discussion: <31041.1465069446@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing partial aggregation, the args list of the upper (combining)
Aggref node is replaced by a Var representing the output of the partial
aggregation steps, which has either the aggregate's transition data type
or a serialized representation of that. However, nodeAgg.c blindly
continued to use the args list as an indication of the user-level argument
types. This broke resolution of polymorphic transition datatypes at
executor startup (though it accidentally failed to fail for the ANYARRAY
case, which is likely the only one anyone had tested). Moreover, the
constructed FuncExpr passed to the finalfunc contained completely wrong
information, which would have led to bogus answers or crashes for any case
where the finalfunc examined that information (which is only likely to be
with polymorphic aggregates using a non-polymorphic transition type).
As an independent bug, apply_partialaggref_adjustment neglected to resolve
a polymorphic transition datatype before assigning it as the output type
of the lower-level Aggref node. This again accidentally failed to fail
for ANYARRAY but would be unlikely to work in other cases.
To fix the first problem, record the user-level argument types in a
separate OID-list field of Aggref, and look to that rather than the args
list when asking what the argument types were. (It turns out to be
convenient to include any "direct" arguments in this list too, although
those are not currently subject to being overwritten.)
Rather than adding yet another resolve_aggregate_transtype() call to fix
the second problem, add an aggtranstype field to Aggref, and store the
resolved transition type OID there when the planner first computes it.
(By doing this in the planner and not the parser, we can allow the
aggregate's transition type to change from time to time, although no DDL
support yet exists for that.) This saves nothing of consequence for
simple non-polymorphic aggregates, but for polymorphic transition types
we save a catalog lookup during executor startup as well as several
planner lookups that are new in 9.6 due to parallel query planning.
In passing, fix an error that was introduced into count_agg_clauses_walker
some time ago: it was applying exprTypmod() to something that wasn't an
expression node at all, but a TargetEntry. exprTypmod silently returned
-1 so that there was not an obvious failure, but this broke the intended
sensitivity of aggregate space consumption estimates to the typmod of
varchar and similar data types. This part needs to be back-patched.
Catversion bump due to change of stored Aggref nodes.
Discussion: <8229.1466109074@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
| |
Commit b12fd41c6 added a "reltarget_has_non_vars" field to RelOptInfo,
but failed to maintain it accurately. Since its only purpose was to skip
calls to has_parallel_hazard() in the simple case where a rel's targetlist
is all Vars, and that call is really pretty cheap in that case anyway, it
seems like this is just a case of premature optimization. Let's drop the
flag and do the calls unconditionally until it's proven that we need more
smarts here.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted by Andres Freund, we'd accumulated quite a few similar functions
in clauses.c that examine all functions in an expression tree to see if
they satisfy some boolean test. Reduce the duplication by inventing a
function check_functions_in_node() that applies a simple callback function
to each SQL function OID appearing in a given expression node. This also
fixes some arguable oversights; for example, contain_mutable_functions()
did not check aggregate or window functions for mutability. I doubt that
that represents a live bug at the moment, because we don't really consider
mutability for aggregates; but it might someday be one.
I chose to put check_functions_in_node() in nodeFuncs.c because it seemed
like other modules might wish to use it in future. That in turn forced
moving set_opfuncid() et al into nodeFuncs.c, as the alternative was for
nodeFuncs.c to depend on optimizer/setrefs.c which didn't seem very clean.
In passing, teach contain_leaked_vars_walker() about a few more expression
node types it can safely look through, and improve the rather messy and
undercommented code in has_parallel_hazard_walker().
Discussion: <20160527185853.ziol2os2zskahl7v@alap3.anarazel.de>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Such paths are unsafe. To make it cheaper to detect when this case
applies, track whether a relation's default PathTarget contains any
non-Vars. In most cases, the answer will be no, which enables us to
determine cheaply that the target list for a proposed path is
parallel-safe. However, subquery pull-up can create cases that
require us to inspect the target list more carefully.
Amit Kapila, reviewed by me.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This terminology provoked widespread complaints. So, instead, rename
the GUC max_parallel_degree to max_parallel_workers_per_gather
(leaving room for a possible future GUC max_parallel_workers that acts
as a system-wide limit), and rename the parallel_degree reloption to
parallel_workers. Rename structure members to match.
These changes create a dump/restore hazard for users of PostgreSQL
9.6beta1 who have set the reloption (or applied the GUC using ALTER
USER or ALTER DATABASE).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit reverts 137805f89 as well as the associated commits 015e88942,
5306df283, and 68d704edb. We found multiple bugs in this feature, and
there was concern about possible planner slowdown (though to be fair,
exhibiting a very large slowdown proved difficult). The way forward
requires a considerable rewrite, which may or may not be possible to
accomplish in time for beta2. In my judgment reviewing the rewrite will
be easier to accomplish starting from a clean slate, so let's temporarily
revert what's there now. This also leaves us in a safe state if it turns
out to be necessary to postpone the rewrite to the next development cycle.
Discussion: <20160429102531.GA13701@huehner.biz>
|
|
|
|
|
|
|
|
|
|
| |
Mostly these are just comments but there are a few in documentation
and a handful in code and tests. Hopefully this doesn't cause too much
unnecessary pain for backpatching. I relented from some of the most
common like "thru" for that reason. The rest don't seem numerous
enough to cause problems.
Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
|
|
|
|
| |
This is cosmetic, and no supported release has the affected fields.
|
|
|
|
|
|
|
|
|
| |
Needed for cases in which INSERT ... ON CONFLICT appears inside a
recursive CTE item. Per bug #14153 from Thomas Alton.
Patch by Peter Geoghegan, slightly adjusted by me
Report: <20160521232802.22598.13537@wrigleys.postgresql.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If RAW_EXPRESSION_COVERAGE_TEST is defined, do a no-op tree walk over
every basic DML statement submitted to parse analysis. If we'd had this
in place earlier, bug #14153 would have been caught by buildfarm testing.
The difficulty is that raw_expression_tree_walker() is only used in
limited cases involving CTEs (particularly recursive ones), so it's
very easy for an oversight in it to not be noticed during testing of a
seemingly-unrelated feature.
The type of error we can expect to catch with this is complete omission
of a node type from raw_expression_tree_walker(), and perhaps also
recursion into a field that doesn't contain a node tree, though that
would be an unlikely mistake. It won't catch failure to add new fields
that need to be recursed into, unfortunately.
I'll go enable this on one or two of my own buildfarm animals once
bug #14153 is dealt with.
Discussion: <27861.1464040417@sss.pgh.pa.us>
|
|
|
|
| |
If you were wondering what editor I use, now you know.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In nodeFuncs.c, pgindent wants to introduce spurious indentation into
the definitions of planstate_tree_walker and planstate_walk_subplans.
Fix that by spreading the definition out across several lines, similar
to what is already done for other walker functions in that file.
In execParallel.c, in the definition of SharedExecutorInstrumentation,
pgindent wants to insert more whitespace between the type name and the
member name. That causes it to mangle comments later on the line. Fix
by moving the comments out of line. Now that we have a bit more room,
add some more details that may be useful to the next person reading
this code.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way that PartialAggregate and FinalizeAggregate plan nodes were
displaying output columns before was bogus. Now, FinalizeAggregate
produces the same outputs as an Aggregate would have produced, while
PartialAggregate produces each of those outputs prefixed by the word
PARTIAL.
Discussion: 12585.1460737650@sss.pgh.pa.us
Patch by me, reviewed by David Rowley.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously bcac23d exposed a subset of support functions, namely the
ones Kaigai found useful. In
20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de I mentioned that
there's some functions missing to use the facility in an external
project.
To avoid having to add functions piecemeal, add all the functions which
are used to define READ_* and WRITE_* macros; users of the extensible
node functionality are likely to need these. Additionally expose
outDatum(), which doesn't have it's own WRITE_ macro, as it needs
information from the embedding struct.
Discussion: 20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de
|
|
|
|
|
|
| |
It's not ready yet, revert two commits
690c543550b0d2852060c18d270cdb534d339d9a - unstable test output
386e3d7609c49505e079c40c65919d99feb82505 - patch itself
|
|
|
|
|
|
|
|
|
|
| |
Now indexes (but only B-tree for now) can contain "extra" column(s) which
doesn't participate in index structure, they are just stored in leaf
tuples. It allows to use index only scan by using single index instead
of two or more indexes.
Author: Anastasia Lubennikova with minor editorializing by me
Reviewers: David Rowley, Peter Geoghegan, Jeff Janes
|
|
|
|
|
|
|
| |
Fastpath ignores this if no triggers defined.
Author: Tomas Vondra, with fastpath and comments added by me
Reviewers: David Rowley, Simon Riggs
|
|
|
|
|
| |
Getting annoyed at the amount of unrelated chatter I get from pgindent'ing
Rowley's unique-joins patch. Re-indent all the files it touches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a new dependency type which marks an object as depending
on an extension, such that if the extension is dropped, the object
automatically goes away; and also, if the database is dumped, the object
is included in the dump output. Currently the grammar supports this for
indexes, triggers, materialized views and functions only, although the
utility code is generic so adding support for more object types is a
matter of touching the parser rules only.
Author: Abhijit Menon-Sen
Reviewed-by: Alexander Korotkov, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the planner would reject an index-only scan if any restriction
clause for its table used a column not available from the index, even
if that restriction clause would later be dropped from the plan entirely
because it's implied by the index's predicate. This is a fairly common
situation for partial indexes because predicates using columns not included
in the index are often the most useful kind of predicate, and we have to
duplicate (or at least imply) the predicate in the WHERE clause in order
to get the index to be considered at all. So index-only scans were
essentially unavailable with such partial indexes.
To fix, we have to do detection of implied-by-predicate clauses much
earlier in the planner. This patch puts it in check_index_predicates
(nee check_partial_indexes), meaning it gets done for every partial index,
whereas we previously only considered this issue at createplan time,
so that the work was only done for an index actually selected for use.
That could result in a noticeable planning slowdown for queries against
tables with many partial indexes. However, testing suggested that there
isn't really a significant cost, especially not with reasonable numbers
of partial indexes. We do get a small additional benefit, which is that
cost_index is more accurate since it correctly discounts the evaluation
cost of clauses that will be removed. We can also avoid considering such
clauses as potential indexquals, which saves useless matching cycles in
the case where the predicate columns aren't in the index, and prevents
generating bogus plans that double-count the clause's selectivity when
the columns are in the index.
Tomas Vondra and Kyotaro Horiguchi, reviewed by Kevin Grittner and
Konstantin Knizhnik, and whacked around a little by me
|
|
|
|
|
|
|
|
|
| |
This is necessary infrastructure for supporting parallel aggregation
for aggregates whose transition type is "internal". Such values
can't be passed between cooperating processes, because they are
just pointers.
David Rowley, reviewed by Tomas Vondra and by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per discussion, the new extensible node framework is thought to be
better designed than the custom path/scan/scanstate stuff we added
in PostgreSQL 9.5. Rework the latter to be more like the former.
This is not backward-compatible, but we generally don't promise that
for C APIs, and there probably aren't many people using this yet
anyway.
KaiGai Kohei, reviewed by Petr Jelinek and me. Some further
cosmetic changes by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables external code to create access methods. This is useful so
that extensions can add their own access methods which can be formally
tracked for dependencies, so that DROP operates correctly. Also, having
explicit support makes pg_dump work correctly.
Currently only index AMs are supported, but we expect different types to
be added in the future.
Authors: Alexander Korotkov, Petr Jelínek
Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby
Commitfest-URL: https://commitfest.postgresql.org/9/353/
Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Parallel workers can now partially aggregate the data and pass the
transition values back to the leader, which can combine the partial
results to produce the final answer.
David Rowley, based on earlier work by Haribabu Kommi. Reviewed by
Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.
|
|
|
|
|
| |
Blatant fail in 0bf3ae88af330496517722e391e7c975e6bad219.
Caught by buildfarm member mandrill.
|
|
|
|
|
|
|
|
|
| |
postgres_fdw can now sent an UPDATE or DELETE statement directly to
the foreign server in simple cases, rather than sending a SELECT FOR
UPDATE statement and then updating or deleting rows one-by-one.
Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro
Horiguchi, Albe Laurenz, Thom Brown, and me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 19a541143a09c067 I did not make PathTarget a subtype of Node,
and embedded a RelOptInfo's reltarget directly into it rather than having
a separately-allocated Node. In hindsight that was misguided
micro-optimization, enabled by the fact that at that point we didn't have
any Paths with custom PathTargets. Now that PathTarget processing has
been fleshed out some more, it's easier to see that it's better to have
PathTarget as an indepedent Node type, even if it does cost us one more
palloc to create a RelOptInfo. So change it while we still can.
This commit just changes the representation, without doing anything more
interesting than that.
|
|
|
|
|
|
|
|
|
|
| |
The old code is bad for two reasons. First, it has an off-by-one
error. Second, it won't help if you aren't running with assertions
enabled. Per discussion, we want a check here in that case too.
Author: KaiGai Kohei, adjusted by me.
Reviewed-by: Petr Jelinek
Discussion: 56E0D547.1030101@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of having planner.c compute a groupColIdx array and store it in
GroupingSetsPaths, make create_groupingsets_plan() find the grouping
columns by searching in the child plan node's tlist. Although that's
probably a bit slower for create_groupingsets_plan(), it's more like
the way every other plan node type does this, and it provides positive
confirmation that we know which child output columns we're supposed to be
grouping on. (Indeed, looking at this now, I'm not at all sure that it
wasn't broken before, because create_groupingsets_plan() isn't demanding
an exact tlist match from its child node.) Also, this allows substantial
simplification in planner.c, because it no longer needs to compute the
groupColIdx array at all; no other cases were using it.
I'd intended to put off this refactoring until later (like 9.7), but
in view of the likely bug fix and the need to rationalize planner.c's
tlist handling so we can do something sane with Konstantin Knizhnik's
function-evaluation-postponement patch, I think it can't wait.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
|
|
|
|
|
|
|
| |
I believe that I (rhaas) introduced this bug while editing the patch
that became bcac23de73b89b001fbc628d84471a392e928d1c.
Report and patch from KaiGai Kohei.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up to now, there's been an assumption that all Paths for a given relation
compute the same output column set (targetlist). However, there are good
reasons to remove that assumption. For example, an indexscan on an
expression index might be able to return the value of an expensive function
"for free". While we have the ability to generate such a plan today in
simple cases, we don't have a way to model that it's cheaper than a plan
that computes the function from scratch, nor a way to create such a plan
in join cases (where the function computation would normally happen at
the topmost join node). Also, we need this so that we can have Paths
representing post-scan/join steps, where the targetlist may well change
from one step to the next. Therefore, invent a "struct PathTarget"
representing the columns we expect a plan step to emit. It's convenient
to include the output tuple width and tlist evaluation cost in this struct,
and there will likely be additional fields in future.
While Path nodes that actually do have custom outputs will need their own
PathTargets, it will still be true that most Paths for a given relation
will compute the same tlist. To reduce the overhead added by this patch,
keep a "default PathTarget" in RelOptInfo, and allow Paths that compute
that column set to just point to their parent RelOptInfo's reltarget.
(In the patch as committed, actually every Path is like that, since we
do not yet have any cases of custom PathTargets.)
I took this opportunity to provide some more-honest costing of
PlaceHolderVar evaluation. Up to now, the assumption that "scan/join
reltargetlists have cost zero" was applied not only to Vars, where it's
reasonable, but also PlaceHolderVars where it isn't. Now, we add the eval
cost of a PlaceHolderVar's expression to the first plan level where it can
be computed, by including it in the PathTarget cost field and adding that
to the cost estimates for Paths. This isn't perfect yet but it's much
better than before, and there is a way forward to improve it more. This
costing change affects the join order chosen for a couple of the regression
tests, changing expected row ordering.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An extensible node is always tagged T_Extensible, but the extnodename
field identifies it more specifically; it may also include arbitrary
private data. Extensible nodes can be copied, tested for equality,
serialized, and deserialized, but the core system doesn't know
anything about them otherwise. Some extensions may find it useful to
include these nodes in fdw_private or custom_private lists in lieu of
arm-wrestling their data into a format that the core code can
understand.
Along the way, so as not to burden the authors of such extensible
node types too much, expose the functions for writing serialized
tokens, and for serializing and deserializing bitmapsets.
KaiGai Kohei, per a design suggested by me. Reviewed by Andres Freund
and by me, and further edited by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When force_parallel_mode = true, we enable the parallel mode restrictions
for all queries for which this is believed to be safe. For the subset of
those queries believed to be safe to run entirely within a worker, we spin
up a worker and run the query there instead of running it in the
original process. When force_parallel_mode = regress, make additional
changes to allow the regression tests to run cleanly even though parallel
workers have been injected under the hood.
Taken together, this facilitates both better user testing and better
regression testing of the parallelism code.
Robert Haas, with help from Amit Kapila and Rushabh Lathia.
|
|
|
|
|
|
|
|
|
|
| |
You can't really do anything useful with this in the form it currently
exists; among other problems, there's no way to reread whatever
information might be produced when the path is output. Work is
underway to replace this with a more useful and more general system of
extensible nodes, but let's start by getting rid of this bit.
Extracted from a larger patch by KaiGai Kohei.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the foreign join pushdown infrastructure left the question
of security entirely up to individual FDWs, but it would be easy for
a foreign data wrapper to inadvertently open up subtle security holes
that way. So, make it the core code's job to determine which user
mapping OID is relevant, and don't attempt join pushdown unless it's
the same for all relevant relations.
Per a suggestion from Tom Lane. Shigeru Hanada and Ashutosh Bapat,
reviewed by Etsuro Fujita and KaiGai Kohei, with some further
changes by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Putting a reference to an expanded-format value into a Const node would be
a bad idea for a couple of reasons. It'd be possible for the supposedly
immutable Const to change value, if something modified the referenced
variable ... in fact, if the Const's reference were R/W, any function that
has the Const as argument might itself change it at runtime. Also, because
datumIsEqual() is pretty simplistic, the Const might fail to compare equal
to other Consts that it should compare equal to, notably including copies
of itself. This could lead to unexpected planner behavior, such as "could
not find pathkey item to sort" errors or inferior plans.
I have not been able to find any way to get an expanded value into a Const
within the existing core code; but Paul Ramsey was able to trigger the
problem by writing a datatype input function that returns an expanded
value.
The best fix seems to be to establish a rule that varlena values being
placed into Const nodes should be passed through pg_detoast_datum().
That will do nothing (and cost little) in normal cases, but it will flatten
expanded values and thereby avoid the above problems. Also, it will
convert short-header or compressed values into canonical format, which will
avoid possible unexpected lack-of-equality issues for those cases too.
And it provides a last-ditch defense against putting a toasted value into
a Const, which we already knew was dangerous, cf commit 2b0c86b66563cf2f.
(In the light of this discussion, I'm no longer sure that that commit
provided 100% protection against such cases, but this fix should do it.)
The test added in commit 65c3d05e18e7c530 to catch datatype input functions
with unstable results would fail for functions that returned expanded
values; but it seems a bit uncharitable to deem a result unstable just
because it's expressed in expanded form, so revise the coding so that we
check for bitwise equality only after applying pg_detoast_datum(). That's
a sufficient condition anyway given the new rule about detoasting when
forming a Const.
Back-patch to 9.5 where the expanded-object facility was added. It's
possible that this should go back further; but in the absence of clear
evidence that there's any live bug in older branches, I'll refrain for now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The core innovation of this patch is the introduction of the concept
of a partial path; that is, a path which if executed in parallel will
generate a subset of the output rows in each process. Gathering a
partial path produces an ordinary (complete) path. This allows us to
generate paths for parallel joins by joining a partial path for one
side (which at the baserel level is currently always a Partial Seq
Scan) to an ordinary path on the other side. This is subject to
various restrictions at present, especially that this strategy seems
unlikely to be sensible for merge joins, so only nested loops and
hash joins paths are generated.
This also allows an Append node to be pushed below a Gather node in
the case of a partitioned table.
Testing revealed that early versions of this patch made poor decisions
in some cases, which turned out to be caused by the fact that the
original cost model for Parallel Seq Scan wasn't very good. So this
patch tries to make some modest improvements in that area.
There is much more to be done in the area of generating good parallel
plans in all cases, but this seems like a useful step forward.
Patch by me, reviewed by Dilip Kumar and Amit Kapila.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Aggregate nodes now have two new modes: a "partial" mode where they
output the unfinalized transition state, and a "finalize" mode where
they accept unfinalized transition states rather than individual
values as input.
These new modes are not used anywhere yet, but they will be necessary
for parallel aggregation. The infrastructure also figures to be
useful for cases where we want to aggregate local data and remote
data via the FDW interface, and want to bring back partial aggregates
from the remote side that can then be combined with locally generated
partial aggregates to produce the final value. It may also be useful
even when neither FDWs nor parallelism are in play, as explained in
the comments in nodeAgg.c.
David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
Linnakangas, Haribabu Kommi, and me.
|