| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When querying a partitioned table containing a default partition, we
were wrongly deciding to include it in the scan too early in the
process, failing to exclude it in some cases. If we reinterpret the
PruneStepResult.scan_default flag slightly, we can do a better job at
detecting that it can be excluded. The change is that we avoid setting
the flag for that pruning step unless the step absolutely requires the
default partition to be scanned (in contrast with the previous
arrangement, which was to set it unless the step was able to prune it).
So get_matching_partitions() must explicitly check the partition that
each returned bound value corresponds to in order to determine whether
the default one needs to be included, rather than relying on the flag
from the final step result.
Author: Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp>
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Discussion: https://postgr.es/m/00e601d4ca86$932b8bc0$b982a340$@lab.ntt.co.jp
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 5f32b29c1819 I changed the creation of HashState.hashkeys to
actually use HashState as the parent (instead of HashJoinState, which
was incorrect, as they were executed below HashState), to fix the
problem of hashkeys expressions otherwise relying on slot types
appropriate for HashJoinState, rather than HashState as would be
correct. That reliance was only introduced in 12, which is why it
previously worked to use HashJoinState as the parent (although I'd be
unsurprised if there were problematic cases).
Unfortunately that's not a sufficient solution, because before this
commit, the to-be-hashed expressions referenced inner/outer as
appropriate for the HashJoin, not Hash. That didn't have obvious bad
consequences, because the slots containing the tuples were put into
ecxt_innertuple when hashing a tuple for HashState (even though Hash
doesn't have an inner plan).
There are less common cases where this can cause visible problems
however (rather than just confusion when inspecting such executor
trees). E.g. "ERROR: bogus varno: 65000", when explaining queries
containing a HashJoin where the subsidiary Hash node's hash keys
reference a subplan. While normally hashkeys aren't displayed by
EXPLAIN, if one of those expressions references a subplan, that
subplan may be printed as part of the Hash node - which then failed
because an inner plan was referenced, and Hash doesn't have that.
It seems quite possible that there's other broken cases, too.
Fix the problem by properly splitting the expression for the HashJoin
and Hash nodes at plan time, and have them reference the proper
subsidiary node. While other workarounds are possible, fixing this
correctly seems easy enough. It was a pretty ugly hack to have
ExecInitHashJoin put the expression into the already initialized
HashState, in the first place.
I decided to not just split inner/outer hashkeys inside
make_hashjoin(), but also to separate out hashoperators and
hashcollations at plan time. Otherwise we would have ended up having
two very similar loops, one at plan time and the other during executor
startup. The work seems to more appropriately belong to plan time,
anyway.
Reported-By: Nikita Glukhov, Alexander Korotkov
Author: Andres Freund
Reviewed-By: Tom Lane, in an earlier version
Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com
Backpatch: 12-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These were introduced by pgindent due to fixe to broken
indentation (c.f. 8255c7a5eeba8). Previously the mis-indentation of
function prototypes was creatively used to reduce indentation in a few
places.
As that formatting only exists in master and REL_12_STABLE, it seems
better to fix it in both, rather than having some odd indentation in
v12 that somebody might copy for future patches or such.
Author: Andres Freund
Discussion: https://postgr.es/m/20190728013754.jwcbe5nfyt3533vx@alap3.anarazel.de
Backpatch: 12-
|
|
|
|
|
|
|
|
|
|
|
| |
The rd_amcache allows an index AM to cache arbitrary information in a
relcache entry. This commit moves the cleanup of rd_amcache so that it
can also be used by table AMs. Nothing takes advantage of that yet, but
I'm sure it'll come handy for anyone writing new table AMs.
Backpatch to v12, where table AM interface was introduced.
Reviewed-by: Julien Rouhaud
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When performing ANALYZE on inheritance trees, we collect two samples for
each relation - one for the relation alone, and one for the inheritance
subtree (relation and its child relations). And then we build statistics
on each sample, so for each relation we get two sets of statistics.
For regular (per-column) statistics this works fine, because the catalog
includes a flag differentiating statistics built from those two samples.
But we don't have such flag in the extended statistics catalogs, and we
ended up updating the same row twice, triggering this error:
ERROR: tuple already updated by self
The simplest solution is to disable extended statistics on inheritance
trees, which is what this commit is doing. In the future we may need to
do something similar to per-column statistics, but that requires adding a
flag to the catalog - and that's not backpatchable. Moreover, the current
selectivity estimation code only works with individual relations, so
building statistics on inheritance trees would be pointless anyway.
Author: Tomas Vondra
Backpatch-to: 10-
Discussion: https://postgr.es/m/20190618231233.GA27470@telsasoft.com
Reported-by: Justin Pryzby
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A "break" statement erroneously left behind by commit a1c1af2a1
caused TopoSort to do the wrong thing if a lock's wait list
contained multiple members of the same locking group.
Because parallel workers don't normally need any locks not already
taken by their leader, this is very hard --- maybe impossible ---
to hit in production. Still, if it did happen, the queries involved
in an otherwise-resolvable deadlock would block until canceled.
In addition to removing the bogus "break", add an Assert showing
that the conflicting uses of the beforeConstraints[] array (for both
counts and flags) don't overlap, and add some commentary explaining
why not; because it's not obvious without explanation, IMHO.
Original report and patch from Rui Hai Jiang; additional assert
and commentary by me. Back-patch to 9.6 where the bug came in.
Discussion: https://postgr.es/m/CAEri+mLd3bpHLyW+a9pSe1y=aEkeuJpwBSwvo-+m4n7-ceRmXw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When copying the definition of an index rebuilt concurrently for the new
entry, the index information was taken directly from the old index using
the relation cache. In this case, predicates and expressions have
some post-processing to prepare things for the planner, which loses some
information including the collations added in any of them.
This inconsistency can cause issues when attempting for example a table
rewrite, and makes the new indexes rebuilt concurrently inconsistent
with the old entries.
In order to fix the problem, fetch expressions and predicates directly
from the catalog of the old entry, and fill in IndexInfo for the new
index with that. This makes the process more consistent with
DefineIndex(), and the code is refactored with the addition of a routine
to create an IndexInfo node.
Reported-by: Manuel Rigger
Author: Michael Paquier
Discussion: https://postgr.es/m/CA+u7OA5Hp0ra235F3czPom_FyAd-3+XwSJmX95r1+sRPOJc9VQ@mail.gmail.com
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
| |
Early previews of LLVM 9 reveal that our Min() macro causes compiler
errors in LLVM headers reached by the #include directives in
llvmjit_inline.cpp. Let's just undefine it. Per buildfarm animal
seawasp. Back-patch to 11.
Reviewed-by: Fabien Coelho, Tom Lane
Discussion: https://postgr.es/m/20190606173216.GA6306%40alvherre.pgsql
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pg_timezone_names() tries to avoid showing the "Factory" zone in
the view, mainly because that has traditionally had a very long
"abbreviation" such as "Local time zone must be set--see zic manual page",
so that showing it messes up psql's formatting of the whole view.
Since tzdb version 2016g, IANA instead uses the abbreviation "-00",
which is sane enough that there's no reason to discriminate against it.
On the other hand, it emerges that FreeBSD and possibly other packagers
are so wedded to backwards compatibility that they hack the IANA data
to keep the old spelling --- and not just that old spelling, but even
older spellings that IANA used back in the stone age. This caused the
filter logic to fail to suppress "Factory" at all on such platforms,
though the formatting problem is definitely real in that case.
To solve both problems, get rid of the hard-wired assumption about
exactly what Factory's abbreviation is, and instead reject abbreviations
exceeding 31 characters. This will allow Factory to appear in the view
if and only if it's using the modern abbreviation.
In passing, simplify the code we add to zic.c to support "zic -P"
to remove its now-obsolete hacks to not print the Factory zone's
abbreviation. Unlike pg_timezone_names(), there's no reason for
that code to support old/nonstandard timezone data.
Since we generally prefer to keep timezone-related behavior the
same in all branches, and since this is arguably a bug fix,
back-patch to all supported branches.
Discussion: https://postgr.es/m/3961.1564086915@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Money values exceeding about 18 digits (depending on lc_monetary)
could be inaccurately converted to numeric, due to select_div_scale()
deciding it didn't need to compute any fractional digits. Force
its hand by setting the dscale of one division input to equal the
number of fractional digits we need.
In passing, rearrange the logic to not do useless work in locales
where money values are considered integral.
Per bug #15925 from Slawomir Chodnicki. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/15925-da9953e2674bb5c8@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 15d8f8312 we assert that - and since 7ef04e4d2cb2, 4da597edf1
rely on - the slot type for an expression's
ecxt_{outer,inner,scan}tuple not changing, unless explicitly flagged
as such. That allows to either skip deforming (for a virtual tuple
slot) or optimize the code for JIT accelerated deforming
appropriately (for other known slot types).
This assumption was sometimes violated for grouping sets, when
nodeAgg.c internally uses tuplesorts, and the child node doesn't
return a TTSOpsMinimalTuple type slot. Detect that case, and flag that
the outer slot might not be "fixed".
It's probably worthwhile to optimize this further in the future, and
more granularly determine whether the slot is fixed. As we already
instantiate per-phase transition and equal expressions, we could
cheaply set the slot type appropriately for each phase. But that's a
separate change from this bugfix.
This commit does include a very minor optimization by avoiding to
create a slot for handling tuplesorts, if no such sorts are
performed. Previously we created that slot unnecessarily in the common
case of computing all grouping sets via hashing. The code looked too
confusing without that, as the conditions for needing a sort slot and
flagging that the slot type isn't fixed, are the same.
Reported-By: Ashutosh Sharma
Author: Andres Freund
Discussion: https://postgr.es/m/CAE9k0PmNaMD2oHTEAhRyxnxpaDaYkuBYkLa1dpOpn=RS0iS2AQ@mail.gmail.com
Backpatch: 12-, where the bug was introduced in 15d8f8312
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libpq failed to ignore Windows-style newlines in connection service files.
This normally wasn't a problem on Windows itself, because fgets() would
convert \r\n to just \n. But if libpq were running inside a program that
changes the default fopen mode to binary, it would see the \r's and think
they were data. In any case, it's project policy to ignore \r in text
files unconditionally, because people sometimes try to use files with
DOS-style newlines on Unix machines, where the C library won't hide that
from us.
Hence, adjust parseServiceFile() to ignore \r as well as \n at the end of
the line. In HEAD, go a little further and make it ignore all trailing
whitespace, to match what it's always done with leading whitespace.
In HEAD, also run around and fix up everyplace where we have
newline-chomping code to make all those places look consistent and
uniformly drop \r. It is not clear whether any of those changes are
fixing live bugs. Most of the non-cosmetic changes are in places that
are reading popen output, and the jury is still out as to whether popen
on Windows can return \r\n. (The Windows-specific code in pipe_read_line
seems to think so, but our lack of support for this elsewhere suggests
maybe it's not a problem in practice.) Hence, I desisted from applying
those changes to back branches, except in run_ssl_passphrase_command()
which is new enough and little-tested enough that we'd probably not have
heard about any problems there.
Tom Lane and Michael Paquier, per bug #15827 from Jorge Gustavo Rocha.
Back-patch the parseServiceFile() change to all supported branches,
and the run_ssl_passphrase_command() change to v11 where that was added.
Discussion: https://postgr.es/m/15827-e6ba53a3a7ed543c@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After 277cb789836 ON CONFLICT ... SET ... RETURNING failed with
ERROR: virtual tuple table slot does not have system attributes
when taking the update path, as the slot used to insert into the
table (and then process RETURNING) was defined to be a virtual slot in
that commit. Virtual slots don't support system columns except for
tableoid and ctid, as the other system columns are AM dependent.
Fix that by using a slot of the table's type. Add tests for system
column accesses in ON CONFLICT ... RETURNING.
Reported-By: Roby, bisected to the relevant commit by Jeff Janes
Author: Andres Freund
Discussion: https://postgr.es/m/73436355-6432-49B1-92ED-1FE4F7E7E100@finefun.com.au
Backpatch: 12-, where the bug was introduced in 277cb789836
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, after a deleted page gets even older, it becomes unrecyclable
again. B-tree has the same problem, and has had since time immemorial,
but let's at least fix this in GiST, where this is new.
Backpatch to v12, where GiST page deletion was introduced.
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
|
|
|
|
|
|
|
|
|
|
|
|
| |
The explicit check in gistScanPage() isn't currently really necessary, as
a deleted page is always empty, so the loop would fall through without
doing anything, anyway. But it's a marginal optimization, and it gives a
nice place to attach a comment to explain how it works.
Backpatch to v12, where GiST page deletion was introduced.
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the user creates a deferred constraint in a partition, and in a
transaction they cause the constraint's trigger execution to be deferred
until commit time *and* drop the constraint, then when commit time comes
the queued trigger will fail to run because the trigger object will have
been dropped.
This is explained because when a constraint gets dropped in a
partitioned table, the recursion to drop the ones in partitions is done
by the dependency mechanism, not by ALTER TABLE traversing the recursion
tree as in all other cases. In the non-partitioned case, this problem
is avoided by checking that the table is not "in use" by alter-table;
other alter-table subcommands that recurse to partitions do that check
for each partition. But the dependency mechanism doesn't have a way to
do that. Fix the problem by applying the same check to all partitions
during ALTER TABLE's "prep" phase, which correctly raises the necessary
error.
Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Discussion: https://postgr.es/m/CAKcux6nZiO9-eEpr1ZD84bT1mBoVmeZkfont8iSpcmYrjhGWgA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic in ATExecDropColumn that rejects dropping partition key
columns is quite an inadequate defense, because it doesn't execute
in cases where a column needs to be dropped due to cascade from
something that only the column, not the whole partitioned table,
depends on. That leaves us with a badly broken partitioned table;
even an attempt to load its relcache entry will fail.
We really need to have explicit pg_depend entries that show that the
column can't be dropped without dropping the whole table. Hence,
add those entries. In v12 and HEAD, bump catversion to ensure that
partitioned tables will have such entries. We can't do that in
released branches of course, so in v10 and v11 this patch affords
protection only to partitioned tables created after the patch is
installed. Given the lack of field complaints (this bug was found
by fuzz-testing not by end users), that's probably good enough.
In passing, fix ATExecDropColumn and ATPrepAlterColumnType
messages to be more specific about which partition key column
they're complaining about.
Per report from Manuel Rigger. Back-patch to v10 where partitioned
tables were added.
Discussion: https://postgr.es/m/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com
Discussion: https://postgr.es/m/31920.1562526703@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current extended statistics code was a bit confused which collation
to use. When building the statistics, the collations defined as default
for the data types were used (since commit 5e0928005). The MCV code was
however using the column collations for MCV serialization, and then
DEFAULT_COLLATION_OID when computing estimates. So overall the code was
using all three possible options, inconsistently.
This uses the column colation everywhere - this makes it consistent with
what 5e0928005 did for regular stats. We however do not track the
collations in a catalog, because we can derive them from column-level
information. This may need to change in the future, e.g. after allowing
statistics on expressions.
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The examine_opclause_expression function needs to return information on
which side of the operator we found the Var, but the variable was called
"isgt" which is rather misleading (it assumes the operator is either
less-than or greater-than, but it may be equality or something else).
Other places in the planner use a variable called "varonleft" for this
purpose, so just adopt the same convention here.
The code also assumed we don't care about this flag for equality, as
(Var = Const) and (Const = Var) should be the same thing. But that does
not work for cross-type operators, in which case we need to pass the
parameters to the procedure in the right order. So just use the same
code for all types of expressions.
This means we don't need to care about the selectivity estimation
function anymore, at least not in this code. We should only get the
supported cases here (thanks to statext_is_compatible_clause).
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
|
|
|
|
|
|
|
| |
I was careless passing a datum directly to DATE_NOT_FINITE without
calling DatumGetDateADT() first.
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values 'infinity' and '-infinity' are a part of the DATE type
itself, so a bound of the date 'infinity' is not the same as an
unbounded/infinite range. However, it is still wrong to try to
canonicalize such values, because adding or subtracting one has no
effect. Fix by treating 'infinity' and '-infinity' the same as
unbounded ranges for the purposes of canonicalization (but not other
purposes).
Backpatch to all versions because it is inconsistent with the
documented behavior. Note that this could be an incompatibility for
applications relying on the behavior contrary to the documentation.
Author: Laurenz Albe
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/77f24ea19ab802bc9bc60ddbb8977ee2d646aec1.camel%40cybertec.at
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 857f9c36cda, which taught nbtree VACUUM to avoid unnecessary
index scans, bumped the nbtree version number from 2 to 3, while adding
the ability for nbtree indexes to be upgraded on-the-fly. Various
assertions that assumed that an nbtree index was always on version 2 had
to be changed to accept any supported version (version 2 or 3 on
Postgres 11).
However, a few assertions were missed in the initial commit, all of
which were in code paths that cache a local copy of the metapage
metadata, where the index had been expected to be on the current version
(no longer version 2) as a generic sanity check. Rather than simply
update the assertions, follow-up commit 0a64b45152b intentionally made
the metapage caching code update the per-backend cached metadata version
without changing the on-disk version at the same time. This could even
happen when the planner needed to determine the height of a B-Tree for
costing purposes. The assertions only fail on Postgres v12 when
upgrading from v10, because they were adjusted to use the authoritative
shared memory metapage by v12's commit dd299df8.
To fix, remove the cache-only upgrade mechanism entirely, and update the
assertions themselves to accept any supported version (go back to using
the cached version in v12). The fix is almost a full revert of commit
0a64b45152b on the v11 branch.
VACUUM only considers the authoritative metapage, and never bothers with
a locally cached version, whereas everywhere else isn't interested in
the metapage fields that were added by commit 857f9c36cda. It seems
unlikely that this bug has affected any user on v11.
Reported-By: Christoph Berg
Bug: #15896
Discussion: https://postgr.es/m/15896-5b25e260fdb0b081%40postgresql.org
Backpatch: 11-, where VACUUM was taught to avoid unnecessary index scans.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When evaluating clauses on a multivariate MCV list, we build a bitmap
tracking how the clauses match each item of the MCV list. When updating
the bitmap we need to consider the current value (tracking how the item
matches preceding clauses), match for the current clause and whether the
clauses are connected by AND or OR.
Until now the logic was copied on every place updating the bitmap, which
was not quite readable. So just move it to a separate function and call
it where needed.
Backpatch to 12, where the code was introduced. While not a bugfix, this
should make maintenance and future backpatches easier.
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two issues in how the extended statistics handled NULL values
in opclauses. Firstly, the code was oblivious to the possibility that
Const may be NULL (constisnull=true) in which case the constvalue is
undefined. We need to treat this as a mismatch, and not call the proc.
Secondly, the MCV item itself may contain NULL values too - the code
already did check that, and updated the match bitmap accordingly, but
failed to ensure we won't call the operator procedure anyway. It did
work for AND-clauses, because in that case false in the bitmap stops
evaluation of further clauses. But for OR-clauses ir was not easy to
get incorrect estimates or even trigger a crash.
This fixes both issues by extending the existing check so that it looks
at constisnull too, and making sure it skips calling the procedure.
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We expect opclauses to have exactly one Var and one Const, but the code
was checking the Const by calling is_pseudo_constant_clause() which is
incorrect - we need a proper constant.
Fixed by using plain IsA(x,Const) to check type of the node. We need to
do these checks in two places, so move it into a separate function that
can be called in both places.
Reported by Andreas Seltenreich, based on crash reported by sqlsmith.
Backpatch to v12, where this code was introduced.
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
|
|
|
|
|
|
|
|
|
|
|
| |
The TYPECACHE_GT_OPR is not needed (it used to be in older version of
the MCV code), but the compiler failed to detect this as the result was
used in a fmgr_info() call, populating a FmgrInfo entry.
Backpatch to v12, where this code was introduced.
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should lappend the OIDs, not lcons them; the existing code produced
a list in reversed order. This is harmless for single-key FKs or FKs
where all the key columns are of the same type, which probably explains
how it went unnoticed. But if those conditions are not met,
ATAddForeignKeyConstraint would make the wrong decision about whether an
existing FK needs to be revalidated. I think it would almost always err
in the safe direction by revalidating a constraint that didn't need it.
You could imagine scenarios where the pfeqop check was fooled by
swapping the types of two FK columns in one ALTER TABLE, but that case
would probably be rejected by other tests, so it might be impossible to
get to the worst-case scenario where an FK should be revalidated and
isn't. (And even then, it's likely to be fine, unless there are weird
inconsistencies in the equality behavior of the replacement types.)
However, this is a performance bug at least.
Noted while poking around to see whether lcons calls could be converted
to lappend.
This bug is old, dating to commit cb3a7c2b9, so back-patch to all
supported branches.
|
|
|
|
|
|
|
|
|
|
| |
The logic just added by commit e3899ffd falls back on a 50:50 page split
in the event of a new item that's just to the right of our provisional
"many duplicates" split point. Fix a comment that incorrectly claimed
that the new item had to be just to the left of our provisional split
point.
Backpatch: 12-, just like commit e3899ffd.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specific ever-decreasing insertion patterns could cause successive
unbalanced nbtree page splits. Problem cases involve a large group of
duplicates to the left, and ever-decreasing insertions to the right.
To fix, detect the situation by considering the newitem offset before
performing a split using nbtsplitloc.c's "many duplicates" strategy. If
the new item was inserted just to the right of our provisional "many
duplicates" split point, infer ever-decreasing insertions and fall back
on a 50:50 (space delta optimal) split. This seems to barely affect
cases that already had acceptable space utilization.
An alternative fix also seems possible. Instead of changing
nbtsplitloc.c split choice logic, we could instead teach _bt_truncate()
to generate a new value for new high keys by interpolating from the
lastleft and firstright key values. That would certainly be a more
elegant fix, but it isn't suitable for backpatching.
Discussion: https://postgr.es/m/CAH2-WznCNvhZpxa__GqAa1fgQ9uYdVc=_apArkW2nc-K3O7_NA@mail.gmail.com
Backpatch: 12-, where the nbtree page split enhancements were introduced.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3ca930fc3 modified get_actual_variable_range() to use a new
"SnapshotNonVacuumable" snapshot type for selecting tuples that it
would consider valid. However, because that snapshot type can accept
recently-dead tuples, this caused a bug when using a recently-created
index: we might accept a recently-dead tuple that is an early member
of a broken HOT chain and does not actually match the index entry.
Then, the data extracted from the heap tuple would not necessarily be
an endpoint value of the column; it could even be NULL, leading to
get_actual_variable_range() itself reporting "found unexpected null
value in index". Even without an error, this could lead to poor
plan choices due to an erroneous notion of the endpoint value.
We can improve matters by changing the code to use the index-only
scan technique (which didn't exist when get_actual_variable_range was
originally written). If any of the tuples in a HOT chain are live
enough to satisfy SnapshotNonVacuumable, we take the data from the
index entry, ignoring what is in the heap. This fixes the problem
without changing the live-vs-dead-tuple behavior from what was
intended by commit 3ca930fc3.
A side benefit is that for static tables we might not have to touch
the heap at all (when the extremal value is in an all-visible page).
In addition, we can save some overhead by not having to create a
complete ExecutorState, and we don't need to run FormIndexDatum,
avoiding more cycles as well as the possibility of failure for
indexes on expressions. (I'm not sure that this code would ever
be used to determine the extreme value of an expression, in the
current state of the planner; but it's definitely possible that
lower-order columns of the selected index could be expressions.
So one could construct perhaps-artificial examples in which the
old code unexpectedly failed due to trying to compute an
expression's value for a now-dead row.)
Per report from Manuel Rigger. Back-patch to v11 where commit
3ca930fc3 came in.
Discussion: https://postgr.es/m/CA+u7OA7W4NWEhCvftdV6_8bbm2vgypi5nuxfnSEJQqVKFSUoMg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
match_clause_to_partition_key incorrectly would return
PARTCLAUSE_UNSUPPORTED if a bool qual could not be matched to the current
partition key. This was a problem, as it causes the calling function to
discard the qual and not try to match it to any other partition key. If
there was another partition key which did match this qual, then the qual
would not be checked again and we could fail to prune some partitions.
The worst this could do was to cause partitions not to be pruned when they
could have been, so there was no danger of incorrect query results here.
Fix this by changing match_boolean_partition_clause to have it return a
PartClauseMatchStatus rather than a boolean value. This allows it to
communicate if the qual is unsupported or if it just does not match this
particular partition key, previously these two cases were treated the
same. Now, if match_clause_to_partition_key is unable to match the qual
to any other qual type then we can simply return the value from the
match_boolean_partition_clause call so that the calling function properly
treats the qual as either unmatched or unsupported.
Reported-by: Rares Salcudean
Reviewed-by: Amit Langote
Backpatch-through: 11 where partition pruning was introduced
Discussion: https://postgr.es/m/CAHp_FN2xwEznH6oyS0hNTuUUZKp5PvegcVv=Co6nBXJ+mC7Y5w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can cause valgrind to complain, as the flag marking a buffer as a
temporary copy was not getting initialized.
While on it, fill in with zeros newly-created buffer pages. This does
not matter when loading a block from a temporary file, but it makes the
push of an index tuple into a new buffer page safer.
This has been introduced by 1d27dcf, so backpatch all the way down to
9.4.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With
pluggable storage, we can't make any assumptions about what the table AM
might want to do in its equivalent function, so we'd better ensure we
always call table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
This also changes things so that we no longer needlessly call
table_finish_bulk_insert when performing a COPY FROM for a non-partitioned
table when not using multi-inserts.
Reported-by: Robert Haas
Backpatch-through: 12
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise the executor can't see trigger transition tables during
EPQ evaluation. Fixes bug #15900 and almost certainly also #15720.
Back-patch to 10, where trigger transition tables landed.
Author: Alex Aktsipetrov
Reviewed-by: Thomas Munro, Tom Lane
Discussion: https://postgr.es/m/15900-bc482754fe8d7415%40postgresql.org
Discussion: https://postgr.es/m/15720-38c2b29e5d720187%40postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were creating the cloned triggers with an empty list of arguments,
losing the ones that had been specified by the user when creating the
trigger in the partitioned table. Repair.
This was forgotten in commit 86f575948c77.
Author: Patrick McHardy
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20190709130027.amr2cavjvo7rdvac@access1.trash.net
Discussion: https://postgr.es/m/15752-123bc90287986de4@postgresql.org
|
|
|
|
|
|
|
|
| |
Reported-by: Ashwin Agrawal
Author: Ashwin Agrawal
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it was introduced
Discussion: https://postgr.es/m/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
The itemlen variable used to be referenced in multiple places, but since
reworking the serialization code it's used only in one assert. Fixed by
removing the variable and calling the macro from the assert directly.
Backpatch to 12, where this code was introduced.
Reported-by: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1zc_ovH9NZd_9ovuiEWkF9yX06URUDdXCmgDydf-bqB5A@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The serialization format of multivariate MCV lists included alignment in
order to allow direct access to part of the serialized data, but despite
multiple fixes (see for example commits d85e0f366a and ea4e1c0e8f) this
proved to be problematic.
This commit abandons alignment in the serialized format, and just copies
everything during deserialization. We now also track amount of memory
needed after deserialization (including alignment), which allows us to
deserialize the MCV list in a single pass.
Bump catversion, as this affects contents of pg_statistic_ext_data.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/2201.1561521148@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.
Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.
Requires changes to pg_proc entry, so bump catversion.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When building multi-column MCV lists, we compute base frequency for each
item, i.e. a product of per-column frequencies for values from the item.
As a value may be in multiple groups, the code was scanning the whole
array of groups while adding items to the MCV list. This works fine as
long as the number of distinct groups is small, but it's easy to trigger
trigger O(N^2) behavior, especially after increasing statistics target.
This commit precomputes frequencies for values in all columns, so that
when computing the base frequency it's enough to make a simple bsearch
lookup in the array.
Backpatch to 12, where multi-column MCV lists were introduced.
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by. This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations. These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation. In this case, those additional GROUP BY columns are required.
Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.
Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
|
|
|
|
|
| |
pgperltidy and reformat-dat-files too, though the latter didn't
find anything to change.
|
|
|
|
|
| |
This reverts commits 4de60244e and b2d69806d. Further thought is
required to make this work properly.
|
|
|
|
|
|
|
|
|
|
|
| |
4de60244e added the call to table_finish_bulk_insert to the
CopyMultiInsertBufferCleanup function. We use a CopyMultiInsertBuffer even
for non-partitioned tables, so having the cleanup do that meant we would
call table_finsh_bulk_insert twice when performing COPY FROM with
a non-partitioned table.
Here we can just remove the direct call in CopyFrom and let
CopyMultiInsertBufferCleanup handle the call instead.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With pluggable
storage, we can't make any assumptions about what the table AM might want
to do in its equivalent function, so we'd better ensure we always call
table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
In passing, move the table_finish_bulk_insert for the target of the COPY
command so that it's only called when we're actually performing bulk
inserts. We don't need to call this when inserting 1 row at a time.
Reported-by: Robert Haas
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
|
| |
|
|
|
|
|
| |
Author: Alexander Lakhin
Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
UBSan complains about this. Instead, cast to a suitable type requiring
only 4-byte alignment. DatumGetAnyArrayP() already assumes one can cast
between AnyArrayType and ArrayType, so this doesn't introduce a new
assumption. Back-patch to 9.5, where AnyArrayType was introduced.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20190629210334.GA1244217@rfd.leadboat.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.
Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.
Backpatch back to 9.5 where grouping sets were introduced.
Discussion: https://postgr.es/m/CAN_9JTzyjGcUjiBHxLsgqfk7PkdLGXiM=pwM+=ph2LsWw0WO1A@mail.gmail.com
|
|
|
|
|
| |
Using PG_RETURN_LSN() from non-fmgr pg_lsn_in_internal() happened to
work on some platforms, but should just be a plain "return".
|