| Commit message | Author | Age |
| |
Up to now async.c has used TransactionIdIsInProgress() to detect whether
a notify message's source transaction is still running. However, that
function has a quick-exit path that reports that XIDs before RecentXmin
are no longer running. If a listening backend is doing nothing but
listening, and not running any queries, there is nothing that will advance
its value of RecentXmin. Once 2 billion transactions elapse, the
RecentXmin check causes active transactions to be reported as not running.
If they aren't committed yet according to CLOG, async.c decides they
aborted and discards their messages. The timing for that is a bit tight
but it can happen when multiple backends are sending notifies concurrently.
The net symptom therefore is that a sufficiently-long-surviving
listen-only backend starts to miss some fraction of NOTIFY traffic,
but only under heavy load.
The only function that updates RecentXmin is GetSnapshotData().
A brute-force fix would therefore be to take a snapshot before
processing incoming notify messages. But that would add cycles,
as well as contention for the ProcArrayLock. We can be smarter:
having taken the snapshot, let's use that to check for running
XIDs, and not call TransactionIdIsInProgress() at all. In this
way we reduce the number of ProcArrayLock acquisitions from one
per message to one per notify interrupt; that's the same under
light load but should be a benefit under heavy load. Light testing
says that this change is a wash performance-wise for normal loads.
I looked around for other callers of TransactionIdIsInProgress()
that might be at similar risk, and didn't find any; all of them
are inside transactions that presumably have already taken a
snapshot.
Problem report and diagnosis by Marko Tiikkaja, patch by me.
Back-patch to all supported branches, since it's been like this
since 9.0.
Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org
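For illustration (channel and payload names are hypothetical), the failure
scenario is roughly:

    -- Session A: listens but never runs queries, so its RecentXmin never advances
    LISTEN app_events;
    -- Sessions B, C, ...: sustained concurrent NOTIFY traffic
    NOTIFY app_events, 'something happened';
    -- After ~2 billion transactions, session A could decide a still-running
    -- notifying transaction had aborted and silently discard its messages.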
| |
Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples. But
concurrent transaction commit/abort may turn DEAD some of the HOT tuples
that survived the prune, before HeapTupleSatisfiesVacuum tests them.
This happens to activate the code that decides to freeze the tuple ...
which resuscitates it, duplicating data.
(This is especially bad if there are any unique constraints, because those
are now internally violated due to the duplicate entries, though you
won't know until you try to REINDEX or dump/restore the table.)
One possible fix would be to simply skip doing anything to the tuple,
and hope that the next HOT prune would remove it. But there is a
problem: if the tuple is older than freeze horizon, this would leave an
unfrozen XID behind, and if no HOT prune happens to clean it up before
the containing pg_clog segment is truncated away, it'd later cause an
error when the XID is looked up.
Fix the problem by having the tuple freezing routines cope with the
situation: don't freeze the tuple (and keep it dead). In the cases that
the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
so that there is no need to look up the XID in pg_clog later on.
An isolation test is included, authored by Michael Paquier, loosely
based on Daniel Wood's original reproducer. It only tests one
particular scenario, though, not all the possible ways for this problem
to surface; it'd be good to have a more reliable way to test this more
fully, but it'd require more work.
In message https://postgr.es/m/20170911140103.5akxptyrwgpc25bw@alvherre.pgsql
I outlined another test case (more closely matching Dan Wood's) that
exposed a few more ways for the problem to occur.
Backpatch all the way back to 9.3, where this problem was introduced by
multixact juggling. In branches 9.3 and 9.4, this includes a backpatch
of commit e5ff9fefcd50 (of 9.5 era), since the original is not
correctable without matching the coding pattern in 9.5 up.
Reported-by: Daniel Wood
Diagnosed-by: Daniel Wood
Reviewed-by: Yi Wen Wong, Michaël Paquier
Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com
| |
Per suggestions from Peter Eisentraut and David Johnston.
Back-patch, like the previous commit.
Discussion: https://postgr.es/m/E1dv9jI-0006oT-Fn@gemulon.postgresql.org
| |
In two cases, we set a different umask for some piece of code and
restore it afterwards. But if the contained code errors out, the umask
is not restored. So add TRY/CATCH blocks to fix that.
| |
Previously, the code didn't think about this case and would just try to
analyze such a column twice. That would fail at the point of inserting
the second version of the pg_statistic row, with obscure error messages
like "duplicate key value violates unique constraint" or "tuple already
updated by self", depending on context and PG version. We could allow
the case by ignoring duplicate column specifications, but it seems better
to reject it explicitly.
The bogus error messages seem like arguably a bug, so back-patch to
all supported versions.
Nathan Bossart, per a report from Michael Paquier, and whacked
around a bit by me.
Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com
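As a minimal illustration (table and column names are hypothetical):

    CREATE TABLE t (a int, b int);
    -- Previously this tried to analyze "a" twice and failed with an obscure
    -- internal error; now the duplicate column mention is rejected up front:
    ANALYZE t (a, a);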
| |
AfterTriggerEndQuery correctly notes that the query_stack could get
repalloc'd during a trigger firing, but it nonetheless passes the address
of a query_stack entry to afterTriggerInvokeEvents, so that if such a
repalloc occurs, afterTriggerInvokeEvents is already working with an
obsolete dangling pointer while it scans the rest of the events. Oops.
The only code at risk is its "delete_ok" cleanup code, so we can
prevent unsafe behavior by passing delete_ok = false instead of true.
However, that could have a significant performance penalty, because the
point of passing delete_ok = true is to not have to re-scan possibly
a large number of dead trigger events on the next time through the loop.
There's more than one way to skin that cat, though. What we can do is
delete all the "chunks" in the event list except the last one, since
we know all events in them must be dead. Deleting the chunks is work
we'd have had to do later in AfterTriggerEndQuery anyway, and it ends
up saving rescanning of just about the same events we'd have gotten
rid of with delete_ok = true.
In v10 and HEAD, we also have to be careful to mop up any per-table
after_trig_events pointers that would become dangling. This is slightly
annoying, but I don't think that normal use-cases will traverse this code
path often enough for it to be a performance problem.
It's pretty hard to hit this in practice because of the unlikelihood
of the query_stack getting resized at just the wrong time. Nonetheless,
it's definitely a live bug of ancient standing, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/2891.1505419542@sss.pgh.pa.us
| |
This was erroneously removed in
55a70a023c3daefca9bbd68bfbe6862af10ab479.
| |
find_composite_type_dependencies correctly found columns that are of
the specified type, and columns that are of arrays of that type, but
not columns that are domains or ranges over the given type, its array
type, etc. The most general way to handle this seems to be to assume
that any type that is directly dependent on the specified type can be
treated as a container type, and processed recursively (allowing us
to handle nested cases such as ranges over domains over arrays ...).
Since a type's array type already has such a dependency, we can drop
the existing special case for the array type.
The very similar logic in get_rels_with_domain was likewise a few
bricks shy of a load, as it supposed that a directly dependent type
could *only* be a sub-domain. This is already wrong for ranges over
domains, and it'll someday be wrong for arrays over domains.
Add test cases illustrating the problems, and back-patch to all
supported branches.
Discussion: https://postgr.es/m/15268.1502309024@sss.pgh.pa.us
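A sketch of one of the newly detected cases (object names are hypothetical):

    CREATE TYPE comp AS (x int, y int);
    CREATE DOMAIN comp_arr_dom AS comp[];
    CREATE TABLE uses_comp (c comp_arr_dom);
    -- The column depends on "comp" only through a domain over its array type;
    -- previously the dependency walk missed that, so this ALTER was allowed:
    ALTER TYPE comp ALTER ATTRIBUTE y TYPE bigint;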
| |
Commit fd31cd265138 renamed the variable to skipping_blocks, but forgot
to update this comment.
Noticed while inspecting code.
| |
Noticed while reviewing code.
| |
When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Fixes bug #14706, reported by Karen Huddleston.
Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org
| |
Mark our rusage reportage string translatable; remove quotes from type
names; unify formatting of very similar messages.
| |
Previously this was not allowed, as copy.c didn't set the
CURSOR_OPT_PARALLEL_OK flag when planning the query. Set it.
While the lack of parallel query for COPY isn't strictly speaking a
bug, it does prevent parallelism from being used in a facility
commonly used to run long running queries. Thus backpatch to 9.6.
Author: Andres Freund
Discussion: https://postgr.es/m/20170531231958.ihanapplorptykzm@alap3.anarazel.de
Backpatch: 9.6, where parallelism was introduced.
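For example (table name is hypothetical), the query wrapped in COPY can now
receive a parallel plan, subject to the usual costing and configuration:

    COPY (SELECT count(*) FROM big_table) TO STDOUT;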
| |
The CommentStmt made by RebuildConstraintComment() has to pstrdup the
relation name, else it will contain a dangling pointer after that
relcache entry is flushed. (I'm less sure that pstrdup'ing conname
is necessary, but let's be safe.) Failure to do this leads to weird
errors or crashes, as reported by Marko Elezovic.
Bug introduced by commit e42375fc8, so back-patch to 9.5 as that was.
Fix by David Rowley, regression test by Michael Paquier
Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com
| |
Due to a missing CommandCounterIncrement() call, parsing of a non-utility
command in an extension script would not see the effects of the immediately
preceding DDL command, unless that command's execution ends with
CommandCounterIncrement() internally ... which some do but many don't.
Report by Philippe Beaudoin, diagnosis by Julien Rouhaud.
Rather remarkably, this bug has evaded detection since extensions were
invented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
| |
Currently, trying to validate a NO INHERIT constraint on the parent will
search for the constraint in child tables (where it is not supposed to
exist), wrongly causing a "constraint does not exist" error.
Amit Langote, per a report from Hans Buschmann.
Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
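A minimal sketch of the fixed case (names are hypothetical):

    CREATE TABLE parent (a int);
    CREATE TABLE child () INHERITS (parent);
    ALTER TABLE parent ADD CONSTRAINT a_pos CHECK (a > 0) NO INHERIT NOT VALID;
    -- Previously this recursed to "child", where the NO INHERIT constraint is
    -- not supposed to exist, and failed with "constraint does not exist":
    ALTER TABLE parent VALIDATE CONSTRAINT a_pos;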
| |
Vacuum could wrongly set reltuples to zero for a non-empty table in the
presence of page pins, which leads to serious estimation errors in the
planner. This particularly affects small heavily-accessed tables,
especially where locking (e.g. from FK constraints) forces frequent
vacuums for mxid cleanup.
Fix by keeping separate track of pages whose live tuples were actually
counted vs. pages that were only scanned for freezing purposes. Thus,
reltuples can only be set to 0 if all pages of the relation were
actually counted.
Backpatch to all supported versions.
Per bug #14057 from Nicolas Baccelli, analyzed by me.
Discussion: https://postgr.es/m/20160331103739.8956.94469@wrigleys.postgresql.org
| |
With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE,
and would sometimes fail without that, because it used the relation name
directly from the relcache as part of the parsetree it's building. That
becomes a potentially-dangling pointer as soon as the relcache entry is
closed, a bit further down. Typical symptom if the relcache entry chanced
to get cleared would be "relation does not exist" error with a garbage
relation name, or possibly a core dump; but if you were really truly
unlucky, the COPY might copy from the wrong table.
Per report from Andrew Dunstan that regression tests fail with
-DRELCACHE_FORCE_RELEASE. The core tests now pass for me (but I have
not tried "make check-world" yet).
Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
| |
This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by
Andrew Dunstan, and could sometimes fail in normal operation, resulting
in a wrong persistence value being used for the transient table.
It's not immediately clear to me what effects that might have beyond
the risk of a crash while accessing OldHeap->rd_rel->relpersistence,
but it's probably not good.
Bug introduced by commit f41872d0c, and made substantially worse by
commit 85b506bbf, which added a second such access significantly
later than the heap_close. I doubt the first reference could fail
in a production scenario, but the second one definitely could.
Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
| |
The ALTER TABLE ALTER TYPE implementation can issue DROP INDEX and
CREATE INDEX to refit existing indexes for the new column type. Since
this CREATE INDEX is an implementation detail of an index alteration,
the ensuing DefineIndex() should skip ACL checks specific to index
creation. It already skips the namespace ACL check. Make it skip the
tablespace ACL check, too. Back-patch to 9.2 (all supported versions).
Reviewed by Tom Lane.
| |
Backpatch to all supported versions, where applicable, to make backpatching
of future fixes go more smoothly.
Josh Soref
Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
| |
In commit 6c268df, pg_init_privs was added to track the initial
privileges of catalog objects and extensions. Unfortunately, that
commit didn't include understanding of ALTER EXTENSION ADD/DROP, which
allows the objects associated with an extension to be changed after the
initial CREATE EXTENSION script has been run.
The result was that ACLs for objects added through
ALTER EXTENSION ADD were not recorded into pg_init_privs and we would
end up including those ACLs in pg_dump when we shouldn't have.
This commit corrects that by making sure to have pg_init_privs updated
when ALTER EXTENSION ADD/DROP is run, recording the permissions as they
are at ALTER EXTENSION ADD time, and removing any if/when ALTER
EXTENSION DROP is called.
This issue was pointed out by Moshe Jacobson as commentary on bug #14456
(which was actually a bug about versions prior to 9.6 not handling
custom ACLs on extensions correctly, an issue now addressed with
pg_init_privs in 9.6).
Back-patch to 9.6 where pg_init_privs was introduced.
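Sketch of the affected sequence (extension, function, and role names are
hypothetical):

    CREATE EXTENSION some_ext;
    CREATE FUNCTION extra_func() RETURNS int LANGUAGE sql AS 'SELECT 1';
    GRANT EXECUTE ON FUNCTION extra_func() TO app_role;
    -- The ACL as of this point is now recorded in pg_init_privs, so pg_dump
    -- no longer emits it as though it were an ordinary post-creation grant:
    ALTER EXTENSION some_ext ADD FUNCTION extra_func();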
| |
If inherited tables don't have exactly the same schema, the USING clause
in an ALTER TABLE / SET DATA TYPE misbehaves when applied to the
child tables since commit 9550e8348b79. Starting with that commit,
the attribute numbers in the USING expression are fixed during parse
analysis. This can lead to bogus errors being reported during
execution, such as:
ERROR: attribute 2 has wrong type
DETAIL: Table has type smallint, but query expects integer.
Since it wouldn't do to revert to the original coding, we now apply a
transformation to map the attribute numbers to the correct ones for each
child.
Reported by Justin Pryzby
Analysis by Tom Lane; patch by me.
Discussion: https://postgr.es/m/20170102225618.GA10071@telsasoft.com
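A minimal way to hit the problem (names are hypothetical):

    CREATE TABLE parent (a int, b smallint);
    CREATE TABLE child (extra text, a int, b smallint);
    ALTER TABLE child INHERIT parent;
    -- "b" has a different attribute number in the child, so the USING
    -- expression must be remapped for each child; previously this could fail
    -- with "attribute 2 has wrong type":
    ALTER TABLE parent ALTER COLUMN b TYPE integer USING b + 1;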
| |
This fixes problems where a plan must change but fails to do so,
as seen in a bug report from Rajkumar Raghuwanshi.
For ALTER FOREIGN TABLE OPTIONS, do this through the standard method of
forcing a relcache flush on the table. For ALTER FOREIGN DATA WRAPPER
and ALTER SERVER, just flush the whole plan cache on any change in
pg_foreign_data_wrapper or pg_foreign_server. That matches the way
we handle some other low-probability cases such as opclass changes, and
it's unclear that the case arises often enough to be worth working harder.
Besides, that gives a patch that is simple enough to back-patch with
confidence.
Back-patch to 9.3. In principle we could apply the code change to 9.2 as
well, but (a) we lack postgres_fdw to test it with, (b) it's doubtful that
anyone is doing anything exciting enough with FDWs that far back to need
this desperately, and (c) the patch doesn't apply cleanly.
Patch originally by Amit Langote, reviewed by Etsuro Fujita and Ashutosh
Bapat, who each contributed substantial changes as well.
Discussion: https://postgr.es/m/CAKcux6m5cA6rRPTKkqVdJ-R=KKDfe35Q_ZuUqxDSV_4hwga=og@mail.gmail.com
| |
Inheritance operations must treat the OID column, if any, much like
regular user columns. But MergeAttributesIntoExisting() neglected to
do that, leading to weird results after a table with OIDs is associated
with a parent with OIDs via ALTER TABLE ... INHERIT.
Report and patch by Amit Langote, reviewed by Ashutosh Bapat, some
adjustments by me. It's been broken all along, so back-patch to
all supported branches.
Discussion: https://postgr.es/m/cb13cfe7-a48c-5720-c383-bb843ab28298@lab.ntt.co.jp
| |
When we are altering a text search configuration, we are getting the
tuple from pg_ts_config and using its OID, so use TSConfigRelationId
when invoking any post-alter hooks and setting the object address.
Further, in the functions called from AlterTSConfiguration(), we're
saving information about the command via
EventTriggerCollectAlterTSConfig(), so we should be setting
commandCollected to true. Also add a regression test to
test_ddl_deparse for ALTER TEXT SEARCH CONFIGURATION.
Author: Artur Zakirov, a few additional comments by me
Discussion: https://www.postgresql.org/message-id/57a71eba-f2c7-e7fd-6fc0-2126ec0b39bd%40postgrespro.ru
Back-patch the fix for the InvokeObjectPostAlterHook() call to 9.3 where
it was introduced, and the fix for the ObjectAddressSet() call and
setting commandCollected to true to 9.5 where those changes to
ProcessUtilitySlow() were introduced.
| |
When CREATE OR REPLACE VIEW acts on an existing view, don't update the
view options until after the view query has been updated.
This is necessary in the case where CREATE OR REPLACE VIEW is used on
an existing view that is not updatable, and the new view is updatable
and specifies the WITH CHECK OPTION. In this case, attempting to apply
the new options to the view before updating its query fails, because
the options are applied using the ALTER TABLE infrastructure which
checks that WITH CHECK OPTION is only applied to an updatable view.
If new columns are being added to the view, that is also done using
the ALTER TABLE infrastructure, but it is important that that still be
done before updating the view query, because the rules system checks
that the query columns match those on the view relation. Added a
comment to explain that, in case someone is tempted to move that to
where the view options are now being set.
Back-patch to 9.4 where WITH CHECK OPTION was added.
Report: https://postgr.es/m/CAEZATCUp%3Dz%3Ds4SzZjr14bfct_bdJNwMPi-gFi3Xc5k1ntbsAgQ%40mail.gmail.com
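A minimal sketch of the previously failing case (names are hypothetical):

    CREATE TABLE t (a int);
    CREATE VIEW v AS SELECT a FROM t GROUP BY a;   -- not auto-updatable
    -- Replacing it with an updatable definition that adds WITH CHECK OPTION
    -- used to fail, because the option was applied before the query was swapped:
    CREATE OR REPLACE VIEW v AS SELECT a FROM t WHERE a > 0 WITH CHECK OPTION;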
| |
Previously, a prepared statement created via a Parse message could get
a parallel plan, but one created with a PREPARE statement could not.
This state of affairs was due to confusion on my (rhaas) part: I
erroneously believed that a CREATE TABLE .. AS EXECUTE statement could
only be performed with a prepared statement by PREPARE, but in fact
one created by a Prepare message works just as well. Therefore, it
makes no sense to allow parallel query in one case but not the other.
To fix, allow parallel query with all prepared statements, but run
the parallel plan serially (i.e. without workers) in the case of
CREATE TABLE .. AS EXECUTE. Also, document this.
Amit Kapila and Tobias Bussman, plus an extra sentence of
documentation by me.
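For example (statement and table names are hypothetical):

    PREPARE q AS SELECT count(*) FROM big_table;
    EXECUTE q;                            -- may now use a parallel plan
    CREATE TABLE t_summary AS EXECUTE q;  -- parallel plan is run without workers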
| |
This reverts commit 4e01ecae98275298c680c92fdba62daf603dc98e.
Per Tom Lane, changing the definition of StringInfoData amounts to an
ABI break, which is unacceptable in back branches.
| |
Our documentation states that our maximum field size is 1 GB, and that
our maximum row size is 1.6 TB. However, while this might be attainable
in theory with enough contortions, it is not workable in practice; for
starters, pg_dump fails to dump tables containing rows larger than 1 GB,
even if individual columns are well below the limit; and even if one
does manage to manufacture a dump file containing a row that large, the
server refuses to load it anyway.
This commit enables dumping and reloading of such tuples, provided two
conditions are met:
1. no single column is larger than 1 GB (in output size -- for bytea
this includes the formatting overhead)
2. the whole row is not larger than 2 GB
There are three related changes to enable this:
a. StringInfo's API now has two additional functions that allow creating
a string that grows beyond the typical 1GB limit (and "long" string).
ABI compatibility is maintained. We still limit these strings to 2 GB,
though, for reasons explained below.
b. COPY now uses long StringInfos, so that pg_dump doesn't choke
trying to emit rows longer than 1GB.
c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation
for the input tuple, which means that large tuples are accepted on
input. Note that at this point we do not apply any further limit to the
input tuple size.
The main reason to limit to 2 GB is that the FE/BE protocol uses 32 bit
length words to describe each row; and because the documentation is
ambiguous on its signedness and libpq does consider it signed, we cannot
use the highest-order bit. Additionally, the StringInfo API uses "int"
(which is 4 bytes wide in most platforms) in many places, so we'd need
to change that API too in order to improve, which has lots of fallout.
Backpatch to 9.5, which is the oldest that has
MemoryContextAllocExtended, a necessary piece of infrastructure. We
could apply to 9.4 with very minimal additional effort, but any further
than that would require backpatching "huge" allocations too.
This is the largest set of changes we could find that can be
back-patched without breaking compatibility with existing systems.
Fixing a bigger set of problems (for example, dumping tuples bigger than
2GB, or dumping fields bigger than 1GB) would require changing the FE/BE
protocol and/or changing the StringInfo API in an ABI-incompatible way,
neither of which would be back-patchable.
Authors: Daniel Vérité, Álvaro Herrera
Reviewed by: Tomas Vondra
Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql
| |
When dropping a foreign key constraint with ALTER TABLE DROP CONSTRAINT,
we refuse the drop if there are any pending trigger events on the named
table; this ensures that we won't remove the pg_trigger row that will be
consulted by those events. But we should make the same check for the
referenced relation, else we might remove a due-to-be-referenced pg_trigger
row for that relation too, resulting in "could not find trigger NNN" or
"relation NNN has no triggers" errors at commit. Per bug #14431 from
Benjie Gillam. Back-patch to all supported branches.
Report: <20161124114911.6530.31200@wrigleys.postgresql.org>
| |
The code to change the deferrability properties of a foreign-key constraint
updated all the associated triggers to match; but a moment's examination of
the code that creates those triggers in the first place shows that only
some of them should track the constraint's deferrability properties. This
leads to odd failures in subsequent exercise of the foreign key, as the
triggers are fired at the wrong times. Fix that, and add a regression test
comparing the trigger properties produced by ALTER CONSTRAINT with those
you get by creating the constraint as-intended to begin with.
Per report from James Parks. Back-patch to 9.4 where this ALTER
functionality was introduced.
Report: <CAJ3Xv+jzJ8iNNUcp4RKW8b6Qp1xVAxHwSXVpjBNygjKxcVuE9w@mail.gmail.com>
| |
With track_io_timing = on, EXPLAIN (ANALYZE, BUFFERS) will emit fields
named like "I/O Read Time". The slash makes that invalid as an XML
element name, so that adding FORMAT XML would produce invalid XML.
We already have code in there to translate spaces to dashes, so let's
generalize that to convert anything that isn't a valid XML name character,
viz letters, digits, hyphens, underscores, and periods. We could just
reject slashes, which would run a bit faster. But the fact that this went
unnoticed for so long doesn't give me a warm feeling that we'd notice the
next creative violation, so let's make it a permanent fix.
Reported by Markus Winand, though this isn't his initial patch proposal.
Back-patch to 9.2 where track_io_timing was added. The problem is only
latent in 9.1, so I don't feel a need to fix it there.
Discussion: <E0BF6A45-68E8-45E6-918F-741FB332C6BB@winand.at>
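For example (table name is hypothetical):

    SET track_io_timing = on;
    -- The "I/O Read Time" / "I/O Write Time" fields now have the slash mapped
    -- to a valid XML name character, so the output is well-formed XML:
    EXPLAIN (ANALYZE, BUFFERS, FORMAT XML) SELECT count(*) FROM some_table;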
| |
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separate ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
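A minimal sketch of the order that used to fail (names are hypothetical):

    CREATE TABLE parent (a int);
    CREATE TABLE child () INHERITS (parent);
    ALTER TABLE parent ADD CONSTRAINT a_pos CHECK (a > 0);  -- inherited by child
    -- Previously this drew a duplicate-constraint error; now the inherited
    -- constraint is merged and marked conislocal = true:
    ALTER TABLE child ADD CONSTRAINT a_pos CHECK (a > 0);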
| |
Attempting to COPY a subset of columns from a table with RLS enabled
would fail due to an invalid query being constructed (using a single
ColumnRef with the list of fields to extract in 'fields', but that's for
the different levels of indirection for a single column, not for
specifying multiple columns).
Correct by building a ColumnRef and then a ResTarget for each column
being requested and then adding those to the targetList for the select
query. Include regression tests to hopefully catch if this is broken
again in the future.
Patch-By: Adam Brightwell
Reviewed-By: Michael Paquier
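Sketch of the fixed case, assuming a table "t" with RLS enabled and a policy
that applies to the current (non-owner) user:

    -- Internally rewritten as COPY (SELECT a, b FROM t) TO STDOUT; previously
    -- the generated SELECT was malformed and the command failed:
    COPY t (a, b) TO STDOUT;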
| |
lazy_truncate_heap() was waiting for
VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL, but in microseconds
not milliseconds as originally intended.
Found by code inspection.
Simon Riggs
| |
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters. While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea. Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.
While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.
In passing, change TopMemoryContext to use the default allocation
parameters. Formerly it could only be extended 8K at a time. That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there. There seems no good reason
not to let it use growing blocks like most other contexts.
Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain. The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.
Discussion: <21072.1472321324@sss.pgh.pa.us>
| |
Per discussion, we should provide such functions to replace the lost
ability to discover AM properties by inspecting pg_am (cf commit
65c5fcd35). The added functionality is also meant to displace any code
that was looking directly at pg_index.indoption, since we'd rather not
believe that the bit meanings in that field are part of any client API
contract.
As future-proofing, define the SQL API to not assume that properties that
are currently AM-wide or index-wide will remain so unless they logically
must be; instead, expose them only when inquiring about a specific index
or even specific index column. Also provide the ability for an index
AM to override the behavior.
In passing, document pg_am.amtype, overlooked in commit 473b93287.
Andrew Gierth, with kibitzing by me and others
Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>
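The added functions can be used like this (index name is hypothetical):

    SELECT pg_indexam_has_property(oid, 'can_order') FROM pg_am WHERE amname = 'btree';
    SELECT pg_index_has_property('some_index'::regclass, 'clusterable');
    SELECT pg_index_column_has_property('some_index'::regclass, 1, 'desc');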
| |
EXPLAIN (ANALYZE, TIMING OFF) would print an elapsed time of zero for
a trigger function, because no measurement has been taken but it printed
the field anyway. This isn't what EXPLAIN does elsewhere, so suppress it.
In the same vein, EXPLAIN (ANALYZE, BUFFERS) with non-text output format
would print buffer I/O timing numbers even when no measurement has been
taken because track_io_timing is off. That seems not per policy, either,
so change it.
Back-patch to 9.2 where these features were introduced.
Maksim Milyutin
Discussion: <081c0540-ecaa-bd29-3fd2-6358f3b359a9@postgrespro.ru>
| |
If ANALYZE found no repeated non-null entries in its sample, it set the
column's stadistinct value to -1.0, intending to indicate that the entries
are all distinct. But what this value actually means is that the number
of distinct values is 100% of the table's rowcount, and thus it was
overestimating the number of distinct values by however many nulls there
are. This could lead to very poor selectivity estimates, as for example
in a recent report from Andreas Joseph Krogh. We should discount the
stadistinct value by whatever we've estimated the nulls fraction to be.
(That is what will happen if we choose to use a negative stadistinct for
a column that does have repeated entries, so this code path was just
inconsistent.)
In addition to fixing the stadistinct entries stored by several different
ANALYZE code paths, adjust the logic where get_variable_numdistinct()
forces an "all distinct" estimate on the basis of finding a relevant unique
index. Unique indexes don't reject nulls, so there's no reason to assume
that the null fraction doesn't apply.
Back-patch to all supported branches. Back-patching is a bit of a judgment
call, but this problem seems to affect only a few users (else we'd have
identified it long ago), and it's bad enough when it does happen that
destabilizing plan choices in a worse direction seems unlikely.
Patch by me, with documentation wording suggested by Dean Rasheed
Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena>
Discussion: <16143.1470350371@sss.pgh.pa.us>
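The effect is visible through pg_stats, e.g. (table name is hypothetical):

    -- For a column with no repeated non-null values but many nulls, n_distinct
    -- now reflects only the non-null fraction instead of claiming 100% of the
    -- rows are distinct:
    SELECT attname, null_frac, n_distinct FROM pg_stats WHERE tablename = 't';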
| |
Discussion of commit 3e2f3c2e4 exposed a problem that is of longer
standing: since we don't detoast data while sticking it into a portal's
holdStore for PORTAL_ONE_RETURNING and PORTAL_UTIL_SELECT queries, and we
release the query's snapshot as soon as we're done loading the holdStore,
later readout of the holdStore can do TOAST fetches against data that can
no longer be seen by any of the session's live snapshots. This means that
a concurrent VACUUM could remove the TOAST data before we can fetch it.
Commit 3e2f3c2e4 exposed the problem by showing that sometimes we had *no*
live snapshots while fetching TOAST data, but we'd be at risk anyway.
I believe this code was all right when written, because our management of a
session's exposed xmin was such that the TOAST references were safe until
end of transaction. But that's no longer true now that we can advance or
clear our PGXACT.xmin intra-transaction.
To fix, copy the query's snapshot during FillPortalStore() and save it in
the Portal; release it only when the portal is dropped. This essentially
implements a policy that we must hold a relevant snapshot whenever we
access potentially-toasted data. We had already come to that conclusion
in other places, cf commits 08e261cbc94ce9a7 and ec543db77b6b72f2.
I'd have liked to add a regression test case for this, but I didn't see
a way to make one that's not unreasonably bloated; it seems to require
returning a toasted value to the client, and those will be big.
In passing, improve PortalRunUtility() so that it positively verifies
that its ending PopActiveSnapshot() call will pop the expected snapshot,
removing a rather shaky assumption about which utility commands might
do their own PopActiveSnapshot(). There's no known bug here, but now
that we're actively referencing the snapshot it's almost free to make
this code a bit more bulletproof.
We might want to consider back-patching something like this into older
branches, but it would be prudent to let it prove itself more in HEAD
beforehand.
Discussion: <87vazemeda.fsf@credativ.de>
| |
Omission noted by Andres Freund.
| |
Since a892234 & fd31cd265 the visibilitymap's freeze bit is used to
avoid vacuuming the whole relation in anti-wraparound vacuums. Doing so
correctly relies on not adding xids to the heap without also unsetting
the visibilitymap flag. Tuple locking related code has not done so.
To avoid pessimizing heap_lock_tuple, allow the all-frozen bit to be
reset selectively with visibilitymap_clear(). To avoid having to use
visibilitymap_get_status (e.g. via VM_ALL_FROZEN) inside a critical
section, have visibilitymap_clear() return whether any bits have been
reset.
There's a remaining issue (denoted by XXX): After the PageIsAllVisible()
check in heap_lock_tuple() and heap_lock_updated_tuple_rec() the page
status could theoretically change. Practically that currently seems
impossible, because updaters will hold a page level pin already. Due to
the next beta coming up, it seems better to get the required WAL magic
bump done before resolving this issue.
The flags fields added to xl_heap_lock and xl_heap_lock_updated
require bumping the WAL magic. Since there's already been a catversion
bump since the last beta, that's not an issue.
Reviewed-By: Robert Haas, Amit Kapila and Andres Freund
Author: Masahiko Sawada, heavily revised by Andres Freund
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: -
| |
We have, for a very long time, allowed the same subplan (same member of the
PlannedStmt.subplans list) to be referenced by more than one SubPlan node;
this avoids problems for cases such as subplans within an IndexScan's
indxqual and indxqualorig fields. However, EXPLAIN had not gotten the memo
and would print each reference as though it were an independent identical
subplan. To fix, track plan_ids of subplans we've printed and don't print
the same plan_id twice. Per report from Pavel Stehule.
BTW: the particular case of IndexScan didn't cause visible duplication
in a plain EXPLAIN, only EXPLAIN ANALYZE, because in the former case we
short-circuit executor startup before the indxqual field is processed by
ExecInitExpr. That seems like it could easily lead to other EXPLAIN
problems in future, but it's not clear how to avoid it without breaking
the "EXPLAIN a plan using hypothetical indexes" use-case. For now I've
left that issue alone.
Although this is a longstanding bug, it's purely cosmetic (no great harm
is done by the repeat printout) and we haven't had field complaints before.
So I'm hesitant to back-patch it, especially since there is some small risk
of ABI problems due to the need to add a new field to ExplainState.
In passing, rearrange order of fields in ExplainState to be less random,
and update some obsolete comments about when/where to initialize them.
Report: <CAFj8pRAimq+NK-menjt+3J4-LFoodDD8Or6=Lc_stcFD+eD4DA@mail.gmail.com>
| |
Previously, workers sent data to the leader using the client encoding.
That mostly worked, but the leader then converted the data back to the
server encoding. Since not all encoding conversions are reversible,
that could provoke failures. Fix by using the database encoding for
all communication between worker and leader.
Also, while temporary changes to GUC settings, as from the SET clause
of a function, are in general OK for parallel query, changing
client_encoding this way inside of a parallel worker is not OK.
Previously, that would have confused the leader; with these changes,
it would not confuse the leader, but it wouldn't do anything either.
So refuse such changes in parallel workers.
Also, the previous code naively assumed that when it received a
NotifyResponse from the worker, it could pass that directly back to the
user. But now that worker-to-leader communication always uses the
database encoding, that's clearly no longer correct - though,
actually, the old way was always broken for V2 clients. So
disassemble and reconstitute the message instead.
Issues reported by Peter Eisentraut. Patch by me, reviewed by
Peter Eisentraut.
| |
In non-text output formats, parallelized aggregates were reporting
"Partial" or "Finalize" as a field named "Operation", which might be all
right in the absence of any context --- but other plan node types use that
field to report SQL-visible semantics, such as Select/Insert/Update/Delete.
So that naming choice didn't seem good to me. I changed it to "Partial
Mode".
Also, the field did not appear at all for a non-parallelized Agg plan node,
which is contrary to expectation in non-text formats. We're notionally
producing objects that conform to a schema, so the set of fields for a
given node type and EXPLAIN mode should be well-defined. I set it up to
fill in "Simple" in such cases.
Other fields that were added for parallel query, namely "Parallel Aware"
and Gather's "Single Copy", had not gotten the word on that point either.
Make them appear always in non-text output.
Also, the latter two fields were nominally producing boolean output, but
were getting it wrong, because bool values shouldn't be quoted in JSON or
YAML. Somehow we'd not needed an ExplainPropertyBool formatting subroutine
before 9.6; but now we do, so invent it.
Discussion: <16002.1466972724@sss.pgh.pa.us>
| |
Previously, these commands always planned the given query and went through
executor startup before deciding not to actually run the query if WITH NO
DATA is specified. This behavior is problematic for pg_dump because it
may cause errors to be raised that we would rather not see before a
REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907
from Marian Krucina. This change is not sufficient to fix that particular
bug, because we also need to tweak pg_dump to issue the REFRESH later,
but it's a necessary step on the way.
A user-visible side effect of doing things this way is that the returned
command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW"
or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior
but it would take more code, and arguably that was just an implementation
artifact not intended behavior anyhow.
In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which
was trouble waiting to happen; there is not any prohibition on nested
CREATE commands.
Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced.
Michael Paquier and Tom Lane
Report: <20160202161407.2778.24659@wrigleys.postgresql.org>
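For example (names are hypothetical):

    -- No longer planned and pushed through executor startup; note the command
    -- tag is now "CREATE MATERIALIZED VIEW" rather than "SELECT 0":
    CREATE MATERIALIZED VIEW mv AS SELECT * FROM base_table WITH NO DATA;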
| |
The original coding had three separate booleans representing partial
aggregation behavior, which was confusing, unreadable, and error-prone,
not least because the booleans weren't always listed in the same order.
It was also inadequate for the allegedly-desirable future extension to
support intermediate partial aggregation, because we'd need separate
markers for serialization and deserialization in such a case.
Merge these bools into an enum "AggSplit" to provide symbolic names for
the supported operating modes (and document what those are). By assigning
the values of the enum constants carefully, we can treat AggSplit values
as options bitmasks so that tests of what to do aren't noticeably more
expensive than before.
While at it, get rid of Aggref.aggoutputtype. That's not needed since
commit 59a3795c2 got rid of setrefs.c's special-purpose Aggref comparison
code, and it likewise seemed more confusing than helpful.
Assorted comment cleanup as well (there's still more that I want to do
in that line).
catversion bump for change in Aggref node contents. Should be the last
one for partial-aggregation changes.
Discussion: <29309.1466699160@sss.pgh.pa.us>