| |
When promoting a standby just after an XLOG_SWITCH record was replayed,
and the next segment(s) are already locally available (via walsender,
restore_command + trigger/recovery target), such a segment could
accidentally be recycled onto the past of the new timeline. Later the
checkpointer would create a .ready file for it, assuming there was an
error during creation, and it would get archived. That causes trouble
if another standby is later brought up from a basebackup taken before
the timeline creation: since XLogFileReadAnyTLI just tries all possible
timelines, the standby would try to read the segment, which doesn't
have valid contents, and replay would fail.
If the problem has already occurred, it can be fixed by removing the
segment and/or having restore_command filter it out.
The reason for the creation of such "phantom" segments is that after
an XLOG_SWITCH record the EndOfLog variable points to the beginning of
the next segment, while RemoveXlogFile() used XLByteToPrevSeg().
Normally RemoveXlogFile() doing so is harmless, because the last
segment still exists, preventing InstallXLogFileSegment() from
causing harm, but just after promotion there's no previous segment on
the new timeline.
Fix that by using XLByteToSeg() instead of XLByteToPrevSeg().
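For illustration, a minimal sketch (not code from this commit) of why the
two macros differ exactly at a segment boundary, using the pre-v11
two-argument macro forms, a 16MB segment size, and made-up values:

XLogRecPtr  EndOfLog = 0x3000000;  /* boundary right after XLOG_SWITCH */
XLogSegNo   segno;

XLByteToPrevSeg(EndOfLog, segno);  /* (EndOfLog - 1) / XLogSegSize = 2 */
XLByteToSeg(EndOfLog, segno);      /* EndOfLog / XLogSegSize = 3 */

With XLByteToSeg(), recycling starts at the segment that begins at
EndOfLog rather than the one before it, so nothing gets installed onto
the new timeline's past.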
Author: Andres Freund
Reported-By: Greg Burek
Discussion: https://postgr.es/m/20170619073026.zcwpe6mydsaz5ygd@alap3.anarazel.de
Backpatch: 9.2-, bug older than all supported versions
| |
Etsuro Fujita
| |
David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.
One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join. But we should not
really need to special-case those here. eqjoinsel() has never had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead. Let's just estimate such cases like inner joins and rely on that
later adjustment. (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.) Add a
regression test exercising such a case to show that this is sane in
at least some cases.
The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS. In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity. To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation. (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)
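To make the error concrete (with made-up numbers): for a two-column FK
whose columns each individually have a join selectivity of 1%, the
"smallest per-column selectivity" rule estimates 0.01, whereas if the
columns really are independent the joint selectivity is closer to
0.01 * 0.01 = 0.0001, a 100x overestimate that grows worse with each
additional FK column.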
Report and patch by David Rowley, simplified a bit by me. Back-patch
to 9.6 where this code was added.
Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
| |
When, during logical decoding, a transaction gets too big, its
contents get spilled to disk. Not just the top-level transaction gets
spilled, but *also* all of its subtransactions, even if they're not
that large themselves. Unfortunately we didn't clean up
such small spilled subtransactions from disk.
Fix that, by keeping better track of whether a transaction has been
spilled to disk.
Author: Andres Freund
Reported-By: Dmitriy Sarafannikov, Fabrízio de Royes Mello
Discussion:
https://postgr.es/m/1457621358.355011041@f382.i.mail.ru
https://postgr.es/m/CAFcNs+qNMhNYii4nxpO6gqsndiyxNDYV0S=JNq0v_sEE+9PHXg@mail.gmail.com
Backpatch: 9.4-, where logical decoding was introduced
| |
When a new base type is created using the old-style procedure of first
creating the input/output functions with "opaque" in place of the base
type, the "opaque" argument/return type is changed to the final base type,
on CREATE TYPE. However, we did not create a pg_depend record when doing
that, so the functions were left not depending on the type.
Fixes bug #14706, reported by Karen Huddleston.
Discussion: https://www.postgresql.org/message-id/20170614232259.1424.82774@wrigleys.postgresql.org
| |
We had three occurrences of essentially the same coding pattern
wherein we tried to retrieve a query result from a libpq connection
without blocking. In the case where PQconsumeInput failed (typically
indicating a lost connection), all three loops simply gave up and
returned, forgetting to clear any previously-collected PGresult
object. Since those are malloc'd not palloc'd, the oversight results
in a process-lifespan memory leak.
One instance, in libpqwalreceiver, is of little significance because
the walreceiver process would just quit anyway if its connection fails.
But we might as well fix it.
The other two instances, in postgres_fdw, are somewhat more worrisome
because at least in principle the scenario could be repeated, allowing
the amount of memory leaked to build up to something worth worrying
about. Moreover, in these cases the loops contain CHECK_FOR_INTERRUPTS
calls, as well as other calls that could potentially elog(ERROR),
providing another way to exit without having cleared the PGresult.
Here we need to add PG_TRY logic similar to what exists in quite a
few other places in postgres_fdw.
Coverity noted the libpqwalreceiver bug; I found the other two cases
by checking all calls of PQconsumeInput.
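For reference, a minimal sketch of the needed pattern (loosely modeled
on postgres_fdw's result-fetching loops; names and error handling are
simplified, and the wait-for-socket step is elided):

PGresult   *volatile last_res = NULL;

PG_TRY();
{
    for (;;)
    {
        PGresult   *res;

        while (PQisBusy(conn))
        {
            CHECK_FOR_INTERRUPTS();     /* can elog(ERROR) */
            /* ... wait for the connection's socket to be readable ... */
            if (!PQconsumeInput(conn))  /* typically a lost connection */
                elog(ERROR, "could not read from remote connection");
        }
        res = PQgetResult(conn);
        if (res == NULL)
            break;                      /* query is complete */
        PQclear(last_res);              /* discard any earlier result */
        last_res = res;
    }
}
PG_CATCH();
{
    PQclear(last_res);                  /* avoid the process-lifespan leak */
    PG_RE_THROW();
}
PG_END_TRY();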
Back-patch to all supported versions as appropriate (9.2 lacks
postgres_fdw, so this is really quite unexciting for that branch).
Discussion: https://postgr.es/m/22620.1497486981@sss.pgh.pa.us
| |
During pg_upgrade's restore run, all relfilenode choices should be
overridden by commands in the dump script. If we ever find ourselves
choosing a relfilenode in the ordinary way, someone blew it. Likewise for
pg_type OIDs. Since pg_upgrade might well succeed anyway, if there happens
not to be a conflict during the regression test run, we need assertions
here to keep us on the straight and narrow.
We might someday be able to remove the assertion in GetNewRelFileNode,
if pg_upgrade is rewritten to remove its assumption that old and new
relfilenodes always match. But it's hard to see how to get rid of the
pg_type OID constraint, since those OIDs are embedded in user tables
in some cases.
Back-patch as far as 9.5, because of the risk of back-patches breaking
something here even if it works in HEAD. I'd prefer to go back further,
but 9.4 fails both assertions due to get_rel_infos()'s use of a temporary
table. We can't use the later-branch solution of a CTE for compatibility
reasons (cf commit 5d16332e9), and it doesn't seem worth inventing some
other way to do the query. (I did check, by dint of changing the Asserts
to elog(WARNING), that there are no other cases of unwanted OID assignments
during 9.4's regression test run.)
Discussion: https://postgr.es/m/19785.1497215827@sss.pgh.pa.us
| |
Because walsender and normal backends share the same main loop, it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).
This bug has been present since the introduction of walsender, but it
has gotten worse in later releases, which allow walsender to do more.
A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.
Author: Petr Jelinek, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4nu7v@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0
| |
When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and
throws a PANIC if so. At that point, only walsenders are still
active, so one might think this could not happen, but walsenders can
also generate WAL, for instance in BASE_BACKUP and logical decoding
related commands (e.g. via hint bits). So they can trigger this panic
if such a command is run while the shutdown checkpoint is being
written.
To fix this, divide the walsender shutdown into two phases. First, the
checkpointer, itself triggered by the postmaster, sends a
PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If a walsender
is idle or running an SQL query, this causes it to shut down; if
logical replication is in progress, all existing WAL records are
processed and the walsender then shuts down. Otherwise the walsender
switches to the "stopping" state, in which it rejects any further
replication commands. The checkpointer begins the shutdown checkpoint
once all walsenders are confirmed as stopping. When the shutdown
checkpoint finishes, the postmaster sends the walsenders SIGUSR2,
instructing them to send any outstanding WAL, including the shutdown
checkpoint record, wait for it to be replicated to the standby, and
then exit.
Author: Andres Freund, based on an earlier patch by Michael Paquier
Reported-By: Fujii Masao, Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxpvs2@alap3.anarazel.de
Backpatch: 9.4, where logical decoding was introduced
| |
Walsenders' non-participation in the procsignal infrastructure was a
problem both for changes in master (e.g. parallelism not working for
normal statements run in walsender backends) and for older branches
(e.g. recovery conflicts and catchup interrupts not working for
logical decoding walsenders).
This commit thus replaces the previous WalSndXLogSendHandler with
procsignal_sigusr1_handler. In branches since db0f6cad48 that can
lead to additional SetLatch calls, but that only rarely seems to make
a difference.
Author: Andres Freund
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nckrow@alap3.anarazel.de
Backpatch: 9.4, earlier commits don't seem to benefit sufficiently
| |
Mark our rusage reportage string translatable; remove quotes from type
names; unify formatting of very similar messages.
| |
Previously this was not allowed, as copy.c didn't set the
CURSOR_OPT_PARALLEL_OK flag when planning the query. Set it.
While the lack of parallel query for COPY isn't strictly speaking a
bug, it does prevent parallelism from being used in a facility
commonly used to run long-running queries. Thus backpatch to 9.6.
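The fix itself is a one-liner; a sketch, assuming the 9.6-era
pg_plan_query(query, cursorOptions, boundParams) signature (the real
call site in copy.c may go through planner() directly):

/* pass CURSOR_OPT_PARALLEL_OK as the cursorOptions argument */
plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, NULL);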
Author: Andres Freund
Discussion: https://postgr.es/m/20170531231958.ihanapplorptykzm@alap3.anarazel.de
Backpatch: 9.6, where parallelism was introduced.
| |
Remove some gratuitous message differences by making the AM name
previously embedded in each message be a %s instead. While at it, get
rid of terminology that's unclear and unnecessary in one message.
Discussion: https://postgr.es/m/20170523001557.bq2hbq7hxyvyw62q@alvherre.pgsql
| |
Back-patch of commit 8b0b6303e991079726e83d17401405e94da11564.
Discussion: https://postgr.es/m/22173.1494788088@sss.pgh.pa.us
| |
The NumericOnly grammar production accepted ICONST, + ICONST, - ICONST,
FCONST, and - FCONST, but for some reason not + FCONST. This led to
strange inconsistencies like
regression=# set random_page_cost = +4;
SET
regression=# set random_page_cost = 4000000000;
SET
regression=# set random_page_cost = +4000000000;
ERROR: syntax error at or near "4000000000"
(because 4000000000 is too large to be an ICONST). While there's
no actual functional reason to need to write a "+", if we allow
it for integers it seems like we should allow it for numerics too.
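The resulting production in gram.y looks roughly like this (actions
elided; a sketch, not the exact diff):

NumericOnly:
        FCONST          { ... }
        | '+' FCONST    { ... }   /* the previously-missing case */
        | '-' FCONST    { ... }
        | SignedIconst  { ... }
    ;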
It's been like that forever, so back-patch to all supported branches.
Discussion: https://postgr.es/m/30908.1496006184@sss.pgh.pa.us
| |
Commit 9aa3c782c added code to allow CREATE TABLE/CREATE TYPE to not fail
when the desired type name conflicts with an autogenerated array type, by
dint of renaming the array type out of the way. But I (tgl) overlooked
that the same case arises in ALTER TABLE/TYPE RENAME. Fix that too.
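A hypothetical session illustrating the previously-failing case (table
names made up):

CREATE TABLE rock (id int);            -- also creates the array type "_rock"
CREATE TABLE pebble (id int);
ALTER TABLE pebble RENAME TO "_rock";  -- used to fail; now the autogenerated
                                       -- array type is renamed out of the way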
Back-patch to all supported branches.
Report and patch by Vik Fearing, modified a bit by me
Discussion: https://postgr.es/m/0f4ade49-4f0b-a9a3-c120-7589f01d1eb8@2ndquadrant.com
| |
This patch replaces isspace() calls with scanner_isspace() in functions
that are likely to be presented with non-ASCII input. isspace() has
the small advantage that it will correctly recognize no-break space
in single-byte encodings (such as LATIN1); but it cannot work successfully
for any multibyte character, and depending on platform it might return
false positive results for some fragments of multibyte characters. That's
disastrous for functions that are trying to discard whitespace between
valid strings, as noted in bug #14662 from Justin Muise. Even treating
no-break space as whitespace is pretty questionable for the usages touched
here, because the core scanner would think it is an identifier character.
Affected functions are parse_ident(), parseNameAndArgTypes (underlying
regprocedurein() and siblings), SplitIdentifierString (used for parsing
GUCs and options that are qualified names or lists of names), and
SplitDirectoriesString (used for parsing GUCs that are lists of
directories).
All the functions adjusted here are parsing SQL identifiers and similar
constructs, so it's reasonable to insist that their definition of
whitespace match the core scanner. So we can hope that this won't cause
many backwards-compatibility problems. I've left alone isspace() calls
in places that aren't really expecting any non-ASCII input characters,
such as float8in().
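For reference, scanner_isspace() (in src/backend/parser/scansup.c)
accepts only what the core scanner treats as whitespace, essentially:

bool
scanner_isspace(char ch)
{
    /* This must match the {space} character class in scan.l */
    if (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\r' || ch == '\f')
        return true;
    return false;
}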
Back-patch to all supported branches.
Discussion: https://postgr.es/m/10129.1495302480@sss.pgh.pa.us
| |
The cash_div_intX functions applied rint() to the result of the division.
That's not merely useless (because the result is already an integer) but
it causes precision loss for values larger than 2^52 or so, because of
the forced conversion to float8.
On the other hand, the cash_mul_fltX functions neglected to apply rint() to
their multiplication results, thus possibly causing off-by-one outputs.
Per C standard, arithmetic between any integral value and a float value is
performed in float format. Thus, cash_mul_flt4 and cash_div_flt4 produced
answers good to only about six digits, even when the float value is exact.
We can improve matters noticeably by widening the float inputs to double.
(It's tempting to consider using "long double" arithmetic if available,
but that's probably too much of a stretch for a back-patched fix.)
Also, document that cash_div_intX operators truncate rather than round.
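In rough pseudocode, the corrected arithmetic is (not the exact cash.c
code):

/* multiplication: widen float4 to double, then round to the nearest unit */
result = rint((double) f * c);

/* division by an integer: plain C integer division, which truncates */
result = c / i;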
Per bug #14663 from Richard Pistole. Back-patch to all supported branches.
Discussion: https://postgr.es/m/22403.1495223615@sss.pgh.pa.us
| |
Daniel Gustafsson
| |
This addresses the new warning types -Wformat-truncation and
-Wformat-overflow that are part of -Wall, via -Wformat, in GCC 7.
| |
The CommentStmt made by RebuildConstraintComment() has to pstrdup the
relation name, else it will contain a dangling pointer after that
relcache entry is flushed. (I'm less sure that pstrdup'ing conname
is necessary, but let's be safe.) Failure to do this leads to weird
errors or crashes, as reported by Marko Elezovic.
Bug introduced by commit e42375fc8, so back-patch to 9.5 as that was.
Fix by David Rowley, regression test by Michael Paquier
Discussion: https://postgr.es/m/DB6PR03MB30775D58E732D4EB0C13725B9AE00@DB6PR03MB3077.eurprd03.prod.outlook.com
| |
Before 955a684e0401, logical decoding snapshot maintenance needed to
cope with transactions it might not have seen in their entirety. For
such transactions we had to assume they modified the catalog (that
could have happened before we were watching), and thus a new snapshot
had to be built and distributed to concurrently running transactions.
That's problematic because building a new snapshot isn't that cheap,
especially as the array of committed transactions needs to be
sorted. When creating a slot on a server with a lot of transactions,
this could make logical slot creation infeasibly expensive.
After 955a684e0401 there's no need to deal with transactions that
aren't guaranteed to be fully observable. That allows us to avoid
building snapshots for transactions that haven't modified the catalog,
even before reaching consistency.
While this isn't necessarily a bugfix, slot creation being impossible
in some production workloads is severe enough to warrant
backpatching.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
| |
The snapshot assembly during the creation of logical slots relied on
waiting for transactions in xl_running_xacts to end, by checking for
their commit/abort records. Unfortunately, despite locking, it is
possible to see an xl_running_xacts record listing transactions as
running that have already WAL-logged a commit/abort record, because
the locking just prevents the ProcArray from being adjusted, and the
commit record has to be logged first.
That led to either delayed or hanging snapshot creation, because
snapbuild.c would wait "forever" to see commit/abort records for some
transactions. That hang resolved only if an xl_running_xacts record
without any running transactions happened to be logged, which is far
from certain on a busy server.
It's impractical to prevent that via more heavyweight locking; the
likelihood of deadlocks and significantly increased contention would
be too great.
Instead, change the initial snapshot creation to be based solely on
tracking the oldest running transaction via
xl_running_xacts->oldestRunningXid - that actually ends up
significantly simplifying the code. That has two disadvantages:
1) Because we cannot fully "trust" the contents of xl_running_xacts,
we cannot use it to build the initial snapshot. Instead we have to
wait twice for all running transactions to finish.
2) Previously a slot, unless the race occurred, could be created as
soon as all transactions perceived as running had been observed to
commit or abort; now we have to wait for the next xl_running_xacts
record. To address that, trigger logging of a new xl_running_xacts
record from within snapbuild.c exactly when necessary.
Unfortunately snapbuild.c's SnapBuild is stored on disk, one of the
stupider ideas of a certain Mr Freund, so we can't change it in a
minor release. As this is going to be backpatched, we have to hack
around a bit to keep on-disk compatibility. A later commit will
rejigger that on master.
Author: Andres Freund, based on a quite different patch from Petr Jelinek
Analyzed-By: Petr Jelinek
Reviewed-By: Petr Jelinek
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backpatch: 9.4-, where logical decoding has been introduced
| |
We have now grown enough registerable syscache-invalidation callback
functions that the original assumption that there would be few of them
is causing performance problems. In particular, let's fix things so that
CallSyscacheCallbacks doesn't have to search the whole array to find
which callback(s) to invoke for a given cache ID. Preserve the original
behavior that callbacks are called in order of registration, just in
case there's someplace that depends on that (which I doubt).
In support of this, export the number of syscaches from syscache.h.
People could have found that out anyway from the enum, but adding a
#define makes that much safer.
This provides a useful additional speedup in Mathieu Fenniak's
logical-decoding test case, although we're reaching the point of
diminishing returns there. I think any further improvement will have
to come from reducing the number of cache invalidations that are
triggered in the first place. Still, we can hope that this change
gives some incremental benefit for all invalidation scenarios.
Back-patch to 9.4 where logical decoding was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
| |
A test case provided by Mathieu Fenniak shows that hash_seq_search'ing
this hashtable can consume a very significant amount of overhead during
logical decoding, which triggers frequent cache invalidation. Testing
suggests that the actual population of the hashtable is often no more
than a few dozen entries, so we can cut the overhead just by dropping
the initial number of buckets down from 1024 --- I chose to cut it to 64.
(In situations where we do have a significant number of entries, we
shouldn't get any real penalty from doing this, as the dynahash.c code
will resize the hashtable automatically.)
This gives a further factor-of-two savings in Mathieu's test case.
That may be overly optimistic for real-world benefit, as real cases
may have larger average table populations, but it's hard to see it
turning into a net negative for any workload.
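The change itself is just the initial-size argument of hash_create();
roughly (ctl filled in as before, and flags as in the 9.5+ branches,
where HASH_BLOBS exists):

RelfilenodeMapHash =
    hash_create("RelfilenodeMap cache", 64 /* was 1024 */, &ctl,
                HASH_ELEM | HASH_BLOBS);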
Back-patch to 9.4 where relfilenodemap.c was introduced.
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
| |
A test case provided by Mathieu Fenniak shows that the initial search for
the target catcache in CatalogCacheIdInvalidate consumes a very significant
amount of overhead in cases where cache invalidation is triggered but has
little useful work to do. There is no good reason for that search to exist
at all, as the index array maintained by syscache.c allows direct lookup of
the catcache from its ID. We just need a frontend function in syscache.c,
matching the division of labor for most other cache-accessing operations.
While there's more that can be done in this area, this patch alone reduces
the runtime of Mathieu's example by 2X. We can hope that it offers some
useful benefit in other cases too, although usually cache invalidation
overhead is not such a striking fraction of the total runtime.
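A sketch of the new frontend function (simplified from syscache.c; the
direct array lookup replaces the old linear search over all catcaches):

void
SysCacheInvalidate(int cacheId, uint32 hashValue)
{
    if (cacheId < 0 || cacheId >= SysCacheSize)
        elog(ERROR, "invalid cache ID: %d", cacheId);
    CatCacheInvalidate(SysCache[cacheId], hashValue);
}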
Back-patch to 9.4 where logical decoding was introduced. It might be
worth going further back, but presently the only case we know of where
cache invalidation is really a significant burden is in logical decoding.
Also, older branches have fewer catcaches, reducing the possible benefit.
(Note: although this nominally changes catcache's API, we have always
documented CatalogCacheIdInvalidate as a private function, so I would
have little sympathy for an external module calling it directly. So
backpatching should be fine.)
Discussion: https://postgr.es/m/CAHoiPjzea6N0zuCi=+f9v_j94nfsy6y8SU7-=bp4=7qw6_i=Rg@mail.gmail.com
| |
Increase from the historical value of 32 to 64. We are up to 31 callers
of CacheRegisterSyscacheCallback() in HEAD, so if they were all to be
exercised in one process that would leave only one slot for add-on modules.
It's probably not possible for that to happen, but still we clearly need
more daylight here. (At some point it might be worth making the array
dynamically resizable; but since we've never heard a complaint of "out of
syscache_callback_list slots" happening in the field, I doubt it's worth
it yet.)
Back-patch as far as 9.4, which is where we increased the companion limit
MAX_RELCACHE_CALLBACKS (cf commit f01d1ae3a). It's not as urgent in
released branches, which have only a couple dozen call sites in core, but
it still seems that somebody might hit the limit before these branches die.
Discussion: https://postgr.es/m/12184.1494450131@sss.pgh.pa.us
| |
Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8,
not of the type of the column the statistics are for.
This bug is at least partly the fault of sloppy specification comments
for get_attstatsslot()/free_attstatsslot(): the type OID they want is that
of the stavalues entries, not of the underlying column. (I double-checked
other callers and they seem to get this right.) Adjust the comments to be
more correct.
Per buildfarm.
Security: CVE-2017-7484
| |
Oversight in e2d4ef8de et al (my fault not Peter's). Per buildfarm.
Security: CVE-2017-7484
| |
Both views replace the umoptions field with NULL when the user does not
meet the qualifications to see it. They used different qualifications,
and pg_user_mappings' documented qualifications did not match its
implemented qualifications. Make its documentation and implementation
match those
of user_mapping_options. One might argue for stronger qualifications,
but these have long, documented tenure. pg_user_mappings has always
exhibited this problem, so back-patch to 9.2 (all supported versions).
Michael Paquier and Feike Steenbergen. Reviewed by Jeff Janes.
Reported by Andrew Wheelwright.
Security: CVE-2017-7486
| |
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: dd6f9ed0d9d7b33d328761dcdd3a70f44aaff6ff
| |
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables. Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof. If neither is satisfied, planning
will proceed as if there are no statistics available.
At least one of these is satisfied in most cases in practice. The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.
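A sketch of the gate this adds in selfuncs.c before any user-supplied
operator is run over pg_statistic values (simplified; the check is
factored into statistic_proc_security_check()):

if (!statistic_proc_security_check(vardata, opfuncoid))
    return false;       /* act as though no statistics are available */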
Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Security: CVE-2017-7484
| |
When we add the SELECT-privilege based policies to the RLS with check
options (such as for an UPDATE statement, or when we have INSERT ...
RETURNING), we need to be sure to use the 'USING' case if the policy is
actually an 'ALL' policy (which could have both a USING clause and an
independent WITH CHECK clause).
This could result in policies acting differently when built using ALL
(when the ALL had both USING and WITH CHECK clauses) and when building
the policies independently as SELECT and UPDATE policies.
Fix this by adding an explicit boolean to add_with_check_options() to
indicate when the USING policy should be used, even if the policy has
both USING and WITH CHECK policies on it.
Reported by: Rod Taylor
Back-patch to 9.5 where RLS was introduced.
| |
In tableforest false mode, cursor_to_xml only produced <row> elements
but no wrapping <table> element.
By contrast, cursor_to_xmlschema produced a schema that is now correct
but did not previously match the XML data produced by cursor_to_xml.
In passing, also fix a minor misunderstanding about moving cursors in
the tests related to this.
Reported-by: filip@jirsak.org
Based-on-patch-by: Thomas Munro <thomas.munro@enterprisedb.com>
| |
GiST's getNextNearest() function attempts to pfree the previously-returned
tuple if any (that is, scan->xs_hitup in HEAD, or scan->xs_itup in older
branches). However, if we are rescanning a plan node after ending a
previous scan early, those tuple pointers could be pointing to garbage,
because they would be pointing into the scan's pageDataCxt or queueCxt
which has been reset. In a debug build this reliably results in a crash,
although I think it might sometimes accidentally fail to fail in
production builds.
To fix, clear the pointer field anyplace we reset a context it might
be pointing into. This may be overkill --- I think probably only the
queueCxt case is involved in this bug, so that resetting in gistrescan()
would be sufficient --- but dangling pointers are generally bad news,
so let's avoid them.
Another plausible answer might be to just not bother with the pfree in
getNextNearest(). The reconstructed tuples would go away anyway in the
context resets, and I'm far from convinced that freeing them a bit earlier
really saves anything meaningful. I'll stick with the original logic in
this patch, but if we find more problems in the same area we should
consider that approach.
Per bug #14641 from Denis Smirnov. Back-patch to 9.5 where this
logic was introduced.
Discussion: https://postgr.es/m/20170504072034.24366.57688@wrigleys.postgresql.org
| |
Due to a missing CommandCounterIncrement() call, parsing of a non-utility
command in an extension script would not see the effects of the immediately
preceding DDL command, unless that command's execution ends with
CommandCounterIncrement() internally ... which some do but many don't.
Report by Philippe Beaudoin, diagnosis by Julien Rouhaud.
Rather remarkably, this bug has evaded detection since extensions were
invented, so back-patch to all supported branches.
Discussion: https://postgr.es/m/2cf7941e-4e41-7714-3de8-37b1a8f74dff@free.fr
| |
Currently, trying to validate a NO INHERIT constraint on the parent will
search for the constraint in child tables (where it is not supposed to
exist), wrongly causing a "constraint does not exist" error.
Amit Langote, per a report from Hans Buschmann.
Discussion: http://postgr.es/m/20170421184012.24362.19@wrigleys.postgresql.org
| |
Logical decoding stores historical snapshots on disk, so that logical
decoding can restart without having to reconstruct a snapshot from
scratch (for which the resources are not guaranteed to be present
anymore). These serialized snapshots were also used when creating a
new slot via the walsender interface, which can export a "full"
snapshot (i.e. one that can read all tables, not just catalog ones).
The problem is that the serialized snapshots are only useful for
catalogs and not for normal user tables. Thus the use of such a
serialized snapshot could result in an inconsistent snapshot being
exported, which could lead to queries returning wrong data. This
would only happen if logical slots are created while another logical
slot already exists.
Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/f37e975c-908f-858e-707f-058d3b1eb214@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
| |
Commit fa31b6f4e supposed that we didn't have to worry about that
anymore, but it seems that RHEL5 is like that, and that's still
a supported platform. Put back the prior coding under an #ifdef,
adding an explicit fcntl() to retain the desired CLOEXEC property.
Discussion: https://postgr.es/m/12307.1493325329@sss.pgh.pa.us
| |
The logical decoding machinery already preserved all the required
catalog tuples, which is sufficient in the course of normal logical
decoding, but did not guarantee that non-catalog tuples were preserved
during computation of the initial snapshot when creating a slot over
the replication protocol.
This could cause a corrupted initial snapshot to be exported. The
time window for issues is usually not terribly large, but on a busy
server it's perfectly possible to hit it. Ongoing decoding is not
affected by this bug.
To avoid increased overhead for the SQL API, only retain additional
tuples when a logical slot is being created over the replication
protocol. To do so this commit changes the signature of
CreateInitDecodingContext(), but it seems unlikely that it's being
used in an extension, so that's probably ok.
In a drive-by fix, fix handling of
ReplicationSlotsComputeRequiredXmin's already_locked argument, which
should only apply to ProcArrayLock, not ReplicationSlotControlLock.
Reported-By: Erik Rijkers
Analyzed-By: Petr Jelinek
Author: Petr Jelinek, heavily editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/9a897b86-46e1-9915-ee4c-da02e4ff6a95@2ndquadrant.com
Backport: 9.4, where logical decoding was introduced.
| |
Although the postmaster doesn't currently create a self-pipe or any
latches, there's discussion of it doing so in future. It's also
conceivable that a shared_preload_libraries extension would try to
create such a thing in the postmaster process today. In that case
the self-pipe FDs would be inherited by forked child processes.
latch.c was entirely unprepared for such a case and could suffer an
assertion failure, or worse try to use the inherited pipe if somebody
called WaitLatch without having called InitializeLatchSupport in that
process. Make it keep track of whether InitializeLatchSupport has been
called in the *current* process, and do the right thing if state has
been inherited from a parent.
Apply FD_CLOEXEC to file descriptors created in latch.c (the self-pipe,
as well as epoll event sets). This ensures that child processes spawned
in backends, the archiver, etc cannot accidentally or intentionally mess
with these FDs. It also ensures that we end up with the right state
for the self-pipe in EXEC_BACKEND processes, which otherwise wouldn't
know to close the postmaster's self-pipe FDs.
Back-patch to 9.6, mainly to keep latch.c looking similar in all branches
it exists in.
Discussion: https://postgr.es/m/8322.1493240739@sss.pgh.pa.us
| |
Previously, maybe_start_bgworker() would launch at most one bgworker
process per call, on the grounds that the postmaster might otherwise
neglect its other duties for too long. However, that seems overly
conservative, especially since bad effects only become obvious when
many hundreds of bgworkers need to be launched at once. On the other
side of the coin is that the existing logic could result in substantial
delay of bgworker launches, because ServerLoop isn't guaranteed to
iterate immediately after a signal arrives. (My attempt to fix that
by using pselect(2) encountered too many portability question marks,
and in any case could not help on platforms without pselect().)
One could also question the wisdom of using an O(N^2) processing
method if the system is intended to support so many bgworkers.
As a compromise, allow that function to launch up to 100 bgworkers
per call (and in consequence, rename it to maybe_start_bgworkers).
This will allow any normal parallel-query request for workers
to be satisfied immediately during sigusr1_handler, avoiding the
question of whether ServerLoop will be able to launch more promptly.
There is talk of rewriting the postmaster to use a WaitEventSet to
avoid the signal-response-delay problem, but I'd argue that this change
should be kept even after that happens (if it ever does).
Backpatch to 9.6 where parallel query was added. The issue exists
before that, but previous uses of bgworkers typically aren't as
sensitive to how quickly they get launched.
Discussion: https://postgr.es/m/4707.1493221358@sss.pgh.pa.us
|
| |
loop."
This reverts commit b9515b62879722e3497236f6e0d6783c3fc059a2.
Buildfarm results suggest that some platforms have versions of pselect(2)
that are not merely non-atomic, but flat out non-functional. Revert the
use-pselect patch to confirm this diagnosis (and exclude the no-SA_RESTART
patch as the source of trouble). If it's so, we should probably look into
blacklisting specific platforms that have broken pselect.
Discussion: https://postgr.es/m/9696.1493072081@sss.pgh.pa.us
| |
Traditionally we've unblocked signals, called select(2), and then blocked
signals again. The code expects that the select() will be cancelled with
EINTR if an interrupt occurs; but there's a race condition, which is that
an already-pending signal will be delivered as soon as we unblock, and then
when we reach select() there will be nothing preventing it from waiting.
This can result in a long delay before we perform any action that
ServerLoop was supposed to have taken in response to the signal. As with
the somewhat-similar symptoms fixed by commit 893902085, the main practical
problem is slow launching of parallel workers. The window for trouble is
usually pretty short, corresponding to one iteration of ServerLoop; but
it's not negligible.
To fix, use pselect(2) in place of select(2) where available, as that's
designed to solve exactly this problem. Where not available, we continue
to use the old way, and are no worse off than before.
pselect(2) has been required by POSIX since about 2001, so most modern
platforms should have it. A bigger portability issue is that some
implementations are said to be non-atomic, ie pselect() isn't really
any different from unblock/select/reblock. Still, we're no worse off
than before on such a platform.
There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point this
could be reverted, since we'd be using a self-pipe to solve the race
condition. But that's not happening before v11 at the earliest.
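The essential idiom (names follow postmaster.c's ServerLoop; error
handling elided) keeps signals blocked except during the wait itself,
with the unblock and the wait performed atomically:

sigset_t        emptyset;
struct timespec timeout = {60, 0};
int             selres;

sigemptyset(&emptyset);
/* atomically: install the empty mask, wait, then restore the old mask */
selres = pselect(nSockets, &rmask, NULL, NULL, &timeout, &emptyset);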
Back-patch to 9.6. The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.
Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
| |
The postmaster keeps signals blocked everywhere except while waiting
for something to happen in ServerLoop(). The code expects that the
select(2) will be cancelled with EINTR if an interrupt occurs; without
that, followup actions that should be performed by ServerLoop() itself
will be delayed. However, some platforms interpret the SA_RESTART
signal flag as meaning that they should restart rather than cancel
the select(2). Worse yet, some of them restart it with the original
timeout delay, meaning that a steady stream of signal interrupts can
prevent ServerLoop() from iterating at all if there are no incoming
connection requests.
Observable symptoms of this, on an affected platform such as HPUX 10,
include extremely slow parallel query startup (possibly as much as
30 seconds) and failure to update timestamps on the postmaster's sockets
and lockfiles when no new connections arrive for a long time.
We can fix this by running the postmaster's signal handlers without
SA_RESTART. That would be quite a scary change if the range of code
where signals are accepted weren't so tiny, but as it is, it seems
safe enough. (Note that postmaster children do, and must, reset all
the handlers before unblocking signals; so this change should not
affect any child process.)
There is talk of rewriting the postmaster to use a WaitEventSet and
not do signal response work in signal handlers, at which point it might
be appropriate to revert this patch. But that's not happening before
v11 at the earliest.
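In plain sigaction() terms the change amounts to registering the
postmaster's handlers like this (an illustrative sketch; PostgreSQL
actually wraps signal setup in pqsignal(), and SIGHUP_handler is
postmaster.c's reload handler):

struct sigaction act;

act.sa_handler = SIGHUP_handler;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;               /* no SA_RESTART: let select(2) fail with EINTR */
sigaction(SIGHUP, &act, NULL);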
Back-patch to 9.6. The problem exists much further back, but the
worst symptom arises only in connection with parallel query, so it
does not seem worth taking any portability risks in older branches.
Discussion: https://postgr.es/m/9205.1492833041@sss.pgh.pa.us
| |
This corner case didn't behave nicely at all: the postmaster would
(partially) update its state as though the process had started
successfully, and be quite confused thereafter. Fix it to act
like the worker had crashed, instead.
In passing, refactor so that do_start_bgworker contains all the
state-change logic for bgworker launch, rather than just some of it.
Back-patch as far as 9.4. 9.3 contains similar logic, but it's just
enough different that I don't feel comfortable applying the patch
without more study; and the use of bgworkers in 9.3 was so small
that it doesn't seem worth the extra work.
transam/parallel.c is still entirely unprepared for the possibility
of bgworker startup failure, but that seems like material for a
separate patch.
Discussion: https://postgr.es/m/4905.1492813727@sss.pgh.pa.us
| |
This seems to be largely cosmetic, avoiding valgrind bleats and the
like. The uninitialized padding influences the CRC of the on-disk
entry, but because it's also used when verifying the CRC, that doesn't
cause spurious failures. Backpatch nonetheless.
It's a bit unfortunate that contrib/test_decoding/sql/replorigin.sql
doesn't exercise the checkpoint path, but checkpoints are fairly
expensive on weaker machines, and we'd have to stop/start for that to
be meaningful.
Author: Andres Freund
Discussion: https://postgr.es/m/20170422183123.w2jgiuxtts7qrqaq@alap3.anarazel.de
Backpatch: 9.5, where replication origins were introduced
| |
ProcessTwoPhaseBuffer (formerly StandbyRecoverPreparedTransactions)
mixed up the parent and child XIDs when calling SubTransSetParent to
record the transactions' relationship in pg_subtrans.
Remarkably, analysis by Simon Riggs suggests that this doesn't lead to
visible problems (at least, not in non-Assert builds). That might
explain why we'd not noticed it before. Nonetheless, it's surely wrong.
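In terms of SubTransSetParent(xid, parent), the mix-up amounts to this
(a sketch, not the exact diff):

SubTransSetParent(xid, subxid);    /* wrong: records subxid as xid's parent */
SubTransSetParent(subxid, xid);    /* right: xid is the parent of subxid */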
This code was born broken, so back-patch to all supported branches.
Discussion: https://postgr.es/m/20110.1492905318@sss.pgh.pa.us
| |
The POSIX standard does not say that the success return value for
fcntl(F_SETFD) and fcntl(F_SETFL) is zero; it says only that it's not -1.
We had several calls that were making the stronger assumption. Adjust
them to test specifically for -1 for strict spec compliance.
The standard further leaves open the possibility that the O_NONBLOCK
flag bit is not the only active one in F_SETFL's argument. Formally,
therefore, one ought to get the current flags with F_GETFL and store
them back with only the O_NONBLOCK bit changed when trying to change
the nonblock state. In port/noblock.c, we were doing the full pushup
in pg_set_block but not in pg_set_noblock, which is just weird. Make
both of them do it properly, since they have little business making
any assumptions about the socket they're handed. The other places
where we're issuing F_SETFL are working with FDs we just got from
pipe(2), so it's reasonable to assume the FDs' properties are all
default, so I didn't bother adding F_GETFL steps there.
Also, while pg_set_block deserves some points for trying to do things
right, somebody had decided that it'd be even better to cast fcntl's
third argument to "long". Which is completely loony, because POSIX
clearly says the third argument for an F_SETFL call is "int".
Given the lack of field complaints, these missteps apparently are not
of significance on any common platforms. But they're still wrong,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/30882.1492800880@sss.pgh.pa.us
| |
We were applying the use_physical_tlist optimization to all relation
scan plans, even those implemented by custom scan providers. However,
that's a bad idea for a couple of reasons. The custom provider might
be unable to provide columns that it hadn't expected to be asked for
(for example, the custom scan might depend on an index-only scan).
Even more to the point, there's no good reason to suppose that this
"optimization" is a win for a custom scan; whatever the custom provider
is doing is likely not based on simply returning physical heap tuples.
(As a counterexample, if the custom scan is an interface to a column store,
demanding all columns would be a huge loss.) If it is a win, the custom
provider could make that decision for itself and insert a suitable
pathtarget into the path, anyway.
Per discussion with Dmitry Ivanov. Back-patch to 9.5 where custom scan
support was introduced. The argument that the custom provider can adjust
the behavior by changing the pathtarget only applies to 9.6+, but on
balance it seems more likely that use_physical_tlist will hurt custom
scans than help them.
Discussion: https://postgr.es/m/e29ddd30-8ef9-4da5-a50b-2bb7b8c7198d@postgrespro.ru