| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
This oversight could cause logical decoding to fail to decode an outer
transaction containing changes, if a subtransaction had an XID but no
actual changes. Per bug #14279 from Marko Tiikkaja. Patch by Marko
based on analysis by Andrew Gierth.
Discussion: <20160804191757.1430.39011@wrigleys.postgresql.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In _bt_unlink_halfdead_page(), we might fail to find an immediate left
sibling of the target page, perhaps because of corruption of the page
sibling links. The code intends to cope with this by just abandoning
the deletion attempt; but what actually happens is that it fails outright
due to releasing the same buffer lock twice. (And error recovery masks
a second problem, which is possible leakage of a pin on another page.)
Seems to have been introduced by careless refactoring in commit efada2b8e.
Since there are multiple cases to consider, let's make releasing the buffer
lock in the failure case the responsibility of _bt_unlink_halfdead_page()
not its caller.
Also, avoid fetching the leaf page's left-link again after we've dropped
lock on the page. This is probably harmless, but it's not exactly good
coding practice.
Per report from Kyotaro Horiguchi. Back-patch to 9.4 where the faulty code
was introduced.
Discussion: <20160803.173116.111915228.horiguchi.kyotaro@lab.ntt.co.jp>
|
|
|
|
|
|
|
| |
This is required for the result to be a legal tsvector value.
Noted while fooling with Andreas Seltenreich's ts_delete() crash.
Discussion: <87invhoj6e.fsf@credativ.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Such cases either failed an Assert, or produced a corrupt tsvector in
non-Assert builds, as reported by Andreas Seltenreich. The reason is
that tsvector_delete_by_indices() just assumed that its input array had
no duplicates. Fix by explicitly de-duping.
In passing, improve some comments, and fix a number of tests for null
values to use ERRCODE_NULL_VALUE_NOT_ALLOWED not
ERRCODE_INVALID_PARAMETER_VALUE.
Discussion: <87invhoj6e.fsf@credativ.de>
|
|
|
|
|
| |
Messed up by recent commits --- this is annoying me while trying to fix
some bugs here.
|
|
|
|
|
|
|
|
|
| |
tqual.h is included in some front-end compiles, and a static inline
breaks on buildfarm member castoroides. Since the macro is never
referenced, it should dodge that problem, although this doesn't
seem like the cleanest way of hiding things from front-end compiles.
Report and review by Tom Lane; patch by me.
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned in its commit message, eca0f1db left open a race condition,
where a page could be marked all-visible, after the code checked
PageIsAllVisible() to pin the VM, but before the page is locked. Plug
that hole.
Reviewed-By: Robert Haas, Andres Freund
Author: Amit Kapila
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: -
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some conditions resulted in "return" directly out of a PG_TRY block,
which left the exception stack dangling, and to add insult to injury
failed to restore the state of set_latch_on_sigusr1.
This is a bug only in 9.5; in HEAD it was accidentally fixed by commit
db0f6cad4, which removed the surrounding PG_TRY block. However, I (tgl)
chose to apply the patch to HEAD as well, because the old coding was
gratuitously different from WaitForBackgroundWorkerStartup(), and there
would indeed have been no bug if it were done like that to start with.
Dmitry Ivanov
Discussion: <1637882.WfYN5gPf1A@abook>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we tested for MVCC snapshots to see whether they were too
old, but not TOAST snapshots, which can lead to complaints about missing
TOAST chunks if those chunks are subject to early pruning. Ideally,
the threshold lsn and timestamp for a TOAST snapshot would be that of
the corresponding MVCC snapshot, but since we have no way of deciding
which MVCC snapshot was used to fetch the TOAST pointer, use the oldest
active or registered snapshot instead.
Reported by Andres Freund, who also sketched out what the fix should
look like. Patch by me, reviewed by Amit Kapila.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if an INSERT with multiple rows of VALUES had indirection
(array subscripting or field selection) in its target-columns list, the
parser handled that by applying transformAssignedExpr() to each element
of each VALUES row independently. This led to having ArrayRef assignment
nodes or FieldStore nodes in each row of the VALUES RTE. That works for
simple cases, but in bug #14265 Nuri Boardman points out that it fails
if there are multiple assignments to elements/fields of the same target
column. For such cases to work, rewriteTargetListIU() has to nest the
ArrayRefs or FieldStores together to produce a single expression to be
assigned to the column. But it failed to find them in the top-level
targetlist and issued an error about "multiple assignments to same column".
We could possibly fix this by teaching the rewriter to apply
rewriteTargetListIU to each VALUES row separately, but that would be messy
(it would change the output rowtype of the VALUES RTE, for example) and
inefficient. Instead, let's fix the parser so that the VALUES RTE outputs
are just the user-specified values, cast to the right type if necessary,
and then the ArrayRefs or FieldStores are applied in the top-level
targetlist to Vars representing the RTE's outputs. This is the same
parsetree representation already used for similar cases with INSERT/SELECT
syntax, so it allows simplifications in ruleutils.c, which no longer needs
to treat INSERT-from-multiple-VALUES as its own special case.
This implementation works by applying transformAssignedExpr to the VALUES
entries as before, and then stripping off any ArrayRefs or FieldStores it
adds. With lots of VALUES rows it would be noticeably more efficient to
not add those nodes in the first place. But that's just an optimization
not a bug fix, and there doesn't seem to be any good way to do it without
significant refactoring. (A non-invasive answer would be to apply
transformAssignedExpr + stripping to just the first VALUES row, and then
just forcibly cast remaining rows to the same data types exposed in the
first row. But this way would lead to different, not-INSERT-specific
errors being reported in casting failure cases, so it doesn't seem very
nice.) So leave that for later; this patch at least isn't making the
per-row parsing work worse, and it does make the finished parsetree
smaller, saving rewriter and planner work.
Catversion bump because stored rules containing such INSERTs would need
to change. Because of that, no back-patch, even though this is a very
long-standing bug.
Report: <20160727005725.7438.26021@wrigleys.postgresql.org>
Discussion: <9578.1469645245@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't want postmaster child processes to contain a copy of the
postmaster's PostmasterContext. That would be a waste of memory at least,
and at worst a security issue, since there are copies of the semi-sensitive
pg_hba and pg_ident data in there. All other child process types delete
the PostmasterContext after forking, but the original coding of the
background worker patch (commit da07a1e85) did not do so. It appears
that the only reason for that was to avoid copying the bgworker's
MyBgworkerEntry out of that context; but the couple of additional
statements needed to do so are hardly good justification for it. Hence,
copy that data and then clear the context as other child processes do.
Because this patch changes the memory context in which a bgworker function
gains control, back-patching it would be a bit risky, so we won't fix this
in back branches. The "security" complaint is pretty thin anyway for
generic bgworkers; only with the introduction of parallel query is there
any question of running untrusted code in a bgworker process.
Discussion: <14111.1470082717@sss.pgh.pa.us>
|
|
|
|
| |
From: Clément Prévost <prevostclement@gmail.com>
|
|
|
|
| |
Author: Amit Langote
|
|
|
|
|
|
|
|
|
|
|
| |
This is apparently harmless on Windows, but on Unix it results in an
assertion failure. We'd not noticed because this code doesn't get
used on Unix unless you build with -DEXEC_BACKEND. Bug was evidently
introduced by sloppy refactoring in commit 31c453165.
Thomas Munro
Discussion: <CAEepm=1VOnbVx4wsgQFvj94hu9jVt2nVabCr7QiooUSvPJXkgQ@mail.gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted by Alvaro, there are CHECK_FOR_INTERRUPTS() calls in the shm_mq.c
functions called by HandleParallelMessages(). I believe they're all
unreachable since we always pass nowait = true, but it doesn't seem like
a great idea to assume that no such call will ever be reachable from
HandleParallelMessages(). If that did happen, there would be a risk of a
recursive call to HandleParallelMessages(), which it does not appear to be
designed for --- for example, there's nothing that would prevent
out-of-order processing of received messages. And certainly such cases
cannot easily be tested. So let's prevent it by holding off interrupts for
the duration of the function. Back-patch to 9.5 which contains identical
code.
Discussion: <14869.1470083848@sss.pgh.pa.us>
|
|
|
|
|
| |
Setting it to 0 is probably not useful in practice, but it allows
testing of situations without available background worker slots.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ParallelMessagePending *must* be marked volatile, because it's set
by a signal handler. On the other hand, it's pointless for
HandleParallelMessageInterrupt to save/restore errno; that must be,
and is, done at the outer level of the SIGUSR1 signal handler.
Calling CHECK_FOR_INTERRUPTS() inside HandleParallelMessages, which itself
is called from CHECK_FOR_INTERRUPTS(), seems both useless and hazardous.
The comment claiming that this is needed to handle the error queue going
away is certainly misguided, in any case.
Improve a couple of error message texts, and use
ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE to report loss of parallel worker
connection, since that's what's used in e.g. tqueue.c. (Maybe it would be
worth inventing a dedicated ERRCODE for this type of failure? But I do not
think ERRCODE_INTERNAL_ERROR is appropriate.)
Minor stylistic cleanups.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This coding pattern creates a race condition, because if an interesting
interrupt happens after we've checked InterruptPending but before we reset
our latch, the latch-setting done by the signal handler would get lost,
and then we might block at WaitLatch in the next iteration without ever
noticing the interrupt condition. You can put the CHECK_FOR_INTERRUPTS
before WaitLatch or after ResetLatch, but not between them.
Aside from fixing the bugs, add some explanatory comments to latch.h
to perhaps forestall the next person from making the same mistake.
In HEAD, also replace gather_readnext's direct call of
HandleParallelMessages with CHECK_FOR_INTERRUPTS. It does not seem clean
or useful for this one caller to bypass ProcessInterrupts and go straight
to HandleParallelMessages; not least because that fails to consider the
InterruptPending flag, resulting in useless work both here
(if InterruptPending isn't set) and in the next CHECK_FOR_INTERRUPTS call
(if it is).
This thinko seems to have been introduced in the initial coding of
storage/ipc/shm_mq.c (commit ec9037df2), and then blindly copied into all
the subsequent parallel-query support logic. Back-patch relevant hunks
to 9.4 to extirpate the error everywhere.
Discussion: <1661.1469996911@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing record typmod remapping, tqueue.c did fresh catalog lookups
for each tuple it processed, which was pretty horrible performance-wise
(it seemed to about halve the already none-too-quick speed of bulk reads
in parallel mode). Worse, it insisted on putting bits of that data into
TopMemoryContext, from where it never freed them, causing a
session-lifespan memory leak. (I suppose this was coded with the idea
that the sender process would quit after finishing the query ---
but the receiver uses the same code.)
Restructure to avoid repetitive catalog lookups and to keep that data
in a query-lifespan context, in or below the context where the
TQueueDestReceiver or TupleQueueReader itself lives.
Fix some other bugs such as continuing to use a tupledesc after
releasing our refcount on it. Clean up cavalier datatype choices
(typmods are int32, please, not int, and certainly not Oid). Improve
comments and error message wording.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TupleQueueReaderNext() leaks like a sieve if it has to do any tuple
disassembly/reconstruction. While we could try to clean up its allocations
piecemeal, it seems like a better idea just to insist that it should be run
in a short-lived memory context, so that any transient space goes away
automatically. I chose to have nodeGather.c switch into its existing
per-tuple context before the call, rather than inventing a separate
context inside tqueue.c.
This is sufficient to stop all leakage in the simple case I exhibited
earlier today (see link below), but it does not deal with leaks induced
in more complex cases by tqueue.c's insistence on using TopMemoryContext
for data that it's not actually trying hard to keep track of. That issue
is intertwined with another major source of inefficiency, namely failure
to cache lookup results across calls, so it seems best to deal with it
separately.
In passing, improve some comments, and modify gather_readnext's method for
deciding when it's visited all the readers so that it's more obviously
correct. (I'm not actually convinced that the previous code *is*
correct in the case of a reader deletion; it certainly seems fragile.)
Discussion: <32763.1469821037@sss.pgh.pa.us>
|
|
|
|
| |
It's depressingly clear that nobody ever tested this.
|
|
|
|
| |
Michael Paquier
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we've nailed down the principle that NullTest with !argisrow
is fully equivalent to SQL's IS [NOT] DISTINCT FROM NULL, let's teach
the parser about it. This produces a slightly more compact parse tree
and is much more amenable to optimization than a DistinctExpr, since
the planner knows a good deal about NullTest and next to nothing about
DistinctExpr.
I'm not sure that there are all that many queries in the wild that could
be improved by this, but at least one source of such cases is the patch
just made to postgres_fdw to emit IS [NOT] DISTINCT FROM NULL when
IS [NOT] NULL isn't semantically correct.
No back-patch, since to the extent that this does affect planning results,
it might be considered undesirable plan destabilization.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commits 4452000f3 et al established semantics for NullTest.argisrow that
are a bit different from its initial conception: rather than being merely
a cache of whether we've determined the input to have composite type,
the flag now has the further meaning that we should apply field-by-field
testing as per the standard's definition of IS [NOT] NULL. If argisrow
is false and yet the input has composite type, the construct instead has
the semantics of IS [NOT] DISTINCT FROM NULL. Update the comments in
primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print
such cases correctly. In the case of ruleutils.c, this merely results in
cosmetic changes in EXPLAIN output, since the case can't currently arise
in stored rules. However, it represents a live bug for deparse.c, which
would formerly have sent a remote query that had semantics different
from the local behavior. (From the user's standpoint, this means that
testing a remote nested-composite column for null-ness could have had
unexpected recursive behavior much like that fixed in 4452000f3.)
In a related but somewhat independent fix, make plancat.c set argisrow
to false in all NullTest expressions constructed to represent "attnotnull"
constructs. Since attnotnull is actually enforced as a simple null-value
check, this is a more accurate representation of the semantics; we were
previously overpromising what it meant for composite columns, which might
possibly lead to incorrect planner optimizations. (It seems that what the
SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so
arguably we are violating the spec and should fix attnotnull to do the
other thing. If we ever do, this part should get reverted.)
Back-patch, same as the previous commit.
Discussion: <10682.1469566308@sss.pgh.pa.us>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The docs failed to explain that LIKE INCLUDING INDEXES would not preserve
the names of indexes and associated constraints. Also, it wasn't mentioned
that EXCLUDE constraints would be copied by this option. The latter
oversight seems enough of a documentation bug to justify back-patching.
In passing, do some minor copy-editing in the same area, and add an entry
for LIKE under "Compatibility", since it's not exactly a faithful
implementation of the standard's feature.
Discussion: <20160728151154.AABE64016B@smtp.hushmail.com>
|
|
|
|
|
|
|
|
|
|
| |
The keys are integers, not strings. The code accidentally worked on
little-endian machines, at least up to 256 distinct record types within
a session, but failed utterly on big-endian. This was unexpectedly
exposed by a test case added by commit 4452000f3, which apparently is the
only parallelizable query in the regression suite that uses more than one
anonymous record type. Fortunately, buildfarm member mandrill is
big-endian and is running with force_parallel_mode on, so it failed.
|
|
|
|
|
|
|
|
|
|
|
| |
cost_rescan assumed that we don't need to rebuild the hash table when
rescanning a hash join. However, that's currently only true for
single-batch joins; for a multi-batch join we must charge full freight.
This probably has escaped notice because we'd be unlikely to put a hash
join on the inside of a nestloop anyway. Nonetheless, it's wrong.
Fix in HEAD, but don't backpatch for fear of destabilizing plans in
stable releases.
|
|
|
|
|
|
|
| |
There's no point in consulting retval->paramMask; it's always NULL.
Instead, we should consult from->paramMask.
Reported by Andrew Gierth.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ExecMakeTableFunctionResult(), which is used in SELECT FROM function(...)
cases, formerly treated a simple NULL output from a function that both
returnsSet and returnsTuple as a violation of the SRF protocol. What seems
better is to treat a NULL output as equivalent to ROW(NULL,NULL,...).
Without this, cases such as SELECT FROM unnest(...) on an array of
composite are vulnerable to unexpected and not-very-helpful failures.
Old code comments here suggested an alternative of just ignoring
simple-NULL outputs, but that doesn't seem very principled.
This change had been hung up for a long time due to uncertainty about
how much we wanted to buy into the equivalence of simple NULL and
ROW(NULL,NULL,...). I think that's been mostly resolved by the discussion
around bug #14235, so let's go ahead and do it.
Per bug #7808 from Joe Van Dyk. Although this is a pretty old report,
fixing it smells a bit more like a new feature than a bug fix, and the
lack of other similar complaints suggests that we shouldn't take much risk
of destabilization by back-patching. (Maybe that could be revisited once
this patch has withstood some field usage.)
Andrew Gierth and Tom Lane
Report: <E1TurJE-0006Es-TK@wrigleys.postgresql.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, some functions returned various fixed strings and others
failed with a cache lookup error. Per discussion, standardize on
returning NULL. Although user-exposed "cache lookup failed" error
messages might normally qualify for bug-fix treatment, no back-patch;
the risk of breaking user code which is accustomed to the current
behavior seems too high.
Michael Paquier
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SQL standard appears to specify that IS [NOT] NULL's tests of field
nullness are non-recursive, ie, we shouldn't consider that a composite
field with value ROW(NULL,NULL) is null for this purpose.
ExecEvalNullTest got this right, but eval_const_expressions did not,
leading to weird inconsistencies depending on whether the expression
was such that the planner could apply constant folding.
Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be
used as a substitute test if a simple null check is wanted for a rowtype
argument. That motivated reordering things so that IS [NOT] DISTINCT FROM
is described before IS [NOT] NULL. In HEAD, I went a bit further and added
a table showing all the comparison-related predicates.
Per bug #14235. Back-patch to all supported branches, since it's certainly
undesirable that constant-folding should change the semantics.
Report and patch by Andrew Gierth; assorted wordsmithing and revised
regression test cases by me.
Report: <20160708024746.1410.57282@wrigleys.postgresql.org>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The aggfilter expression should be removed from the parent (combining)
Aggref, since it's not supposed to apply the filter, and indeed cannot
because any Vars used in the filter would not be available after the
lower-level aggregation step. Per report from Jeff Janes.
(This has been broken since the introduction of partial aggregation,
I think. The error became obvious after commit 59a3795c2, when setrefs.c
began processing the parent Aggref's fields normally and thus would detect
such Vars. The special-case coding previously used in setrefs.c skipped
over the parent's aggfilter field without processing it. That was broken
in its own way because no other setrefs.c processing got applied either;
though since the executor would not execute the filter expression, only
initialize it, that oversight might not have had any visible symptoms at
present.)
Report: <CAMkU=1xfuPf2edAe4ZGXTmJpU7jxuKukKyvNtEXwu35B7dvejg@mail.gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
These functions were added in commits fbe5a3fb7 and a104a017f,
but commit 45639a052 removed their only callers. Put the related
code in foreign.c back to the way it was in 9.5, to avoid pointless
cross-version diffs.
Etsuro Fujita
Patch: <d674a3f1-6b63-519c-ef3f-f3188ed6a178@lab.ntt.co.jp>
|
|
|
|
| |
Michael Paquier
|
|
|
|
|
|
|
|
|
|
| |
runtime.sgml used to contain a table of estimated shared memory consumption
rates for max_connections and some other GUCs. Commit 390bfc643 removed
that on the well-founded grounds that (a) we weren't maintaining the
entries well and (b) it no longer mattered so much once we got out from
under SysV shmem limits. But it missed that there were even-more-obsolete
versions of some of those numbers in comments in postgresql.conf.sample.
Remove those too. Back-patch to 9.3 where the aforesaid commit went in.
|
|
|
|
| |
Omission noted by Andres Freund.
|
|
|
|
| |
Antonin Houska
|
|
|
|
|
| |
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 3d71988dffd3c0798a8864c55ca4b7833b48abb1
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since a892234 & fd31cd265 the visibilitymap's freeze bit is used to
avoid vacuuming the whole relation in anti-wraparound vacuums. Doing so
correctly relies on not adding xids to the heap without also unsetting
the visibilitymap flag. Tuple locking related code has not done so.
To allow selectively resetting all-frozen - to avoid pessimizing
heap_lock_tuple - allow to selectively reset the all-frozen with
visibilitymap_clear(). To avoid having to use
visibilitymap_get_status (e.g. via VM_ALL_FROZEN) inside a critical
section, have visibilitymap_clear() return whether any bits have been
reset.
There's a remaining issue (denoted by XXX): After the PageIsAllVisible()
check in heap_lock_tuple() and heap_lock_updated_tuple_rec() the page
status could theoretically change. Practically that currently seems
impossible, because updaters will hold a page level pin already. Due to
the next beta coming up, it seems better to get the required WAL magic
bump done before resolving this issue.
The added flags field fields to xl_heap_lock and xl_heap_lock_updated
require bumping the WAL magic. Since there's already been a catversion
bump since the last beta, that's not an issue.
Reviewed-By: Robert Haas, Amit Kapila and Andres Freund
Author: Masahiko Sawada, heavily revised by Andres Freund
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: -
|
|
|
|
| |
Peter Geoghegan
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've broken this code path at least twice in the past, so it's prudent
to have a test case that covers it. To allow exercising the code path
without creating a very large (and slow to run) test case, redefine the
sort threshold to be bounded by maintenance_work_mem as well as the number
of available buffers. While at it, fix an ancient oversight that when
building a temp index, the number of available buffers is not NBuffers but
NLocBuffer. Also, if assertions are enabled, apply a direct test that the
sort actually does return the tuples in the expected order.
Peter Geoghegan
Patch: <CAM3SWZTBAo4hjbBd780+MrOKiKp_TMo1N3A0Rw9_im8gbD7fQA@mail.gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
The Assert() here seems unreasonably optimistic. Andreas Seltenreich
found that it could fail with NaNs in the input geometries, and it
seems likely to me that it might fail in corner cases due to roundoff
error, even for ordinary input values. As a band-aid, make the function
return SQL NULL instead of crashing.
Report: <87d1md1xji.fsf@credativ.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When heap_update needs to look for a page for the new tuple version,
because the current one doesn't have sufficient free space, or when
columns have to be processed by the tuple toaster, it has to release the
lock on the old page during that. Otherwise there'd be lock ordering and
lock nesting issues.
To avoid concurrent sessions from trying to update / delete / lock the
tuple while the page's content lock is released, the tuple's xmax is set
to the current session's xid.
That unfortunately was done without any WAL logging, thereby violating
the rule that no XIDs may appear on disk, without an according WAL
record. If the database were to crash / fail over when the page level
lock is released, and some activity lead to the page being written out
to disk, the xid could end up being reused; potentially leading to the
row becoming invisible.
There might be additional risks by not having t_ctid point at the tuple
itself, without having set the appropriate lock infomask fields.
To fix, compute the appropriate xmax/infomask combination for locking
the tuple, and perform WAL logging using the existing XLOG_HEAP_LOCK
record. That allows the fix to be backpatched.
This issue has existed for a long time. There appears to have been
partial attempts at preventing dangers, but these never have fully been
implemented, and were removed a long time ago, in
11919160 (cf. HEAP_XMAX_UNLOGGED).
In master / 9.6, there's an additional issue, namely that the
visibilitymap's freeze bit isn't reset at that point yet. Since that's a
new issue, introduced only in a892234f830, that'll be fixed in a
separate commit.
Author: Masahiko Sawada and Andres Freund
Reported-By: Different aspects by Thomas Munro, Noah Misch, and others
Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
Backpatch: 9.1/all supported versions
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
0ac5ad5 started to compress infomask bits in WAL records. Unfortunately
the replay routines for XLOG_HEAP_LOCK/XLOG_HEAP2_LOCK_UPDATED forgot to
reset the HEAP_XMAX_INVALID (and some other) hint bits.
Luckily that's not problematic in the majority of cases, because after a
crash/on a standby row locks aren't meaningful. Unfortunately that does
not hold true in the presence of prepared transactions. This means that
after a crash, or after promotion, row level locks held by a prepared,
but not yet committed, prepared transaction might not be enforced.
Discussion: 20160715192319.ubfuzim4zv3rqnxv@alap3.anarazel.de
Backpatch: 9.3, the oldest branch on which 0ac5ad5 is present.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings. Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design. Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value. If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is. Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.
This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied. We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID. In that case, mark the plan as only
runnable by that userID.
The plancache code already had a notion of plans being userID-specific,
in order to support RLS. It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID. Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.
Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one. It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.
In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.
This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.
Etsuro Fujita and Tom Lane
Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When key-share locking a tuple that has been not-key-updated, and the
update is a committed transaction, in some cases we raised
serializability errors:
ERROR: could not serialize access due to concurrent update
Because the key-share doesn't conflict with the update, the error is
unnecessary and inconsistent with the case that the update hasn't
committed yet. This causes problems for some usage patterns, even if it
can be claimed that it's sufficient to retry the aborted transaction:
given a steady stream of updating transactions and a long locking
transaction, the long transaction can be starved indefinitely despite
multiple retries.
To fix, we recognize that HeapTupleSatisfiesUpdate can return
HeapTupleUpdated when an updating transaction has committed, and that we
need to deal with that case exactly as if it were a non-committed
update: verify whether the two operations conflict, and if not, carry on
normally. If they do conflict, however, there is a difference: in the
HeapTupleBeingUpdated case we can just sleep until the concurrent
transaction is gone, while in the HeapTupleUpdated case this is not
possible and we must raise an error instead.
Per trouble report from Olivier Dony.
In addition to a couple of test cases that verify the changed behavior,
I added a test case to verify the behavior that remains unchanged,
namely that errors are raised when a update that modifies the key is
used. That must still generate serializability errors. One
pre-existing test case changes behavior; per discussion, the new
behavior is actually the desired one.
Discussion: https://www.postgresql.org/message-id/560AA479.4080807@odoo.com
https://www.postgresql.org/message-id/20151014164844.3019.25750@wrigleys.postgresql.org
Backpatch to 9.3, where the problem appeared.
|
|
|
|
|
|
|
| |
Digging around bug #14245 I found that commit
6734a1cacd44f5b731933cbc93182b135b167d0c missed that NOT operation is
right associative in opposite to all other. This miss is resposible for
tsquery parser fail on sequence of NOT operations
|