| Commit message | Author | Age |
... | |
| |
This is necessary when spgcanreturn() is invoked on a partitioned
index, and the failure might be reachable in other scenarios as
well. The rest of what spgGetCache() does is perfectly sensible
for a partitioned index, so we should allow it to go through.
I think the main takeaway from this is that we lack sufficient test
coverage for non-btree partitioned indexes. Therefore, I added
simple test cases for brin and gin as well as spgist (hash and
gist AMs were covered already in indexing.sql).
Per bug #18256 from Alexander Lakhin. Although the known test case
only fails since v16 (3c569049b), I've got no faith at all that there
aren't other ways to reach this problem; so back-patch to all
supported branches.
Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
| |
In v16 and up (since commit afbfc0298), large object ownership
checking has been broken because object_ownercheck() didn't take care
of the discrepancy between our object-address representation of large
objects (classId == LargeObjectRelationId) and the catalog where their
ownership info is actually stored (LargeObjectMetadataRelationId).
This resulted in failures such as "unrecognized class ID: 2613"
when trying to update blob properties as a non-superuser.
Poking around for related bugs, I found that AlterObjectOwner_internal
would pass the wrong classId to the PostAlterHook in the no-op code
path where the large object already has the desired owner. Also,
recordExtObjInitPriv checked for the wrong classId; that bug is only
latent because the stanza is dead code anyway, but as long as we're
carrying it around it should be less wrong. These bugs are quite old.
In HEAD, we can reduce the scope for future bugs of this ilk by
changing AlterObjectOwner_internal's API to let the translation happen
inside that function, rather than requiring callers to know about it.
A more bulletproof fix, perhaps, would be to start using
LargeObjectMetadataRelationId as the dependency and object-address
classId for blobs. However that has substantial risk of breaking
third-party code; even within our own code, it'd create hassles
for pg_dump which would have to cope with a version-dependent
representation. For now, keep the status quo.
Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
| |
Dead tuples are ignored and are not marked as dead during recovery, as
it can lead to MVCC issues on a standby because its xmin may not match
with the primary. This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.
Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent. This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.
This is broken since the introduction of hot standby in efc16ea52067, so
backpatch all the way down.
Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12
| |
Commit 98e675ed7af accidentally mistyped IDENTIFY_SYSTEM as
IDENTIFY_SERVER. Backpatch to all supported branches.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/68138521-5345-8780-4390-1474afdcba1f@gmail.com
| |
OpenSSL will sometimes return SSL_ERROR_SYSCALL without having set
errno; this is apparently a reflection of recv(2)'s habit of not
setting errno when reporting EOF. Ensure that we treat such cases
the same as read EOF. Previously, we'd frequently report them like
"could not accept SSL connection: Success" which is confusing, or
worse report them with an unrelated errno left over from some
previous syscall.
To fix, ensure that errno is zeroed immediately before the call,
and report its value only when it's not zero afterwards; otherwise
report EOF.
For consistency, I've applied the same coding pattern in libpq's
pqsecure_raw_read(). Bare recv(2) shouldn't really return -1 without
setting errno, but in case it does we might as well cope.
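The coding pattern described above can be sketched with plain read(2) instead of OpenSSL. This is only an illustration: the ReadStatus codes and both function names are hypothetical and are not the names used in the backend or in libpq.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch. */
typedef enum
{
	READ_OK,					/* at least one byte was read */
	READ_EOF,					/* clean EOF, or a failure that left errno unset */
	READ_ERROR					/* failure with a meaningful errno */
} ReadStatus;

/*
 * Classify the result of a read-like call, given the errno value observed
 * immediately after it.  A failure that left errno at zero is treated as
 * EOF instead of being reported as "Success".
 */
ReadStatus
classify_read(ssize_t n, int errno_after)
{
	if (n > 0)
		return READ_OK;
	if (n == 0 || errno_after == 0)
		return READ_EOF;
	return READ_ERROR;
}

/*
 * Read from fd, zeroing errno first so that a stale value left over from
 * an earlier syscall cannot be misreported.
 */
ReadStatus
read_checked(int fd, void *buf, size_t len, ssize_t *nread)
{
	errno = 0;
	*nread = read(fd, buf, len);
	return classify_read(*nread, errno);
}
```

A caller would print strerror(errno) only for READ_ERROR and report EOF otherwise.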
Per report from Andres Freund. Back-patch to all supported versions.
Discussion: https://postgr.es/m/20231208181451.deqnflwxqoehhxpe@awork3.anarazel.de
| |
Commit 3b991f81c45 introduced a specific context for types such
that all no longer referenced types can be dropped periodically
rather than leaking. One void pointer type creation was however
missed leading to an assertion failure in LLVM Debug builds.
Per buildfarm members canebreak and urutu. Fix with assistance
from Andres. The codepath in question was refactored in version
13 hence why this only affected version 12.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1106876.1700409912@sss.pgh.pa.us
| |
The apply worker needs to update the state of the subscription tables to
'READY' during the synchronization phase which requires locking the
corresponding subscription. The apply worker also waits for the
subscription tables to reach the 'SYNCDONE' state after holding the locks
on the subscription and the wait is done using WaitLatch. The 'SYNCDONE'
state is changed by tablesync workers again by locking the corresponding
subscription. Both the state updates use AccessShareLock mode to lock the
subscription, so they can't block each other. However, a backend can
simultaneously try to acquire a lock on the same subscription using
AccessExclusiveLock mode to alter the subscription. Now, the backend's
wait on a lock can sneak in between the apply worker and table sync worker
causing deadlock.
In other words, apply_worker waits for tablesync worker which waits for
backend, and backend waits for apply worker. This is not detected by the
deadlock detector because apply worker uses WaitLatch.
The fix is to release existing locks in apply worker before it starts to
wait for tablesync worker to change the state.
Reported-by: Tomas Vondra
Author: Shlok Kyal
Reviewed-by: Amit Kapila, Peter Smith
Backpatch-through: 12
Discussion: https://postgr.es/m/d291bb50-12c4-e8af-2af2-7bb9bb4d8e3e@enterprisedb.com
| |
Commit 5a991ef8692e accidentally reversed the order of the tuples
and fields parameters, making the error message incorrectly refer
to 3 tuples with 1 field when IDENTIFY_SYSTEM returns 1 tuple and
3 or 4 fields. Fix by changing the order of the parameters. This
also adds a comment describing why we check for < 3 when postgres
since 9.4 has been sending 4 fields.
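As a rough illustration of why the argument order matters, the sketch below formats a shape-mismatch message with the tuple count first and the field count second. The function name and the message wording are hypothetical, not the actual walreceiver code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a message describing the result shape that was received versus
 * the one expected.  The arguments must follow the order of the format
 * placeholders: rows first, then fields.  Swapping them still compiles,
 * because both are ints, which is how such a bug can go unnoticed.
 */
int
format_shape_error(char *buf, size_t len,
				   int ntuples, int nfields,
				   int expected_tuples, int min_fields)
{
	return snprintf(buf, len,
					"got %d rows and %d fields, expected %d rows and %d or more fields",
					ntuples, nfields, expected_tuples, min_fields);
}
```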
Backpatch all the way since the bug is almost a decade old.
Author: Tomonari Katsumata <t.katsumata1122@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18224
Backpatch-through: v12
| |
When creating a partitioned index, the partition key must be a subset
of the index's columns. But this currently doesn't check that the
collations between the partition key and the index definition match.
So you can construct a unique index that fails to enforce uniqueness.
(This would most likely involve a nondeterministic collation, so it
would have to be crafted explicitly and is not something that would
just happen by accident.)
This patch adds the required collation check. As a result, any
previously allowed unique index that has a collation mismatch would no
longer be allowed to be created.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/3327cb54-f7f1-413b-8fdb-7a9dceebb938%40eisentraut.org
| |
We should have done it this way all along, but we accidentally got
away with using the wrong BIO field up until OpenSSL 3.2. There,
the library's BIO routines that we rely on use the "data" field
for their own purposes, and our conflicting use causes assorted
weird behaviors up to and including core dumps when SSL connections
are attempted. Switch to using the approved field for the purpose,
i.e. app_data.
While at it, remove our configure probes for BIO_get_data as well
as the fallback implementation. BIO_{get,set}_app_data have been
there since long before any OpenSSL version that we still support,
even in the back branches.
Also, update src/test/ssl/t/001_ssltests.pl to allow for a minor
change in an error message spelling that evidently came in with 3.2.
Tristan Partin and Bo Andreson. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAN55FZ1eDDYsYaL7mv+oSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ@mail.gmail.com
| |
If the tuple being updated is not visible to the crosscheck snapshot,
we return TM_Updated but the assertions would not hold in that case.
Move them to before the cross-check.
Fixes bug #17893. Backpatch to all supported versions.
Author: Alexander Lakhin
Backpatch-through: 12
Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
| |
When using GSSAPI encryption in non-blocking mode, libpq sometimes
failed with "GSSAPI caller failed to retransmit all data needing
to be retried". The cause is that pqPutMsgEnd rounds its transmit
request down to an even multiple of 8K, and sometimes that can lead
to not requesting a write of data that was requested to be written
(but reported as not written) earlier. That can upset pg_GSS_write's
logic for dealing with not-yet-written data, since it's possible
the data in question had already been incorporated into an encrypted
packet that we weren't able to send during the previous call.
We could fix this with a one-or-two-line hack to disable pqPutMsgEnd's
round-down behavior, but that seems like making the caller work around
a behavior that pg_GSS_write shouldn't expose in this way. Instead,
adjust pg_GSS_write to never report a partial write: it either
reports a complete write, or reflects the failure of the lower-level
pqsecure_raw_write call. The requirement still exists for the caller
to present at least as much data as on the previous call, but with
the caller-visible write start point not moving there is no temptation
for it to present less. We lose some ability to reclaim buffer space
early, but I doubt that that will make much difference in practice.
This also gets rid of a rather dubious assumption that "any
interesting failure condition (from pqsecure_raw_write) will recur
on the next try". We've not seen failure reports traceable to that,
but I've never trusted it particularly and am glad to remove it.
Make the same adjustments to the equivalent backend routine
be_gssapi_write(). It is probable that there's no bug on the backend
side, since we don't have a notion of nonblock mode there; but we
should keep the logic the same to ease future maintenance.
Per bug #18210 from Lars Kanis. Back-patch to all supported branches.
Discussion: https://postgr.es/m/18210-4c6d0b14627f2eb8@postgresql.org
| |
The DROP STATISTICS code failed to properly lock the table, leading to
ERROR: tuple concurrently deleted
when executed concurrently with ANALYZE.
Fixed by modifying RemoveStatisticsById() to acquire the same lock as
ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE
calls RemoveStatisticsDataById() directly.
Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was
like this since it was introduced in 10, but older releases are EOL.
Reported-by: Justin Pryzby
Reviewed-by: Tom Lane
Backpatch-through: 12
Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023
| |
Commits 146604ec43 and a898b409f6 added overflow checks to
interval_mul(), but not to interval_div(), which contains almost
identical code, and so is susceptible to the same kinds of
overflows. In addition, those checks did not catch all possible
overflow conditions.
Add additional checks to the "cascade down" code in interval_mul(),
and copy all the overflow checks over to the corresponding code in
interval_div(), so that they both generate "interval out of range"
errors, rather than returning bogus results.
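A minimal sketch of the kind of check involved, assuming a 32-bit interval field scaled by a double factor. The helper names are hypothetical, and the real functions raise "interval out of range" rather than returning false.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert a double to int32 only if it is not NaN and lies within the
 * int32 range; otherwise report failure without touching *result.
 */
bool
double_to_int32_checked(double value, int32_t *result)
{
	if (value != value)			/* NaN never compares equal to itself */
		return false;
	if (value < -2147483648.0 || value >= 2147483648.0)
		return false;
	*result = (int32_t) value;
	return true;
}

/* Multiply a field by a factor, failing on overflow. */
bool
scale_field_mul(int32_t field, double factor, int32_t *result)
{
	return double_to_int32_checked((double) field * factor, result);
}

/* Divide a field by a divisor, applying the same range check. */
bool
scale_field_div(int32_t field, double divisor, int32_t *result)
{
	if (divisor == 0.0)
		return false;			/* division by zero is reported separately */
	return double_to_int32_checked((double) field / divisor, result);
}
```

Routing both operations through one checked conversion is a way to keep the multiply and divide paths from drifting apart again.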
Given that these errors are relatively easy to hit, back-patch to all
supported branches.
Per bug #18200 from Alexander Lakhin, and subsequent investigation.
Discussion: https://postgr.es/m/18200-5ea288c7b2d504b1%40postgresql.org
| |
When performing inlining LLVM unfortunately "leaks" types (the
types survive and are usable, but a new round of inlining will
recreate new structurally equivalent types). This accumulation
will over time amount to a memory leak which for some queries
can be large enough to trigger the OOM process killer.
To avoid accumulation of types, all IR related data is stored
in an LLVMContextRef which is dropped and recreated in order
to release all types. Dropping and recreating incurs overhead,
so it will be done only after 100 queries. This is a heuristic
which might be revisited, but until we can get the size of the
context from LLVM we are flying a bit blind.
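The reuse heuristic can be sketched as a simple use counter. SketchContext and sketch_context_acquire() are hypothetical stand-ins; a real implementation would hold an LLVMContextRef and dispose and recreate it where the comment indicates.

```c
#include <stdbool.h>
#include <stddef.h>

/* From the message above: recreate the context after 100 queries. */
#define SKETCH_REUSE_LIMIT 100

/* Hypothetical stand-in for whatever owns the LLVM context. */
typedef struct
{
	size_t		uses;			/* queries served by the current context */
	size_t		generation;		/* how many times it has been recreated */
} SketchContext;

/*
 * Account for one more use of the context, recreating it first if it has
 * already served SKETCH_REUSE_LIMIT uses.  Returns true if the context
 * was recreated on this call.
 */
bool
sketch_context_acquire(SketchContext *ctx)
{
	bool		recreated = false;

	if (ctx->uses >= SKETCH_REUSE_LIMIT)
	{
		/* Real code would drop the old context and create a fresh one here. */
		ctx->generation++;
		ctx->uses = 0;
		recreated = true;
	}
	ctx->uses++;
	return recreated;
}
```

Counting uses is a proxy for the amount of accumulated type data, which the message notes cannot currently be obtained from LLVM.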
This issue has been reported several times; there may be more
references to it in the archives on top of the threads linked
below.
This is a backpatch of 9dce22033d5 to all supported branches.
Reported-By: Justin Pryzby <pryzby@telsasoft.com>
Reported-By: Kurt Roeckx <kurt@roeckx.be>
Reported-By: Jaime Casanova <jcasanov@systemguards.com.ec>
Reported-By: Lauri Laanmets <pcspets@gmail.com>
Author: Andres Freund and Daniel Gustafsson
Discussion: https://postgr.es/m/7acc8678-df5f-4923-9cf6-e843131ae89d@www.fastmail.com
Discussion: https://postgr.es/m/20201218235607.GC30237@telsasoft.com
Discussion: https://postgr.es/m/CAPH-tTxLf44s3CvUUtQpkDr1D8Hxqc2NGDzGXS1ODsfiJ6WSqA@mail.gmail.com
Backpatch-through: v12
| |
This seems more correct, because other before_shmem_exit calls may
expect the infrastructure that is needed to run queries and access the
database to be working, and also because this cleanup has nothing to
do with shared memory.
This is a back-patch of bab150045bd9.
There were no known user-visible consequences to this, though, apart
from what was previously fixed by commit 303640199d0 and back-patched
as commit bcbc27251d35 and commit f7013683d9bb, so bab150045bd9 was
not back-patched at the time.
Bharath Rupireddy
Discussion: http://postgr.es/m/CALj2ACWk7j4F2v2fxxYfrroOF=AdFNPr1WsV+AGtHAFQOqm_pw@mail.gmail.com
Backpatch-through: 13, 12
| |
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions). Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.
Two actual bugs of this ilk are fixed here. We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest. Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism. A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.
In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after. The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.
Per bug #18097 from Jim Keener. Back-patch to all supported versions.
Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
| |
It's clearly stated in the comments that ginFindParents() must keep
the pin on the index's root page that's associated with the topmost
GinBtreeStack item. However, the code path for the case that the
desired downlink has been pushed down to the next index level
ignored this proviso, and would release the pin anyway if we were
still examining the root level. That led to an assertion failure
or "buffer NNNN is not owned by resource owner" error later, when
we try to release the pin again at the end of the insertion.
This is quite hard to reproduce, since it can only happen if an
index root page split occurs concurrently with our own insertion.
Thanks to Jeff Janes for finding a test case that triggers it
often enough to allow investigation.
This has been there since the beginning of GIN, so back-patch
to all supported branches.
Discussion: https://postgr.es/m/CAMkU=1yCAKtv86dMrD__Ja-7KzjE=uMeKX8y__cx5W-OEWy2ow@mail.gmail.com
| |
Commit b7eda3e0e moved XidInMVCCSnapshot() from tqual.c into snapmgr.c,
but follow-up commit c91560def incorrectly updated this reference. We
could fix it, but as pointed out by Daniel Gustafsson, 1) the reader can
easily find the file that contains the definition of that function, e.g.
by grepping, and 2) this kind of reference is prone to going stale; so
let's just remove it.
Back-patch to all supported branches.
Reviewed by Daniel Gustafsson.
Discussion: https://postgr.es/m/CAPmGK145VdKkPBLWS2urwhgsfidbSexwY-9zCL6xSUJH%2BBTUUg%40mail.gmail.com
| |
We seem to have accidentally used "insure" in a few places. Correct
that.
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv0biqrhA3pMhu40aDsj343mTsD75khKnHsLqR8P04f=Q@mail.gmail.com
Backpatch-through: 12, oldest supported version
| |
array_set_element() and related functions allow an array to be
enlarged by assigning to subscripts outside the current array bounds.
While these places were careful to check that the new bounds are
allowable, they neglected to consider the risk of integer overflow
in computing the new bounds. In edge cases, we could compute new
bounds that are invalid but get past the subsequent checks,
allowing bad things to happen. Memory stomps that are potentially
exploitable for arbitrary code execution are possible, and so is
disclosure of server memory.
To fix, perform the hazardous computations using overflow-detecting
arithmetic routines, which fortunately exist in all still-supported
branches.
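One way to keep such a bounds computation safe is to do the arithmetic in a wider type before range-checking it. The sketch below is an illustration under that assumption; SKETCH_MAX_ITEMS and extended_length() are hypothetical and are not the routines used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper limit on the number of elements in one dimension. */
#define SKETCH_MAX_ITEMS 1000000

/*
 * Compute the length of a dimension that must span lower_bound..subscript.
 * The subtraction is done in 64 bits, so extreme 32-bit inputs cannot wrap
 * around into a value that would slip past the range check.
 */
bool
extended_length(int32_t lower_bound, int32_t subscript, int32_t *length)
{
	int64_t		len = (int64_t) subscript - (int64_t) lower_bound + 1;

	if (len < 1 || len > SKETCH_MAX_ITEMS)
		return false;
	*length = (int32_t) len;
	return true;
}
```

With 32-bit arithmetic, a lower bound of INT32_MIN and a subscript of 0 would overflow; in 64 bits the result is simply too large and is rejected.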
The test cases added for this generate (after patching) errors that
mention the value of MaxArraySize, which is platform-dependent.
Rather than introduce multiple expected-files, use psql's VERBOSITY
parameter to suppress the printing of the message text. v11 psql
lacks that parameter, so omit the tests in that branch.
Our thanks to Pedro Gallegos for reporting this problem.
Security: CVE-2023-5869
| |
transformAggregateCall() captures the datatypes of the aggregate's
arguments immediately to construct the Aggref.aggargtypes list.
This seems reasonable because the arguments have already been
transformed --- but there is an edge case where they haven't been.
Specifically, if we have an unknown-type literal in an ANY argument
position, nothing will have been done with it earlier. But if we
also have DISTINCT, then addTargetToGroupList() converts the literal
to "text" type, resulting in the aggargtypes list not matching the
actual runtime type of the argument. The end result is that the
aggregate tries to interpret a "text" value as being of type
"unknown", that is a zero-terminated C string. If the text value
contains no zero bytes, this could result in disclosure of server
memory following the text literal value.
To fix, move the collection of the aggargtypes list to the end
of transformAggregateCall(), after DISTINCT has been handled.
This requires slightly more code, but not a great deal.
Our thanks to Jingzhou Fu for reporting this problem.
Security: CVE-2023-5868
| |
It was always false in single-user mode, in autovacuum workers, and in
background workers. This had no specifically-identified security
consequences, but non-core code or future work might make it
security-relevant. Back-patch to v11 (all supported versions).
Jelte Fennema-Nio. Reported by Jelte Fennema-Nio.
| |
Documentation says it cannot signal "a backend owned by a superuser".
On the contrary, it could signal background workers, including the
logical replication launcher. It could signal autovacuum workers and
the autovacuum launcher. Block all that. Signaling autovacuum workers
and those two launchers doesn't stall progress beyond what one could
achieve other ways. If a cluster uses a non-core extension with a
background worker that does not auto-restart, this could create a denial
of service with respect to that background worker. A background worker
with bugs in its code for responding to terminations or cancellations
could experience those bugs at a time the pg_signal_backend member
chooses. Back-patch to v11 (all supported versions).
Reviewed by Jelte Fennema-Nio. Reported by Hemanth Sandrana and
Mahendrakar Srinivasarao.
Security: CVE-2023-5870
| |
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: db060e1afcf150db436cc05807372480754013e5
| |
get_explain_guc_options() crashed if a string GUC marked GUC_EXPLAIN
has a NULL boot_val. Nosing around found a couple of other places
that seemed insufficiently cautious about NULL string values, although
those are likely unreachable in practice. Add some commentary
defining the expectations for NULL values of string variables,
in hopes of forestalling future additions of more such bugs.
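As an illustration of the caution needed, here is a minimal sketch of a NULL-tolerant string comparison, for example between a current value and a boot value. The function name is hypothetical and is not taken from guc.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Report whether two string values differ, allowing either to be NULL.
 * Two NULLs are considered equal; a NULL and a non-NULL value differ.
 * strcmp() is reached only when both pointers are valid.
 */
bool
string_values_differ(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a != b;
	return strcmp(a, b) != 0;
}
```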
Xing Guo, Aleksander Alekseev, Tom Lane
Discussion: https://postgr.es/m/CACpMh+AyDx5YUpPaAgzVwC1d8zfOL4JoD-uyFDnNSa1z0EsDQQ@mail.gmail.com
| |
This also updates some C comments.
Reported-by: suchithjn22@gmail.com
Discussion: https://postgr.es/m/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org
Author: Laurenz Albe (doc patch)
Backpatch-through: 11
| |
pgstatindex failed with ERRCODE_DATA_CORRUPTED, of the "can't-happen"
class XX. The other functions succeeded on an empty index; they might
have malfunctioned if the failed index build left torn I/O or other
complex state. Report an ERROR in statistics functions pgstatindex,
pgstatginindex, pgstathashindex, and pgstattuple. Report DEBUG1 and
skip all index I/O in maintenance functions brin_desummarize_range,
brin_summarize_new_values, brin_summarize_range, and
gin_clean_pending_list. Back-patch to v11 (all supported versions).
Discussion: https://postgr.es/m/20231001195309.a3@google.com
| |
When looping around after finding that the set-returning function
returned zero rows for the current input tuple, ExecProjectSet
neglected to reset either of the two memory contexts it's
responsible for cleaning out. Typically this wouldn't cause much
problem, because once the SRF does return at least one row, the
contexts would get reset on the next call. However, if the SRF
returns no rows for many input tuples in succession, quite a lot
of memory could be transiently consumed.
To fix, make sure we reset both contexts while looping around.
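The effect of resetting inside the loop can be shown with a toy model. SketchArena is a hypothetical byte counter standing in for a memory context, and each loop iteration pretends to allocate a fixed amount of scratch memory while evaluating the function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scratch usage per function evaluation. */
#define SKETCH_ALLOC_PER_CALL 64

/* Toy stand-in for a memory context: only tracks byte counts. */
typedef struct
{
	size_t		live_bytes;
	size_t		peak_bytes;
} SketchArena;

void
arena_alloc(SketchArena *a, size_t n)
{
	a->live_bytes += n;
	if (a->live_bytes > a->peak_bytes)
		a->peak_bytes = a->live_bytes;
}

void
arena_reset(SketchArena *a)
{
	a->live_bytes = 0;
}

/*
 * Scan the inputs until one yields at least one row.  Returns its index,
 * or ninputs if none does.  With reset_each_iteration set, the arena is
 * cleared before every evaluation, so empty results cannot pile up.
 */
size_t
find_first_nonempty(SketchArena *a, const int *rows_per_input,
					size_t ninputs, bool reset_each_iteration)
{
	size_t		i;

	for (i = 0; i < ninputs; i++)
	{
		if (reset_each_iteration)
			arena_reset(a);
		arena_alloc(a, SKETCH_ALLOC_PER_CALL);
		if (rows_per_input[i] > 0)
			break;
	}
	return i;
}
```

Without the reset, peak usage grows linearly with the number of consecutive empty results; with it, peak usage stays at one evaluation's worth.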
Per bug #18172 from Sergei Kornilov. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18172-9b8c5fc1d676ded3@postgresql.org
| |
While back-patching f90b4a84, I missed that branches before
REL_14_STABLE did some (accidental?) type punning in a function
parameter, and failed to adjust these two branches accordingly. That
didn't seem to cause a problem for newer LLVM versions or non-debug
builds, but older debug builds would fail a type cross-check assertion.
Fix by supplying the correct function argument type. In REL_14_STABLE
the same change was made by commit df99ddc7.
Per build farm animal xenodermus, which runs a debug build of LLVM 6
with jit_above_cost=0.
Discussion: https://postgr.es/m/CA%2BhUKGLQ38rgZ3bvNHXPRjsWFAg3pa%3Dtnpeq0osa%2B%3DmiFD5jAw%40mail.gmail.com
| |
This avoids a compiler bug occurring in AIX's xlc, even in pretty
late-model revisions. Buildfarm testing has now confirmed that
only 64-bit xlc is affected. Although we are contemplating
dropping support for xlc in v17, it's still supported in the
back branches, so we need this fix.
Back-patch of code changes from HEAD commit 19fa97731.
(The test cases were already back-patched, in 4a427b82c et al.)
Discussion: https://postgr.es/m/CA+hUKGK=DOC+hE-62FKfZy=Ybt5uLkrg3zCZD-jFykM-iPn8yw@mail.gmail.com
| |
Changes required by https://llvm.org/docs/NewPassManager.html.
Back-patch to 12, leaving the final release of 11 unchanged, consistent
with earlier decision not to back-patch LLVM 16 support either.
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BWXznXCyTgCADd%3DHWkP9Qksa6chd7L%3DGCnZo-MBgg9Lg%40mail.gmail.com
| |
Commit 37d5babb used this C API function while adding support for LLVM
16 and opaque pointers, but it's not available in LLVM 7 and older.
Provide it in our own llvmjit_wrap.cpp. It just calls a C++ function
that pre-dates LLVM 3.9, our minimum target.
Back-patch to 12, like 37d5babb.
Discussion: https://postgr.es/m/CA%2BhUKGKnLnJnWrkr%3D4mSGhE5FuTK55FY15uULR7%3Dzzc%3DwX4Nqw%40mail.gmail.com
| |
Remove use of LLVMGetElementType() and provide the type of all pointers
to LLVMBuildXXX() functions when emitting IR, as required by modern LLVM
versions[1].
* For LLVM <= 14, we'll still use the old LLVMBuildXXX() functions.
* For LLVM == 15, we'll continue to do the same, explicitly opting
out of opaque pointer mode.
* For LLVM >= 16, we'll use the new LLVMBuildXXX2() functions that take
the extra type argument.
The difference is hidden behind some new IR emitting wrapper functions
l_load(), l_gep(), l_call() etc. The change is mostly mechanical,
except that at each site the correct type had to be provided.
In some places we needed to do some extra work to get function types,
including some new wrappers for C++ APIs that are not yet exposed by
LLVM's C API, and some new "example" functions in llvmjit_types.c
because it's no longer possible to start from the function pointer type
and ask for the function type.
Back-patch to 12, because it's a little trickier in 11 and we agreed not
to put the latest LLVM support into the upcoming final release of 11.
[1] https://llvm.org/docs/OpaquePointers.html
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Ronan Dunklau <ronan.dunklau@aiven.io>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGKNX_%3Df%2B1C4r06WETKTq0G4Z_7q4L4Fxn5WWpMycDj9Fw%40mail.gmail.com
| |
The SIGTERM handler for the startup process immediately calls
proc_exit() for the duration of the restore_command, i.e., a call
to system(). This system() call forks a new process to execute the
shell command, and this child process inherits the parent's signal
handlers. If both the parent and child processes receive SIGTERM,
both will attempt to call proc_exit(). This can end badly. For
example, both processes will try to remove themselves from the
PGPROC shared array.
To fix this problem, this commit adds a check in
StartupProcShutdownHandler() to see whether MyProcPid == getpid().
If they match, this is the parent process, and we can proc_exit()
like before. If they do not match, this is a child process, and we
just emit a message to STDERR (in a signal safe manner) and
_exit(), thereby skipping any problematic exit callbacks.
This commit also adds checks in proc_exit(), ProcKill(), and
AuxiliaryProcKill() that verify they are not being called within
such child processes.
Suggested-by: Andres Freund
Reviewed-by: Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/Y9nGDSgIm83FHcad%40paquier.xyz
Discussion: https://postgr.es/m/20230223231503.GA743455%40nathanxps13
Backpatch-through: 11
| |
Dropping a temp table could entail TOAST table access to clean out
toasted catalog entries, such as large pg_constraint.conbin strings
for complex CHECK constraints. If we did that via ON COMMIT DROP,
we triggered the assertion in init_toast_snapshot(), because
there was no provision for setting up a snapshot for the drop
actions. Fix that.
(I assume here that the adjacent truncation actions for ON COMMIT
DELETE ROWS don't have a similar problem: it doesn't seem like
nontransactional truncations would need to touch any toasted fields.
If that proves wrong, we could refactor a bit to have the same
snapshot acquisition cover that too.)
The test case added here does not fail before v15, because that
assertion was added in 277692220 which was not back-patched.
However, the race condition the assertion warns of surely
exists further back, so back-patch to all supported branches.
Per report from Richard Guo.
Discussion: https://postgr.es/m/CAMbWs4-x26=_QxxgdJyNbiCDzvtr2WV5ZDso_v-CukKEe6cBZw@mail.gmail.com
| |
Commit dc7d70ea added functions that read the control file, but didn't
acquire ControlFileLock. With unlucky timing, file systems that have
weak interlocking like ext4 and ntfs could expose partially overwritten
contents, and the checksum would fail.
Back-patch to all supported releases.
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
|
| |
This could only affect HASH partitioned tables with at least 2 partition
key columns.
If partition pruning was delayed until execution and the query contained
an IS NULL qual on one of the partitioned keys, and some subsequent
partitioned key was being compared to a non-Const, then this could result
in a crash due to the incorrect keyno being used to calculate the
stateidx for the expression evaluation code.
Here we fix this by properly skipping partitioned keys which have a
nullkey set. Effectively, this must be the same as what's going on
inside perform_pruning_base_step().
Sergei Glukhov also provided a patch, but that's not what's being used
here.
Reported-by: Sergei Glukhov
Reviewed-by: tender wang, Sergei Glukhov
Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru
Backpatch-through: 11, where runtime partition pruning was added.
| |
get_steps_using_prefix_recurse() incorrectly assumed that it could stop
recursive processing of the 'prefix' list when cur_keyno was one before
the step_lastkeyno. Since hash partition pruning can prune using IS
NULL quals, and these IS NULL quals are not present in the 'prefix'
list, then that logic could cause more levels of recursion than what is
needed and lead to there being no more items in the 'prefix' list to
process. This would manifest itself as a crash in some code that
expected the 'start' ListCell not to be NULL.
Here we adjust the logic so that instead of stopping recursion at 1 key
before the step_lastkeyno, we just look at the llast(prefix) item and
ensure we only recursively process up until just before whichever the last
key is. This effectively allows keys to be missing in the 'prefix' list.
This change does mean that step_lastkeyno is no longer needed, so we
remove that from the static functions. I also spent quite some time
reading this code and testing it to try to convince myself that there
are no other issues. That resulted in the irresistible temptation of
rewriting some comments, many of which were simply untrue or not concise.
Reported-by: Sergei Glukhov
Reviewed-by: Sergei Glukhov, tender wang
Discussion: https://postgr.es/m/2f09ce72-315e-2a33-589a-8519ada8df61@postgrespro.ru
Backpatch-through: 11, where partition pruning was introduced.
|
|
|
|
|
|
|
|
| |
Mark the buffers dirty before writing WAL.
Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c094a@iki.fi
Reviewed-by: Heikki Linnakangas
Backpatch-through: 11
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code in charge of copying the contents of PgBackendStatus to local
memory could fail on memory allocation because of an overflow on the
amount of memory to use. The overflow can happen when combining a high
value of track_activity_query_size (max at 1MB) with a large
max_connections, when the product of the two exceeds INT32_MAX, as both
parameters are treated as signed integers. This could for example trigger
with the following functions, all calling pgstat_read_current_status():
- pg_stat_get_backend_subxact()
- pg_stat_get_backend_idset()
- pg_stat_get_progress_info()
- pg_stat_get_activity()
- pg_stat_get_db_numbackends()
The change to use MemoryContextAllocHuge() has been introduced in
8d0ddccec636, so backpatch down to 12.
Author: Jakub Wartak
Discussion: https://postgr.es/m/CAKZiRmw8QSNVw2qNK-dznsatQqz+9DkCquxP0GHbbv1jMkGHMA@mail.gmail.com
Backpatch-through: 12
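
The arithmetic behind the overflow can be checked with a short sketch. It is only an illustration: `mul_as_int32` emulates the two's-complement wraparound typically observed for a 32-bit signed multiplication (formally undefined behavior in C), and the slot counts used are made-up values, not the exact number of backend status slots the server computes.

```python
INT32_MAX = 2**31 - 1


def mul_as_int32(a: int, b: int) -> int:
    """Emulate a 32-bit signed multiplication with two's-complement wraparound."""
    product = (a * b) & 0xFFFFFFFF
    return product - 2**32 if product > INT32_MAX else product


def exceeds_int32(query_size: int, n_slots: int) -> bool:
    """True when the activity buffer size no longer fits in a signed 32-bit int."""
    return query_size * n_slots > INT32_MAX


ONE_MB = 1024 * 1024  # maximum track_activity_query_size per the message above
```

With the 1MB maximum, 2047 slots still fit, while 2048 slots reach exactly 2^31 and wrap to a negative size, which a regular allocation request would reject.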
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit changes the WAL reader routines so that a FATAL for the
backend or exit(FAILURE) for the frontend is triggered if an allocation
for a WAL record decode fails in walreader.c, rather than treating this
case as bogus data, which would be equivalent to the end of WAL. The
key is to avoid palloc_extended(MCXT_ALLOC_NO_OOM) in walreader.c,
relying on plain palloc() calls.
The previous behavior could make WAL replay finish earlier than it
should. For example, crash recovery finishing earlier may corrupt
clusters because not all the WAL available locally was replayed to
ensure a consistent state. Out-of-memory failures would show up
randomly depending on the memory pressure on the host, but one simple
case would be to generate a large record, then replay this record after
downsizing a host, as Ethan Mertz originally reported.
This relies on bae868caf222, as the WAL reader routines now do the
memory allocation required for a record only once its header has been
fully read and validated, making xl_tot_len trustable. Making the WAL
reader react differently on out-of-memory or bogus record data would
require ABI changes, so this is the safest choice for stable branches.
Also, it is worth noting that 3f1ce973467a has been using a plain
palloc() in this code for some time now.
Thanks to Noah Misch and Thomas Munro for the discussion.
Like the other commit, backpatch down to 12, leaving out v11 that will
be EOL'd soon. The behavior of considering a failed allocation as bogus
data comes originally from 0ffe11abd3a0, where the record length
retrieved from its header was not entirely trustable.
Reported-by: Ethan Mertz
Discussion: https://postgr.es/m/ZRKKdI5-RRlta3aF@paquier.xyz
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After receiving position data for a lexeme, tsvectorrecv()
advanced its "datalen" value by (npos+1)*sizeof(WordEntry)
where the correct calculation is (npos+1)*sizeof(WordEntryPos).
This accidentally failed to render the constructed tsvector
invalid, but it did result in leaving some wasted space
approximately equal to the space consumed by the position data.
That could have several bad effects:
* Disk space is wasted if the received tsvector is stored into a
table as-is.
* A legal tsvector could get rejected with "maximum total lexeme
length exceeded" if the extra space pushes it over the MAXSTRPOS
limit.
* In edge cases, the finished tsvector could be assigned a length
larger than the allocated size of its palloc chunk, conceivably
leading to SIGSEGV when the tsvector gets copied somewhere else.
The odds of a field failure of this sort seem low, though valgrind
testing could probably have found this.
While we're here, let's express the calculation as
"sizeof(uint16) + npos * sizeof(WordEntryPos)" to avoid the type
pun implicit in the "npos + 1" formulation. It's not wrong
given that WordEntryPos had better be 2 bytes to avoid padding
problems, but it seems clearer this way.
Report and patch by Denis Erokhin. Back-patch to all supported
versions.
Discussion: https://postgr.es/m/009801d9f2d9$f29730c0$d7c59240$@datagile.ru
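
The size difference between the two formulas can be worked through numerically. The sketch below assumes sizeof(WordEntry) is 4 bytes and sizeof(WordEntryPos) and sizeof(uint16) are 2 bytes each; the 2-byte figure follows from the message above, while the 4-byte figure is an assumption stated here for illustration.

```python
SIZEOF_WORDENTRY = 4     # assumed size of WordEntry
SIZEOF_WORDENTRYPOS = 2  # per the message, WordEntryPos is 2 bytes
SIZEOF_UINT16 = 2


def datalen_step_buggy(npos: int) -> int:
    """Advance used before the fix: (npos + 1) * sizeof(WordEntry)."""
    return (npos + 1) * SIZEOF_WORDENTRY


def datalen_step_fixed(npos: int) -> int:
    """Advance after the fix: sizeof(uint16) + npos * sizeof(WordEntryPos)."""
    return SIZEOF_UINT16 + npos * SIZEOF_WORDENTRYPOS


def wasted_bytes(npos: int) -> int:
    """Extra space left behind per lexeme by the old calculation."""
    return datalen_step_buggy(npos) - datalen_step_fixed(npos)
```

Under these assumptions the waste per lexeme equals the corrected size itself, which matches the statement that the wasted space is approximately equal to the space consumed by the position data.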
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nbtree's mark/restore processing failed to correctly handle an edge case
involving array key advancement and related search-type scan key state.
Scans with ScalarArrayOpExpr quals requiring mark/restore
processing (for a merge join) could incorrectly conclude that an
affected array/scan key must not have advanced during the time between
marking and restoring the scan's position.
As a result of all this, array key handling within btrestrpos could skip
a required call to _bt_preprocess_keys(). This confusion allowed later
primitive index scans to overlook tuples matching the true current array
keys. The scan's search-type scan keys would still have spurious values
corresponding to the final array element(s) -- not values matching the
first/now-current array element(s).
To fix, remember that "array key wraparound" has taken place during the
ongoing btrescan in a flag variable stored in the scan's state, and use
that information at the point where btrestrpos decides if another call
to _bt_preprocess_keys is required.
Oversight in commit 70bc5833, which taught nbtree to handle array keys
during mark/restore processing, but missed this subtlety. That commit
was itself a bug fix for an issue in commit 9e8da0f7, which taught
nbtree to handle ScalarArrayOpExpr quals natively.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkgP3DDRJxw6DgjCxo-cu-DKrvjEv_ArkP2ctBJatDCYg@mail.gmail.com
Backpatch: 11- (all supported branches).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code was sloppy about comparison of index columns that
are expressions. It didn't reliably reject cases where one
index has an expression where the other has a plain column,
and it could index off the start of the attmap array, leading
to a Valgrind complaint (though an actual crash seems unlikely).
I'm not sure that the expression-vs-column sloppiness leads
to any visible problem in practice, because the subsequent
comparison of the two expression lists would reject cases
where the indexes have different numbers of expressions
overall. Maybe we could falsely match indexes having the
same expressions in different column positions, but it'd
require unlucky contents of the word before the attmap array.
It's not too surprising that no problem has been reported
from the field. Nonetheless, this code is clearly wrong.
Per bug #18135 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/18135-532f4a755e71e4d2@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Yet another bug in the ilk of commits a7ee7c851 and 741b88435. In
741b88435, we took care to clear the memorized location of the
downlink when we split the parent page, because splitting the parent
page can move the downlink. But we missed that even *updating* a tuple
on the parent can move it, because updating a tuple on a gist page is
implemented as a delete+insert, so the updated tuple gets moved to the
end of the page.
This commit fixes the bug in two different ways (belt and suspenders):
1. Clear the downlink when we update a tuple on the parent page, even
if it's not split. This is the same approach as in commits a7ee7c851
and 741b88435.
I also noticed that gistFindCorrectParent did not clear the
'downlinkoffnum' when it stepped to the right sibling. Fix that
too, as it seems like a clear bug even though I haven't been able
to find a test case to hit that.
2. Change gistFindCorrectParent so that it treats 'downlinkoffnum'
merely as a hint. It now always first checks if the downlink is
still at that location, and if not, it scans the page like before.
That's more robust if there are still more cases where we fail to
clear 'downlinkoffnum' that we haven't yet uncovered. With this,
it's no longer necessary to meticulously clear 'downlinkoffnum',
so this makes the previous fixes unnecessary, but I didn't revert
them because it still seems nice to clear it when we know that the
downlink has moved.
Also add the test case using the same test data that Alexander
posted. I tried to reduce it to a smaller test, and I also tried to
reproduce this with different test data, but I was not able to, so
let's just include what we have.
Backpatch to v12, like the previous fixes.
Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/18129-caca016eaf0c3702@postgresql.org
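
The "treat the remembered offset as a hint" approach from point 2 can be sketched as follows. This is a simplified model rather than the gistFindCorrectParent code: the parent page is represented as a plain list of child block numbers, and `find_downlink` is a hypothetical helper.

```python
from typing import Optional, Sequence


def find_downlink(parent_page: Sequence[int], child_blkno: int,
                  hint: Optional[int]) -> Optional[int]:
    """Return the offset of the downlink to child_blkno, or None if absent.

    The hint is checked first; if it is stale or missing, fall back to
    scanning the whole page.
    """
    if hint is not None and 0 <= hint < len(parent_page):
        if parent_page[hint] == child_blkno:
            return hint  # hint is still valid, no scan needed
    for offset, blkno in enumerate(parent_page):
        if blkno == child_blkno:
            return offset
    return None


# An update implemented as delete+insert moves the tuple to the end of the page.
before_update = [10, 20, 30]
after_update = [10, 30, 20]
```

Because a stale hint only costs an extra page scan, a missed reset of the remembered offset no longer leads to following the wrong downlink.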
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bae868ca removed a check that was still needed. If you had an
xl_tot_len at the end of a page that was too small for a record header,
but not big enough to span onto the next page, we'd immediately perform
the CRC check using a bogus large length. Because of arbitrary coding
differences between the CRC implementations on different platforms,
nothing very bad happened on common modern systems. On systems using
the _sb8.c fallback we could segfault.
Restore that check, add a new assertion and supply a test for that case.
Back-patch to 12, like bae868ca.
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLCkTT7zYjzOxuLGahBdQ%3DMcF%3Dz5ZvrjSOnW4EDhVjT-g%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Parse analysis of a CallStmt will inject mutable information,
for instance the OID of the called procedure, so that subsequent
DDL may create a need to re-parse the CALL. We failed to detect
this for CALLs in plpgsql routines, because no dependency information
was collected when putting a CallStmt into the plan cache. That
could lead to misbehavior or strange errors such as "cache lookup
failed".
Before commit ee895a655, the issue would only manifest for CALLs
appearing in atomic contexts, because we re-planned non-atomic
CALLs every time through anyway.
It is now apparent that extract_query_dependencies() probably
needs a special case for every utility statement type for which
stmt_requires_parse_analysis() returns true. I wanted to add
something like Assert(!stmt_requires_parse_analysis(...)) when
falling out of extract_query_dependencies_walker without doing
anything, but there are API issues as well as a more fundamental
point: stmt_requires_parse_analysis is supposed to be applied to
raw parser output, so it'd be cheating to assume it will give the
correct answer for post-parse-analysis trees. I contented myself
with adding a comment.
Per bug #18131 from Christian Stork. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18131-576854e79c5cd264@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial estimate of the number of distinct ParsedWords is just
that: an estimate. Don't let it exceed what palloc is willing to
allocate. If in fact we need more entries, we'll eventually fail
trying to enlarge the array. But if we don't, this allows success on
inputs that currently draw "invalid memory alloc request size".
Per bug #18080 from Uwe Binder. Back-patch to all supported branches.
Discussion: https://postgr.es/m/18080-d5c5e58fef8c99b7@postgresql.org
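
A minimal sketch of the clamping idea, assuming the allocation limit is 0x3fffffff bytes (the value of MaxAllocSize) and using a made-up entry size of 40 bytes; `clamp_initial_estimate` is a hypothetical name, not the function changed by this commit.

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF  # assumed allocation limit, in bytes


def clamp_initial_estimate(estimate: int, entry_size: int) -> int:
    """Limit an initial array-length estimate to what one allocation can hold."""
    max_entries = MAX_ALLOC_SIZE // entry_size
    return min(estimate, max_entries)


ENTRY_SIZE = 40  # hypothetical size of one array entry, for illustration only
```

Small estimates pass through unchanged, and oversized ones are reduced so that the first allocation request stays within the limit. If the input really does need more entries, the failure is deferred to the later attempt to enlarge the array.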
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
xl_tot_len comes first in a WAL record. Usually we don't trust it to be
the true length until we've validated the record header. If the record
header was split across two pages, previously we wouldn't do the
validation until after we'd already tried to allocate enough memory to
hold the record, which was bad because it might actually be garbage
bytes from a recycled WAL file, so we could try to allocate a lot of
memory. Release 15 made it worse.
Since 70b4f82a4b5, we'd at least generate an end-of-WAL condition if the
garbage 4 byte value happened to be > 1GB, but we'd still try to
allocate up to 1GB of memory bogusly otherwise. That was an
improvement, but unfortunately release 15 tries to allocate another
object before that, so you could get a FATAL error and recovery could
fail.
We can fix both variants of the problem more fundamentally using
pre-existing page-level validation, if we just re-order some logic.
The new order of operations in the split-header case defers all memory
allocation based on xl_tot_len until we've read the following page. At
that point we know that its first few bytes are not recycled data, by
checking its xlp_pageaddr, and that its xlp_rem_len agrees with
xl_tot_len on the preceding page. That is strong evidence that
xl_tot_len was truly the start of a record that was logged.
This problem was most likely to occur on a standby, because
walreceiver.c recycles WAL files without zeroing out trailing regions of
each page. We could fix that too, but it wouldn't protect us from rare
crash scenarios where the trailing zeroes don't make it to disk.
With reliable xl_tot_len validation in place, the ancient policy of
considering malloc failure to indicate corruption at end-of-WAL seems
quite surprising, but changing that is left for later work.
Also included is a new TAP test to exercise various cases of end-of-WAL
detection by writing contrived data into the WAL from Perl.
Back-patch to 12. We decided not to put this change into the final
release of 11.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com> (the idea, not the code)
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17928-aa92416a70ff44a2%40postgresql.org
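
The reordered validation for a header split across two pages can be modeled roughly as below. This is a sketch under stated assumptions, not the reader's code: `split_header_looks_valid` and its parameters are hypothetical, and the real code performs further page-header checks that are omitted here.

```python
def split_header_looks_valid(xl_tot_len: int, bytes_on_first_page: int,
                             expected_pageaddr: int, xlp_pageaddr: int,
                             xlp_rem_len: int) -> bool:
    """Decide whether xl_tot_len can be trusted before allocating memory.

    The following page must carry the expected address (so it is not
    recycled data) and its remaining-length field must agree with the
    length claimed on the preceding page.
    """
    if xlp_pageaddr != expected_pageaddr:
        return False  # page content comes from a recycled WAL file
    return xlp_rem_len == xl_tot_len - bytes_on_first_page


def buffer_size_to_allocate(xl_tot_len: int, valid: bool) -> int:
    """Allocate based on xl_tot_len only after validation; otherwise nothing."""
    return xl_tot_len if valid else 0
```

A garbage length left over from a recycled file fails one of the two checks, so no allocation based on it is attempted.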
|