As in commits 6301c3ada and e9d9ba2a4, avoid doing repetitive
list_delete_first() operations, since that would be expensive when
there are many files waiting to be unlinked. This is a slightly
larger change than in those cases. We have to keep the list state
valid for calls to AbsorbSyncRequests(), so it's necessary to invent a
"canceled" field instead of immediately deleting PendingUnlinkEntry
entries. Also, because we might not be able to process all the
entries, we need a new list primitive list_delete_first_n().
list_delete_first_n() is almost list_copy_tail(), but it modifies the
input List instead of making a new copy. I found a couple of existing
uses of the latter that could profitably use the new function. (There
might be more, but the other callers look like they probably shouldn't
overwrite the input List.)
As before, back-patch to v13.
Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
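
For illustration, here is a minimal standalone sketch of why batching the
deletion matters for an array-based list; ToyList and both helpers are
illustrative stand-ins, not the actual List code:

    #include <string.h>

    /* Toy array-backed list, standing in for the array-based List. */
    typedef struct
    {
        int    *elems;
        int     length;
    } ToyList;

    /* O(length) per call: deleting the head shifts every survivor. */
    static void
    toy_delete_first(ToyList *list)
    {
        memmove(&list->elems[0], &list->elems[1],
                (list->length - 1) * sizeof(int));
        list->length--;
    }

    /* O(length) total: drop the first n entries with a single shift,
     * mirroring what list_delete_first_n() provides for List. */
    static void
    toy_delete_first_n(ToyList *list, int n)
    {
        memmove(&list->elems[0], &list->elems[n],
                (list->length - n) * sizeof(int));
        list->length -= n;
    }

Calling toy_delete_first() N times shifts the tail N times, for O(N^2)
work overall; the batched form does one shift.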
In the same spirit as 6301c3ada, fix some more places where we were
using list_delete_first() in a loop and thereby risking O(N^2)
behavior. It's not clear that the lists manipulated in these spots
can get long enough to be really problematic ... but it's not clear
that they can't, either, and the fixes are simple enough.
As before, back-patch to v13.
Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
Failing to do so leaves logical decoding unable to process the
WAL stream. Handle it by doing nothing.
Backpatch all the way back.
Reported-by: Petr Jelínek <petr.jelinek@enterprisedb.com>
The opclass parameter Datums from the old index are fetched in the same
way as for predicates and expressions, by grabbing them directly from
the system catalogs. They are then copied into the new IndexInfo that
will be used for the creation of the new copy.
Previously, these parameters were not copied, causing the new index to be
rebuilt with default parameters rather than the ones defined by the user.
The only way to get back an index with the correct opclass parameters
would be to recreate the index from scratch.
The issue was introduced by 911e702.
Author: Michael Paquier
Reviewed-by: Zhihong Yu
Discussion: https://postgr.es/m/YX0CG/QpLXcPr8HJ@paquier.xyz
Backpatch-through: 13
When replaying a transaction that held many exclusive locks on the
primary, a standby server's startup process would expend O(N^2)
effort on manipulating the list of locks. This code was fine when
written, but commit 1cff1b95a made repetitive list_delete_first()
calls inefficient, as explained in its commit message. Fix by just
iterating the list normally, and releasing storage only when done.
(This'd be inadequate if we needed to recover from an error occurring
partway through; but we don't.)
Back-patch to v13 where 1cff1b95a came in.
Nathan Bossart
Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com
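
The fixed shape is just the ordinary list-iteration idiom; a hedged
sketch (the per-entry handling is illustrative, not the actual standby
code):

    ListCell   *cell;

    foreach(cell, locks)
    {
        xl_standby_lock *lock = (xl_standby_lock *) lfirst(cell);

        release_recovery_lock(lock);    /* illustrative per-entry work */
    }
    list_free_deep(locks);              /* release storage only when done */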
Commit d168b66682, which overhauled index deletion, added a
pg_unreachable() to the end of a sort comparator used when sorting heap
TIDs from an index page. This allows the compiler to apply
optimizations that assume that the heap TIDs from the index AM must
always be unique.
That doesn't seem like a good idea now, given recent reports of
corruption involving duplicate TIDs in indexes on Postgres 14. Demote
to an assertion, just in case.
Backpatch: 14-, where index deletion was overhauled.
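
The distinction matters because pg_unreachable() licenses the compiler to
assume the branch can never execute, while an assertion merely checks it
in assert-enabled builds and otherwise falls through to defined behavior.
A standalone toy sketch of the pattern (not the actual comparator):

    #include <assert.h>

    static int
    toy_tid_cmp(const void *arg1, const void *arg2)
    {
        int     a = *(const int *) arg1;
        int     b = *(const int *) arg2;

        if (a < b)
            return -1;
        if (a > b)
            return 1;

        assert(!"duplicate TIDs");      /* corruption, but only checked
                                         * in debug builds */
        return 0;                       /* defined result regardless */
    }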
Oversight in commit a5213adf.
Backpatch: 13-, just like commit a5213adf.
Add more defensive checks around posting list split code. These should
detect corruption involving duplicate table TIDs earlier and more
reliably than any existing check.
Follow up to commit 8f72bbac.
Discussion: https://postgr.es/m/CAH2-WzkrSY_kjyd1_M5xJK1uM0govJXMxPn8JUSvwcUOiHuWVw@mail.gmail.com
Backpatch: 13-, where nbtree deduplication was introduced.
The previous coding relied on the memory for the slots being zeroed
elsewhere, which, while true in this case, is not a contract which is
guaranteed to hold. Explicitly clear the tts_isnull array
to ensure that the slots are filled from a known state.
Backpatch to v14 where the catalog multi-inserts were introduced.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAJ7c6TP0AowkUgNL6zcAK-s5HYsVHVBRWfu69FRubPpfwZGM9A@mail.gmail.com
Backpatch-through: 14
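
A hedged sketch of the known-state pattern with the slot API (the column
values and surrounding calls are illustrative):

    ExecClearTuple(slot);
    memset(slot->tts_isnull, false,
           slot->tts_tupleDescriptor->natts * sizeof(bool));
    slot->tts_values[0] = ObjectIdGetDatum(classid);    /* illustrative */
    ExecStoreVirtualTuple(slot);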
This reverts commit 671eb8f34404d24c8f16ae40e94becb38afd93bb. The removed
wait events are used by some extensions and removal of these would force a
recompile of those extensions. We don't want that for released branches.
Discussion: https://postgr.es/m/E1mdOBY-0005j2-QL@gemulon.postgresql.org
It doesn't work (it could, but hasn't been implemented).
Back-patch to 12, where shared_memory_type arrived.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/163271880203.22789.1125998876173795966@wrigleys.postgresql.org
The purpose of commit 8a54e12a38d1545d249f1402f66c8cde2837d97c was to
fix this, and it sufficed when the PREPARE TRANSACTION completed before
the CIC looked for lock conflicts. Otherwise, things still broke. As
before, in a cluster having used CIC while having enabled prepared
transactions, queries that use the resulting index can silently fail to
find rows. It may be necessary to reindex to recover from past
occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index
builds by making CIC wait for arbitrarily-recent prepared transactions
and for ordinary transactions that may yet PREPARE TRANSACTION. As part
of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC
before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all
supported versions).
Andrey Borodin, reviewed (in earlier versions) by Andres Freund.
Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru
CIC and REINDEX CONCURRENTLY assume backends see their catalog changes
no later than each backend's next transaction start. That failed to
hold when a backend absorbed a relevant invalidation in the middle of
running RelationBuildDesc() on the CIC index. Queries that use the
resulting index can silently fail to find rows. Fix this for future
index builds by making RelationBuildDesc() loop until it finishes
without accepting a relevant invalidation. It may be necessary to
reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices.
Back-patch to 9.6 (all supported versions).
Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres
Freund.
Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com
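
The retry loop can be sketched as follows; the counter is illustrative
(the real fix tracks in-progress relcache builds and flags them when a
matching invalidation is absorbed):

    Relation    rel;

    for (;;)
    {
        uint64      seen = invalidation_counter;    /* illustrative */

        rel = build_relation_descriptor(relid);     /* illustrative */
        if (invalidation_counter == seen)
            break;              /* no invalidation absorbed mid-build */

        free_relation_descriptor(rel);              /* stale; rebuild */
    }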
This was originally done in commit 5e77625b26 for 15 only, as a
troubleshooting aid, but multiple people showed interest in
back-patching it.
Author: Jeremy Schneider
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com
Commit 464824323e introduced wait events that were used neither by
that commit nor by any follow-up commits for that work.
Author: Masahiro Ikeda
Backpatch-through: 14, where it was introduced
Discussion: https://postgr.es/m/ff077840-3ab2-04dd-bbe4-4f5dfd2ad481@oss.nttdata.com
Creating a new database from a template database with shared dependencies
that need to be copied over could corrupt pg_shdepend, because of an
off-by-one error in computing the index number used for the values
inserted with a slot.
Issue introduced by e3931d0. A check over the rest of the code shows no
similar mistakes.
Reported-by: Sven Klemm
Author: Aleksander Alekseev
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CAJ7c6TP0AowkUgNL6zcAK-s5HYsVHVBRWfu69FRubPpfwZGM9A@mail.gmail.com
Backpatch-through: 14
Commit 1b5d797cd4f7 intended to relax the lock level used to rename
indexes, but inadvertently allowed *any* relation to be renamed with a
lowered lock level, as long as the command is spelled ALTER INDEX.
That's undesirable for other relation types, so retry the operation with
the higher lock if the relation turns out not to be an index.
After this fix, ALTER INDEX <sometable> RENAME will require access
exclusive lock, which it didn't before.
Author: Nathan Bossart <bossartn@amazon.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Onder Kalaci <onderk@microsoft.com>
Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com
An update such as "UPDATE ... SET fld[n].subfld = whatever"
failed if the array elements were domains rather than plain
composites. That's because isAssignmentIndirectionExpr()
failed to cope with the CoerceToDomain node that would appear
in the expression tree in this case. The result would typically
be a crash, and even if we accidentally didn't crash, we'd not
correctly preserve other fields of the same array element.
Per report from Onder Kalaci. Back-patch to v11 where arrays of
domains came in.
Discussion: https://postgr.es/m/PH0PR21MB132823A46AA36F0685B7A29AD8BD9@PH0PR21MB1328.namprd21.prod.outlook.com
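
The gist of the fix is to look through domain coercions when deciding
whether an expression tree is an assignment indirection. A simplified
sketch (the real isAssignmentIndirectionExpr() also insists on a
CaseTestExpr at the bottom of the chain):

    static bool
    is_assignment_indirection(Node *node)
    {
        if (node == NULL)
            return false;
        if (IsA(node, FieldStore) || IsA(node, SubscriptingRef))
            return true;                /* simplified */
        if (IsA(node, RelabelType))
            return is_assignment_indirection(
                (Node *) ((RelabelType *) node)->arg);
        if (IsA(node, CoerceToDomain))  /* the previously missed case */
            return is_assignment_indirection(
                (Node *) ((CoerceToDomain *) node)->arg);
        return false;
    }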
I think when I added this assertion (in commit 8f889b108), I was only
thinking of the use of transformExpressionList at top level of INSERT
and VALUES. But it's also called by transformRowExpr(), which can
certainly occur in an UPDATE targetlist, so it's inappropriate to
suppose that p_multiassign_exprs must be empty. Besides, since the
input is not expected to contain ResTargets, there's no reason it
should contain MultiAssignRefs either. Hence this code need not
be concerned about the state of p_multiassign_exprs, and we should
just drop the assertion.
Per bug #17236 from ocean_li_996. It's been wrong for years,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17236-3210de9bcba1d7ca@postgresql.org
The grammar of this command run on indexes with column names has always
been authorized by the parser, and it has never been documented.
Since 911e702, it is possible to define opclass parameters in CREATE
INDEX, which broke the old case of ALTER INDEX/TABLE where the
relation-level parameters n_distinct and n_distinct_inherited could be
defined for an index (see 76a47c0 and its thread, where this point was
touched on but remained unused). Attempting to do that in v13~
would cause the index to become unusable, as there is a new dedicated
code path to load opclass parameters instead of the relation-level ones
previously available. Note that it is possible to fix things with a
manual catalog update to bring the relation back online.
This commit disables this command for now as the use of column names for
indexes does not make sense anyway, particularly when it comes to index
expressions, where names are automatically computed. One way to properly
support this case in the future would be to use column numbers when it
comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN
.. SET STATISTICS.
Partitioned indexes were already blocked, but plain indexes were not.
Some tests are added for both cases.
ANALYZE had some code to enforce the use of n_distinct for an index
expression if the parameter was defined, but as index-level parameters
never had support in pg_dump either, this was just dead code; remove it
for now, until/if there is support for this.
Reported-by: Matthijs van der Vleuten
Author: Nathan Bossart, Michael Paquier
Reviewed-by: Vik Fearing, Dilip Kumar
Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org
Backpatch-through: 13
Failing to do that, any direct inserts/updates of those partitions
would fail to enforce the correct constraint, that is, one that
considers the new partition constraint of their parent table.
Backpatch to 10.
Reported by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
During a replication slot creation, an ERROR generated in the same
transaction as the one creating a to-be-exported snapshot would have
left the backend in an inconsistent state, as the associated static
export snapshot state was not being reset on transaction abort, but only
on the follow-up command received by the WAL sender that created this
snapshot on replication slot creation. This would trigger inconsistency
failures if this session tried to export a snapshot again, like during
the creation of a replication slot.
Note that a snapshot export cannot happen in a transaction block, so
there is no need to worry about resetting this state for subtransaction
aborts. Also, this inconsistent state would be very unlikely to show up
for users. For example, one case where this could happen is an
out-of-memory error when building the initial snapshot to-be-exported.
Dilip found this problem while poking at a different patch that caused
an error in this code path for reasons unrelated to HEAD.
Author: Dilip Kumar
Reviewed-by: Michael Paquier, Zhihong Yu
Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com
Backpatch-through: 9.6
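
The underlying pattern: static module state has to be cleared by
transaction abort itself, not by whatever command the session runs next.
A hedged sketch (names illustrative of snapbuild.c's export state):

    static Snapshot ExportedSnapshot = NULL;    /* illustrative */
    static bool     ExportInProgress = false;   /* illustrative */

    /* Called from transaction abort, so an ERROR can't leave stale state. */
    void
    ResetSnapshotExportState(void)
    {
        ExportedSnapshot = NULL;
        ExportInProgress = false;
    }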
An extension may want to call GetSecurityLabel() on a shared object
before the shared relcaches are fully initialized. For instance, a
ClientAuthentication_hook might want to retrieve the security label on
a role.
Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com
Backpatch-through: 9.6
If a function-in-FROM laterally references the output of some sub-SELECT
earlier in the FROM clause, and we are able to flatten that sub-SELECT
into the outer query, the expression(s) copied into the function RTE
missed being processed by eval_const_expressions. This'd lead to trouble
and probable crashes at execution if such expressions contained
named-argument function call syntax or functions with defaulted arguments.
The bug is masked if the query contains any explicit JOIN syntax, which
may help explain why we'd not noticed.
Per bug #17227 from Bernd Dorn. This is an oversight in commit 7266d0997,
so back-patch to v13 where that came in.
Discussion: https://postgr.es/m/17227-5a28ed1512189fa4@postgresql.org
The code was freeing the name of the multirange type function stored in
the parse tree, but it should not do that. Event triggers could, for
example, look at such a corrupted parse tree with a ddl_command_end
event.
Author: Alex Kozhemyakin, Sergey Shinderuk
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/d5042d46-b9cd-6efb-219a-71ed0cf45bc8@postgrespro.ru
Backpatch-through: 14
Previously, when pg_log_backend_memory_contexts() sent the request to
the autovacuum launcher, it could take more than several seconds to
log its memory contexts, because HandleAutoVacLauncherInterrupts(),
the function that processes any new interrupts the autovacuum launcher
receives, didn't handle the request for logging of memory contexts.
This commit changes the function so that it handles the request, to
make the autovacuum launcher more responsive to
pg_log_backend_memory_contexts().
Back-patch to v14 where pg_log_backend_memory_contexts() was added.
Author: Koyu Tanigawa
Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
Discussion: https://postgr.es/m/0aae3e074face409b35153451be5cc11@oss.nttdata.com
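
The fix follows the usual interrupt-flag pattern; a simplified sketch of
the relevant portion of HandleAutoVacLauncherInterrupts() (other checks
omitted):

    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }

    /* The addition: honor pending memory-context-logging requests too. */
    if (LogMemoryContextPending)
        ProcessLogMemoryContextInterrupt();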
Commit 3f50b8263 had an oversight: formerly, to deparse expressions
attached to a plan node, it was only necessary to update the
deparse_namespace ancestors list alongside calling set_deparse_plan.
Now it's necessary to update the ancestors list *first*, because
set_deparse_plan consults it, and one call site got that wrong.
This error was masked in most cases because explain.c uses just one
List object for the ancestors list, updating it in-place as the plan
is scanned, so that we accidentally had the right List assigned to
dpns->ancestors before it was needed. It would fail only if a
WorkTableScan node were the first one that we tried to deparse a
subexpression of.
Per report from Markus Winand. Like the previous patch,
back-patch to v14.
Discussion: https://postgr.es/m/648B0505-AA57-42C2-A2DA-E551DE46FA15@winand.at
Author: Amit Langote
Backpatch-through: 13
Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com
This fixes a loss of precision that occurs when the first input is
very close to 1, so that its logarithm is very small.
Formerly, during the initial low-precision calculation to estimate the
result weight, the logarithm was computed to a local rscale that was
capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be
as close as 1e-16383 to 1, hence its logarithm may be as small as
1e-16383, and so the local rscale needs to be allowed to exceed 16383,
otherwise all precision is lost, leading to a poor choice of rscale
for the full-precision calculation.
Fix this by removing the cap on the local rscale during the initial
low-precision calculation, as we already do in the full-precision
calculation. This doesn't change the fact that the initial calculation
is a low-precision approximation, computing the logarithm to around 8
significant digits, which is very fast, especially when the base is
very close to 1.
Patch by me, reviewed by Alvaro Herrera.
Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com
Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.
Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.
As SharedRecoveryState is the point of reference to know whether it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so that we can never finish with empty snapshots.
This exists since the introduction of hot standby, so backpatch all the
way down. The window with incorrect snapshots is extremely small, but I
have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm
member fairywren. Thomas Munro also found it independently. Special
thanks to Andres Freund for taking the time to analyze this issue.
Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
Prior to v14, we insisted that the query in RETURN QUERY be of a type
that returns tuples. (For instance, INSERT RETURNING was allowed,
but not plain INSERT.) That happened indirectly because we opened a
cursor for the query, so spi.c checked SPI_is_cursor_plan(). As a
consequence, the error message wasn't terribly on-point, but at least
it was there.
Commit 2f48ede08 lost this detail. Instead, plain RETURN QUERY
insisted that the query be a SELECT (by checking for SPI_OK_SELECT)
while RETURN QUERY EXECUTE failed to check the query type at all.
Neither of these changes was intended.
The only convenient place to check this in the EXECUTE case is inside
_SPI_execute_plan, because we haven't done parse analysis until then.
So we need to pass down a flag saying whether to enforce that the
query returns tuples. Fortunately, we can squeeze another boolean
into struct SPIExecuteOptions without an ABI break, since there's
padding space there. (It's unlikely that any extensions would
already be using this new struct, but preserving ABI in v14 seems
like a smart idea anyway.)
Within spi.c, it seemed like _SPI_execute_plan's parameter list
was already ridiculously long, and I didn't want to make it longer.
So I thought of passing SPIExecuteOptions down as-is, allowing that
parameter list to become much shorter. This makes the patch a bit
more invasive than it might otherwise be, but it's all internal to
spi.c, so that seems fine.
Per report from Marc Bachmann. Back-patch to v14 where the
faulty code came in.
Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com
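
To see why the extra bool is ABI-safe: the flag fields sit between
pointer-sized members, leaving padding bytes a new bool can occupy
without moving any existing offset or changing the struct size. An
illustrative layout (not the exact v14 declaration):

    typedef struct SPIExecuteOptionsSketch
    {
        ParamListInfo   params;             /* 8-byte aligned pointer */
        bool            read_only;
        bool            allow_nonatomic;
        bool            must_return_tuples; /* new flag: fits in what was
                                             * already padding */
        /* remaining padding up to the next 8-byte boundary */
        uint64          tcount;
        DestReceiver   *dest;
    } SPIExecuteOptionsSketch;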
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES breaks expectations when it comes
to rows returned to other sessions accessing the same row. Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid it for now, with the potential to fix it in the future.
[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org
Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
Commit ff9f111bce24 added some test code that's unportable and doesn't
add meaningful coverage. Remove it rather than try to get it to work
everywhere.
While at it, fix a typo in a log message added by the aforementioned
commit.
Backpatch to 14.
Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
get_variable_range() would incautiously believe that statistics
containing only an MCV list are sufficient to derive a range estimate.
That's okay for an enum-like column that contains only MCVs, but
otherwise the estimate could be pretty bad. Make it report that the
range is indeterminate unless the MCVs plus nullfrac account for
the whole table.
I don't think this needs a dedicated test case, since a quick code
coverage check verifies that the existing regression tests traverse
all the alternatives. There is room to doubt that a future-proof
test case could be built anyway, given that the submitted example
accidentally doesn't fail before v11.
Per bug #17207 from Simon Perepelitsa. Back-patch to v10.
In principle this has been broken all along, but I'm hesitant to
make such changes in 9.6, since if anyone is unhappy with 9.6.24's
behavior there will be no second chance to fix it.
Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
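
The tightened condition amounts to something like this (names
illustrative of the get_variable_range() logic):

    /* Trust an MCV-only range estimate only when MCVs plus NULLs
     * account for (essentially) the whole table. */
    if (sumcommon + nullfrac <= 0.99999)    /* allow for roundoff */
        return false;                       /* range indeterminate */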
Commit 84f5c2908 forgot to consider the possibility that
EnsurePortalSnapshotExists could run inside a subtransaction with
lifespan shorter than the Portal's. In that case, the new active
snapshot would be popped at the end of the subtransaction, leaving
a dangling pointer in the Portal, with mayhem ensuing.
To fix, make sure the ActiveSnapshot stack entry is marked with
the same subtransaction nesting level as the associated Portal.
It's certainly safe to do so since we won't be here at all unless
the stack is empty; hence we can't create an out-of-order stack.
Let's also apply this logic in the case where PortalRunUtility
sets portalSnapshot, just to be sure that path can't cause similar
problems. It's slightly less clear that that path can't create
an out-of-order stack, so add an assertion guarding it.
Report and patch by Bertrand Drouvot (with kibitzing by me).
Back-patch to v11, like the previous commit.
Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
Physical replication always ships WAL segment files to replicas once
they are complete. This is a problem if one WAL record is split across
a segment boundary and the primary server crashes before writing down
the segment with the next portion of the WAL record: WAL writing after
crash recovery would happily resume at the point where the broken record
started, overwriting that record ... but any standby or backup may have
already received a copy of that segment, and they are not rewinding.
This causes standbys to stop following the primary after the latter
crashes:
LOG: invalid contrecord length 7262 at A8/D9FFFBC8
because the standby is still trying to read the continuation record
(contrecord) for the original long WAL record, but it is not there and
it will never be. A workaround is to stop the replica, delete the WAL
file, and restart it -- at which point a fresh copy is brought over from
the primary. But that's pretty labor intensive, and I bet many users
would just give up and re-clone the standby instead.
A fix for this problem was already attempted in commit 515e3d84a0b5, but
it only addressed the case for the scenario of WAL archiving, so
streaming replication would still be a problem (as well as other things
such as taking a filesystem-level backup while the server is down after
having crashed), and it had performance scalability problems too; so it
had to be reverted.
This commit fixes the problem using an approach suggested by Andres
Freund, whereby the initial portion(s) of the split-up WAL record are
kept, and a special type of WAL record is written where the contrecord
was lost, so that WAL replay in the replica knows to skip the broken
parts. With this approach, we can continue to stream/archive segment
files as soon as they are complete, and replay of the broken records
will proceed across the crash point without a hitch.
Because a new type of WAL record is added, users should be careful to
upgrade standbys first, primaries later. Otherwise they risk the standby
being unable to start if the primary happens to write such a record.
A new TAP test that exercises this is added, but the portability of it
is yet to be seen.
This has been wrong since the introduction of physical replication, so
backpatch all the way back. In stable branches, keep the new
XLogReaderState members at the end of the struct, to avoid an ABI
break.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
The code inconsistently used "statistic object" or "statistics" where
the correct term, as discussed, is actually "statistics object". This
improves the state of the code to be more consistent.
While at it, fix an incorrect error message introduced in a4d75c8. This
error should never happen, as the code states, but if it did, the
message would be misleading.
Author: Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
Backpatch-through: 14
Calculating the degree of a functional dependency may allocate a lot of
memory - we have released most of the explicitly allocated memory, but
e.g. detoasted varlena values were left behind. That may be an issue,
because we consider a lot of dependencies (all combinations), and the
detoasting may happen again for each one.
Fixed by calling dependency_degree() in a dedicated context, and
resetting it after each call. We only need the calculated dependency
degree, so we don't need to copy anything.
Backpatch to PostgreSQL 10, where extended statistics were introduced.
Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
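
The fix's shape is the stock memory-context dance; a hedged sketch (loop
and function arguments illustrative):

    MemoryContext cxt = AllocSetContextCreate(CurrentMemoryContext,
                                              "dependency degree",
                                              ALLOCSET_DEFAULT_SIZES);

    for (int i = 0; i < ndeps; i++)             /* illustrative loop */
    {
        MemoryContext oldcxt = MemoryContextSwitchTo(cxt);

        degrees[i] = dependency_degree(data, k, dep);   /* illustrative
                                                         * arguments */

        MemoryContextSwitchTo(oldcxt);
        MemoryContextReset(cxt);        /* drop detoasted values etc. */
    }

    MemoryContextDelete(cxt);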
Until now, all extended statistics on a given relation were built in the
same memory context, without resetting. Some of the memory was released
explicitly, but not all of it - for example memory allocated while
detoasting values is hard to free. This is how it worked since extended
statistics were introduced in PostgreSQL 10, but adding support for
extended stats on expressions made the issue somewhat worse as it
increases the number of statistics to build.
Fixed by adding a memory context which gets reset after building each
statistics object (all the statistics kinds included in it). Resetting
it after building each statistics kind would be even better, but it
would require more invasive changes and copying of results, making it
harder to backpatch.
Backpatch to PostgreSQL 10, where extended statistics were introduced.
Author: Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache and the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a
publication, which would keep prohibiting Updates/Deletes on its
partitions without replica identity even though no publication remained.
Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
It is not appropriate for deduplication to apply single value strategy
when triggered by a bottom-up index deletion pass. This wastes cycles
because later bottom-up deletion passes will overinterpret older
duplicate tuples that deduplication actually just skipped over "by
design". It also makes bottom-up deletion much less effective for low
cardinality indexes that happen to cross a meaningless "index has single
key value per leaf page" threshold.
To fix, slightly narrow the conditions under which deduplication's
single value strategy is considered. We already avoided the strategy
for a unique index, since our high level goal must just be to buy time
for VACUUM to run (not to buy space). We'll now also avoid it when we
just had a bottom-up pass that reported failure. The two cases share
the same high level goal, and already overlapped significantly, so this
approach is quite natural.
Oversight in commit d168b666, which added bottom-up index deletion.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznaOvM+Gyj-JQ0X=JxoMDxctDTYjiEuETdAGbF5EUc3MA@mail.gmail.com
Backpatch: 14-, where bottom-up deletion was introduced.
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot. I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement. That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list. As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.
(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem. There are no STABLE procedures.)
We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself. Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.
Per bug #17199 from Alexander Nawratil. Back-patch to v11,
like the previous patch.
Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
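
The shape of the fix in ExecuteCallStmt(), simplified:

    /* Evaluate the CALL statement's argument list under a fresh snapshot. */
    PushActiveSnapshot(GetTransactionSnapshot());
    /* ... build the procedure's FunctionCallInfo from the arguments ... */
    PopActiveSnapshot();

    /* Only now enter the procedure itself, so no extra snapshot is held
     * across any COMMIT or ROLLBACK it performs. */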
I noticed that commit 0bead9af484c left this flag undocumented in
XLogSetRecordFlags, which led me to discover that the flag doesn't
actually do what the one comment on it said it does. Improve the
situation by adding some more comments.
Backpatch to 14, where the aforementioned commit appears.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202109212119.c3nhfp64t2ql@alvherre.pgsql
A broken HOT chain is not an unexpected condition, even when the offset
number points past the end of the page's line pointer array.
heap_prune_chain() does not (and never has) treated this condition as
unexpected, so derivative code in heap_index_delete_tuples() shouldn't
do so either.
Oversight in commit 4228817449.
The assertion can probably only fail on Postgres 14 and master. Earlier
releases don't have commit 3c3b8a4b, which taught VACUUM to truncate the
line pointer array of heap pages. Backpatch all the same, just to be
consistent.
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org
Backpatch: 12-, just like commit 4228817449.
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 10b675b81a3a04bac460cb049e0b7b6e17fb4795
Since the introduction of extended statistics, we've disallowed references
to system columns. So for example
CREATE STATISTICS s ON ctid FROM t;
would fail. But with extended statistics on expressions, it was possible
to work around this limitation quite easily
CREATE STATISTICS s ON (ctid::text) FROM t;
This is an oversight in a4d75c86bf, fixed by adding a simple check.
Backpatch to PostgreSQL 14, where support for extended statistics on
expressions was introduced.
Backpatch-through: 14
Discussion: https://postgr.es/m/20210816013255.GS10479%40telsasoft.com
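
The added check can be sketched with the standard var-collection helpers
(error wording illustrative):

    Bitmapset  *attnums = NULL;
    int         k = -1;

    /* pull_varattnos() offsets attnums so system columns fit in the set */
    pull_varattnos(expr, 1, &attnums);
    while ((k = bms_next_member(attnums, k)) >= 0)
    {
        AttrNumber  attnum = k + FirstLowInvalidHeapAttributeNumber;

        if (attnum <= 0)
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("statistics creation on system columns is not supported")));
    }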
Commit 55dc86eca changed pull_varnos to use (if possible) the associated
ph_eval_at for a PlaceHolderVar. I missed a fine point though: we might
be looking at a PHV in the quals or tlist of a child appendrel, in which
case we need to compute a ph_eval_at value that's been translated in the
same way that the PHV itself has been (cf. adjust_appendrel_attrs).
Fortunately, enough info is available in the PlaceHolderInfo to make
such translation possible without additional outside data, so we don't
need another round of uglification of planner APIs. This is a little
bit complicated, but since it's a hard-to-hit corner case, I'm not much
worried about adding cycles here.
Per report from Jaime Casanova. Back-patch to v12, like the previous
commit.
Discussion: https://postgr.es/m/20210915230959.GB17635@ahch-to
The rewriter transformation for SEARCH BREADTH FIRST produces a
FieldSelect on a Var of type RECORD, where the Var references the
recursive union's worktable output. EXPLAIN VERBOSE failed to handle
this case, because it only expected such Vars to appear in CteScans
not WorkTableScans. Fix that, and add some test cases exercising
EXPLAIN on SEARCH and CYCLE queries.
In principle this oversight is an old bug, but it seems that the
case is unreachable without SEARCH BREADTH FIRST, because the
parser fails when attempting to create such a reference manually.
So for today I'll just patch HEAD/v14. Someday we might find that
the code portion of this patch needs to be back-patched further.
Per report from Atsushi Torikoshi.
Discussion: https://postgr.es/m/5bafa66ad529e11860339565c9e7c166@oss.nttdata.com
Session statistics, as introduced by 960869da08, had several shortcomings:

- an additional GetCurrentTimestamp() call that also impaired the accuracy
  of the data collected.  This can be avoided by passing the current
  timestamp we already have in pgstat_report_stat().

- an additional statistics UDP packet sent every 500ms.  This is solved by
  adding the new statistics to PgStat_MsgTabstat.  That is conceptually
  ugly, because session statistics are not table statistics, but the
  struct already contains data unrelated to tables, so there is not much
  damage done.  Connection and disconnection are reported in separate
  messages, which reduces the overhead to two additional messages per
  session and a slight increase in PgStat_MsgTabstat size (but the same
  number of table stats fit).

- Session time computation could overflow on systems where long is 32 bit.
Reported-By: Andres Freund <andres@anarazel.de>
Author: Andres Freund <andres@anarazel.de>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/20210801205501.nyxzxoelqoo4x2qc%40alap3.anarazel.de
Backpatch: 14-, where the feature was introduced.