| Commit message | Author | Age |
| |
The code for initializing the tapes on each merge iteration was skipped
in a parallel worker. I put the !WORKER(state) check in the wrong place
while rebasing the patch.
That caused failures in the index build in the 'multiple-row-versions'
isolation test, on multiple buildfarm members. On my laptop it was easier
to reproduce by building an index on a larger table, so that you got a
parallel sort more reliably.
| |
To make buildfarm member locust happy.
| |
The previous commit 65014000b3 changed the variable passed to elog
from int64 to size_t, but neglected to change the length modifier
in the format string accordingly.
Per failure on buildfarm member lapwing.
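For illustration, here is a self-contained sketch of the class of mismatch
involved (the variable and message are invented, not the ones from the
commit):

    #include <stdio.h>
    #include <stddef.h>

    int
    main(void)
    {
        size_t      nbytes = 1024;  /* hypothetical counter */

        /*
         * Wrong: if this variable changes from int64 to size_t while the
         * format string keeps a 64-bit conversion such as %lld, argument
         * and conversion no longer match, which is undefined behavior and
         * a -Wformat warning on 32-bit members such as lapwing.
         */
        /* printf("freed %lld bytes\n", nbytes); */

        /* Right: %zu is the conversion that matches size_t. */
        printf("freed %zu bytes\n", nbytes);
        return 0;
    }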
| |
The advantage of polyphase merge is that it can reuse the input tapes as
output tapes efficiently, but that is irrelevant on modern hardware, where
we can easily emulate any number of tape drives. The number of input tapes
we can/should use during merging is limited by work_mem, but output tapes
that we are not currently writing to only cost a little bit of memory, so
there is no need to skimp on them.
This makes sorts that need multiple merge passes faster.
Discussion: https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi
Reviewed-by: Peter Geoghegan, Zhihong Yu, John Naylor
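As a back-of-the-envelope illustration (not code from the commit): with a
balanced k-way merge, the number of passes over the data drops quickly as
the merge fan-in grows, which is exactly what multi-pass sorts gain here.

    #include <stdio.h>

    /* Passes a balanced k-way merge needs to reduce nruns initial runs
     * to one. */
    static int
    merge_passes(int nruns, int fanin)
    {
        int     passes = 0;

        while (nruns > 1)
        {
            nruns = (nruns + fanin - 1) / fanin;    /* one full merge pass */
            passes++;
        }
        return passes;
    }

    int
    main(void)
    {
        printf("%d\n", merge_passes(1000, 6));      /* 4 passes */
        printf("%d\n", merge_passes(1000, 32));     /* 2 passes */
        return 0;
    }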
| |
All the tape functions, like LogicalTapeRead and LogicalTapeWrite, now
take a LogicalTape as an argument, instead of a LogicalTapeSet plus tape
number. You can create any number of LogicalTapes in a single
LogicalTapeSet, and you don't need to decide the number up front when
you create the tape set.
This makes the tape management for hash agg spilling in nodeAgg.c
simpler.
Discussion: https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi
Reviewed-by: Peter Geoghegan, Zhihong Yu, John Naylor
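A sketch of the change in call style (argument lists abbreviated and
paraphrased; see logtape.h for the exact signatures):

    /* Before: a tape was addressed as tape set + tape number, and the
     * number of tapes was fixed when the set was created. */
    LogicalTapeWrite(lts, tapenum, data, len);

    /* After: tapes are created on demand and passed around as handles. */
    LogicalTape *tape = LogicalTapeCreate(lts);

    LogicalTapeWrite(tape, data, len);
    LogicalTapeRewindForRead(tape, buffer_size);
    nread = LogicalTapeRead(tape, buf, len);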
| |
During a replication slot creation, an ERROR generated in the same
transaction as the one creating a to-be-exported snapshot would have
left the backend in an inconsistent state, as the associated static
export snapshot state was not being reset on transaction abort, but only
on the follow-up command received by the WAL sender that created this
snapshot on replication slot creation. This would trigger inconsistency
failures if this session tried to export a snapshot again, like during
the creation of a replication slot.
Note that a snapshot export cannot happen in a transaction block, so
there is no need to worry about resetting this state for subtransaction
aborts. Also, this inconsistent state would be very unlikely to show up
for users. For example, one case where this could happen is an
out-of-memory error when building the initial snapshot to-be-exported.
Dilip found this problem while poking at a different patch that caused
an error in this code path for reasons unrelated to HEAD.
Author: Dilip Kumar
Reviewed-by: Michael Paquier, Zhihong Yu
Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com
Backpatch-through: 9.6
| |
Follow-up to commit 2903f140.
| |
Do not update shm_mq's mq_bytes_written until we have written
an amount of data greater than 1/4th of the ring size, unless
the caller of shm_mq_send(v) requests a flush at the end of
the message. This reduces the number of calls to SetLatch(),
and also the number of CPU cache misses, considerably, and thus
makes shm_mq significantly faster.
Dilip Kumar, reviewed by Zhihong Yu and Tomas Vondra. Some
minor cosmetic changes by me.
Discussion: http://postgr.es/m/CAFiTN-tVXqn_OG7tHNeSkBbN+iiCZTiQ83uakax43y1sQb2OBA@mail.gmail.com
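In outline, the batching looks like this (a simplification, with names
modeled on shm_mq.c rather than copied from it):

    /* In shm_mq_sendv(), after copying bytes into the ring buffer. */
    mqh->mqh_send_pending += nwritten;

    if (force_flush || mqh->mqh_send_pending > ringsize / 4)
    {
        /* Publish our progress and wake the receiver only now. */
        shm_mq_inc_bytes_written(mq, mqh->mqh_send_pending);
        SetLatch(&mq->mq_receiver->procLatch);
        mqh->mqh_send_pending = 0;
    }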
| |
An extension may want to call GetSecurityLabel() on a shared object
before the shared relcaches are fully initialized. For instance, a
ClientAuthentication_hook might want to retrieve the security label on
a role.
Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com
Backpatch-through: 9.6
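As an illustration, a minimal ClientAuthentication_hook that reads a
role's label might look like this (the provider name is hypothetical,
hook registration in _PG_init() is omitted, and error handling is
minimal):

    #include "postgres.h"

    #include "catalog/objectaddress.h"
    #include "catalog/pg_authid.h"
    #include "commands/seclabel.h"
    #include "libpq/auth.h"
    #include "utils/acl.h"

    static void
    my_client_auth(Port *port, int status)
    {
        Oid         roleid = get_role_oid(port->user_name, true);
        ObjectAddress role;
        char       *label;

        if (status != STATUS_OK || !OidIsValid(roleid))
            return;

        ObjectAddressSet(role, AuthIdRelationId, roleid);
        label = GetSecurityLabel(&role, "my_provider");     /* hypothetical */
        if (label != NULL)
            ereport(LOG, (errmsg("label for role \"%s\": %s",
                                 port->user_name, label)));
    }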
| |
If a function-in-FROM laterally references the output of some sub-SELECT
earlier in the FROM clause, and we are able to flatten that sub-SELECT
into the outer query, the expression(s) copied into the function RTE
missed being processed by eval_const_expressions. This'd lead to trouble
and probable crashes at execution if such expressions contained
named-argument function call syntax or functions with defaulted arguments.
The bug is masked if the query contains any explicit JOIN syntax, which
may help explain why we'd not noticed.
Per bug #17227 from Bernd Dorn. This is an oversight in commit 7266d0997,
so back-patch to v13 where that came in.
Discussion: https://postgr.es/m/17227-5a28ed1512189fa4@postgresql.org
| |
CreateOverwriteContrecordRecord(), UpdateFullPageWrites(),
PerformRecoveryXLogAction(), and CleanupAfterArchiveRecovery()
are moved somewhat later in StartupXLOG(). This is preparatory
work for a future patch that wants to allow recovery to end at one
time and only later start to allow WAL writes. To do that, it's
necessary to separate code that has to do with allowing WAL writes
from other things that need to happen simply because recovery is
ending, such as initializing shared memory data structures that
depend on information that might not be accurate before redo is
complete.
This commit does not achieve that goal, but it is a step in that
direction. For example, there are a few different bits of code that
write things into WAL once we have finished recovery, and with this
change, those bits of code are closer to each other than previously,
with fewer unrelated bits of code interspersed.
Robert Haas and Amul Sul
Discussion: http://postgr.es/m/CAAJ_b97abMuq=470Wahun=aS1PHTSbStHtrjjPaD-C0YQ1AqVw@mail.gmail.com
| |
Create a new function PerformRecoveryXLogAction() and move the
code which either writes an end-of-recovery record or requests a
checkpoint there.
Also create a new function CleanupAfterArchiveRecovery() to
perform a few tasks that we want to do after we've actually exited
archive recovery but before we start accepting new WAL writes.
More refactoring of this file is planned, but this commit is
just straightforward code movement to make StartupXLOG() a
little bit shorter and a little bit easier to understand.
Robert Haas and Amul Sul
Discussion: http://postgr.es/m/CAAJ_b97abMuq=470Wahun=aS1PHTSbStHtrjjPaD-C0YQ1AqVw@mail.gmail.com
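In rough outline (a paraphrase, not the committed code), the moved logic
amounts to:

    /* Sketch of PerformRecoveryXLogAction(): WAL action at end of recovery. */
    static void
    PerformRecoveryXLogAction(bool promoted)
    {
        if (promoted)
            CreateEndOfRecoveryRecord();    /* fast promotion: cheap record
                                             * now, full checkpoint later */
        else
            RequestCheckpoint(CHECKPOINT_END_OF_RECOVERY |
                              CHECKPOINT_IMMEDIATE |
                              CHECKPOINT_WAIT);
    }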
| |
The code was freeing the name of the multirange type function stored in
the parse tree, but it should not do that. Event triggers could, for
example, look at such a corrupted parse tree with a ddl_command_end
event.
Author: Alex Kozhemyakin, Sergey Shinderuk
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/d5042d46-b9cd-6efb-219a-71ed0cf45bc8@postgrespro.ru
Backpatch-through: 14
| |
Sometimes, we replace a symbolic link that we find in the data
directory with an actual directory within the tarfile that we
create. _tarWriteDir was responsible both for making this
substitution and also for writing the tar header for the
resulting directory into the tar file. Make it do only the first
of those things, and rename to convert_link_to_directory.
Substantially larger refactoring of this source file is planned,
but this little bit seemed to make sense to commit
independently.
Discussion: http://postgr.es/m/CA+Tgmobz6tuv5tr-WxURe5JA1vVcGz85k4kkvoWxcyHvDpEqFA@mail.gmail.com
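After the split, a call site reads roughly like this (paraphrased):

    /* Substitute a directory for the symlink, then emit the tar header. */
    convert_link_to_directory(pathbuf, &statbuf);
    size += _tarWriteHeader(pathbuf + basepathlen + 1, NULL, &statbuf,
                            sizeonly);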
| |
Previously, when pg_log_backend_memory_contexts() sent the request to
the autovacuum launcher, it could take several seconds or more to log
its memory contexts, because HandleAutoVacLauncherInterrupts(), the
function that processes any new interrupts the autovacuum launcher
receives, didn't handle the request for logging of memory contexts.
This commit changes the function so that it handles the request, making
the autovacuum launcher more responsive to
pg_log_backend_memory_contexts().
Back-patch to v14 where pg_log_backend_memory_contexts() was added.
Author: Koyu Tanigawa
Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
Discussion: https://postgr.es/m/0aae3e074face409b35153451be5cc11@oss.nttdata.com
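The change itself is a one-liner in spirit (sketch):

    /* In HandleAutoVacLauncherInterrupts(), with the other interrupt checks. */
    if (LogMemoryContextPending)
        ProcessLogMemoryContextInterrupt();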
| |
Commit 3f50b8263 had an oversight: formerly, to deparse expressions
attached to a plan node, it was only necessary to update the
deparse_namespace ancestors list alongside calling set_deparse_plan.
Now it's necessary to update the ancestors list *first*, because
set_deparse_plan consults it, and one call site got that wrong.
This error was masked in most cases because explain.c uses just one
List object for the ancestors list, updating it in-place as the plan
is scanned, so that we accidentally had the right List assigned to
dpns->ancestors before it was needed. It would fail only if a
WorkTableScan node were the first one that we tried to deparse a
subexpression of.
Per report from Markus Winand. Like the previous patch,
back-patch to v14.
Discussion: https://postgr.es/m/648B0505-AA57-42C2-A2DA-E551DE46FA15@winand.at
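Schematically, the fix swaps two steps at the affected call site
(paraphrased):

    /* Wrong: set_deparse_plan() already consults dpns->ancestors here. */
    set_deparse_plan(dpns, plan);
    dpns->ancestors = ancestors;

    /* Right: install the ancestors list first. */
    dpns->ancestors = ancestors;
    set_deparse_plan(dpns, plan);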
| |
This is similar to fd0625c, taking care of any remaining code paths that
are worth the cleanup. This also changes some cases using opposite
expression patterns.
Author: Justin Pryzby, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoCdF8dnUvr-BUWWGvA_XhKSoANacBMZb6jKyCk4TYfQ2Q@mail.gmail.com
| |
send_message_to_server_log() would force a redirection of a log entry to
stderr in some cases for csvlog, like the syslogger not being available
yet. If this happened, csvlog would fall back to stderr to log some
information rather than nothing. The code was organized so that stderr
was handled before csvlog, with csvlog checking, via a reversed
condition, that stderr had not happened yet. With this code
organization, it was possible to lose some messages when running
Postgres as a service on WIN32, as there is no usable stderr, and the
handling of the StringInfoData holding the message for stderr was rather
confusing because of that.
This commit moves the csvlog handling before stderr, at which point we
are able to track whether it is still necessary to log something to
stderr. This reduces the handling of stderr to a single code path,
adding a fallback to event logs for a WIN32 service. It also simplifies
the way we handle the StringInfoData for stderr, making it easier to
integrate new file-based log destinations. I got to play with services
and event logs on Windows while checking this change.
Reviewed-by: Chris Bandy
Discussion: https://postgr.es/m/YV0vwBovEKf1WXkl@paquier.xyz
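The reworked flow is roughly as follows (a simplified sketch;
fallback_to_stderr stands in for the state the csvlog step reports back):

    /* csvlog first: it can flag whether stderr output is still required. */
    if (Log_destination & LOG_DESTINATION_CSVLOG)
        write_csvlog(edata);

    if ((Log_destination & LOG_DESTINATION_STDERR) || fallback_to_stderr)
    {
    #ifdef WIN32
        if (pgwin32_is_service())
            write_eventlog(edata->elevel, buf.data, buf.len);  /* no stderr */
        else
    #endif
            write_console(buf.data, buf.len);
    }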
| |
Author: Amit Langote
Backpatch-through: 13
Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com
| |
Oversight in 5c6e33f.
Author: Nathan Bossart
Discussion: https://postgr.es/m/DD8AD4CE-63B7-44BE-A3D2-14A4E4B19C26@amazon.com
| |
Move support functions for new PublicationTable node to more sensible
locations in the files.
| |
stderr and csvlog have been using duplicated code for the rotation of
their files by size, by age, or when forced by a user request (pg_ctl
logrotate or the SQL function pg_rotate_logfile). The main difference
between the two is that stderr requires its file to always be open, so
that a redirection route exists if the logging collector is not yet
ready to do its work when alternate destinations are enabled.
Also, if csvlog gets disabled, we need to properly close its metadata
stored in the logging collector (the last file name for current_logfiles
and the fd currently open for business). Except for those points, the
code is the same in terms of error handling and of whether a file should
be created or just continued.
This change makes the code simpler overall, and it will help in the
introduction of more file-based log destinations. This refactoring is
similar to the work done in 5b0b699. Most of the duplication originates
from fd801f4.
Some of the TAP tests of pg_ctl check the case of a forced log rotation,
but this is somewhat limited as there is no coverage for
log_rotation_age or log_rotation_size (these may not be worth the extra
resources to run either), and no coverage for reload of log_destination
with different combinations of stderr and csvlog. I have tested all
those cases separately for this refactoring.
Author: Michael Paquier
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
| |
This fixes a loss of precision that occurs when the first input is
very close to 1, so that its logarithm is very small.
Formerly, during the initial low-precision calculation to estimate the
result weight, the logarithm was computed to a local rscale that was
capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be
as close as 1e-16383 to 1, hence its logarithm may be as small as
1e-16383, and so the local rscale needs to be allowed to exceed 16383,
otherwise all precision is lost, leading to a poor choice of rscale
for the full-precision calculation.
Fix this by removing the cap on the local rscale during the initial
low-precision calculation, as we already do in the full-precision
calculation. This doesn't change the fact that the initial calculation
is a low-precision approximation, computing the logarithm to around 8
significant digits, which is very fast, especially when the base is
very close to 1.
Patch by me, reviewed by Alvaro Herrera.
Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com
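A worked example of the failure mode (the exponent is illustrative; the
extreme case goes down to 1e-16383):

    \ln(1+\varepsilon) = \varepsilon - \tfrac{\varepsilon^2}{2} + \cdots
      \approx \varepsilon,
    \qquad b = 1 + 10^{-2000} \Rightarrow \ln b \approx 10^{-2000}

Rounded at the old local-rscale cap of 1000 decimal digits, that
logarithm is exactly zero, so the estimated result weight for b^x,
derived from x * ln(b), carried no information at all.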
| |
Like BASE_BACKUP, CREATE_REPLICATION_SLOT has historically used a
hard-coded syntax. To improve future extensibility, adopt a flexible
options syntax here, too.
In the new syntax, instead of three mutually exclusive options
EXPORT_SNAPSHOT, USE_SNAPSHOT, and NOEXPORT_SNAPSHOT, there is now a single
SNAPSHOT option with three possible values: 'export', 'use', and 'nothing'.
This commit does not remove support for the old syntax. It just adds
the new one as an additional option, and makes pg_receivewal,
pg_recvlogical, and walreceiver processes use it.
Patch by me, reviewed by Fabien Coelho, Sergei Kornilov, and
Fujii Masao.
Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
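For example (slot and output plugin names invented for illustration):

    Old:  CREATE_REPLICATION_SLOT "slot1" LOGICAL "pgoutput" EXPORT_SNAPSHOT
    New:  CREATE_REPLICATION_SLOT "slot1" LOGICAL "pgoutput" (SNAPSHOT 'export')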
| |
Previously, BASE_BACKUP used an entirely hard-coded syntax, but that's
hard to extend. Instead, adopt the same kind of syntax we've used for
SQL commands such as VACUUM, ANALYZE, COPY, and EXPLAIN, where it's
not necessary for all of the option names to be parser keywords.
In the new syntax, most of the options now take an optional Boolean
argument. To match our practice in other places, the options which the
old syntax called NOWAIT and NOVERIFY_CHECKSUMS are in the new syntax
called WAIT and VERIFY_CHECKSUMS, and the default value is false. In
the new syntax, the FAST option has been replaced by a CHECKPOINT
option whose value may be 'fast' or 'spread'.
This commit does not remove support for the old syntax. It just adds
the new one as an additional option, and makes pg_basebackup prefer
the new syntax when the server is new enough to support it.
Patch by me, reviewed and tested by Fabien Coelho, Sergei Kornilov,
Fujii Masao, and Tushar Ahuja.
Discussion: http://postgr.es/m/CA+TgmobAczXDRO_Gr2euo_TxgzaH1JxbNxvFx=HYvBinefNH8Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZGwR=ZVWFeecncubEyPdwghnvfkkdBe9BLccLSiqdf9Q@mail.gmail.com
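For example (option values invented for illustration):

    Old:  BASE_BACKUP LABEL 'b1' FAST NOWAIT NOVERIFY_CHECKSUMS
    New:  BASE_BACKUP (LABEL 'b1', CHECKPOINT 'fast', WAIT false,
                       VERIFY_CHECKSUMS false)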
| |
Commit 0668719801 changed XLogPageRead() so that it validated the page
header and, if an invalid page header was found, reset the error message
and retried reading the page, to fix the scenario where a streaming
standby got stuck at a continuation record. This change hid the error
message about the invalid page header, which would make it harder for
users to investigate what the actual issue found in WAL was.
To fix the issue, this commit makes XLogPageRead() report the error
message when an invalid page header is found.
When not in standby mode, an invalid page header should cause recovery
to end, not retry reading the page, so XLogPageRead() doesn't need to
validate the page header for the retry. Instead, ReadPageInternal()
should be responsible for the validation in that case. Therefore this
commit changes XLogPageRead() so that, if not in standby mode, it
doesn't validate the page header for the retry.
Reported-by: Yugo Nagata
Author: Yugo Nagata, Kyotaro Horiguchi
Reviewed-by: Ranier Vilela, Fujii Masao
Discussion: https://postgr.es/m/20210718045505.32f463ed6c227111038d8ae4@sraoss.co.jp
| |
Commits 955a684e04 and a975ff4980 removed the usage of running xacts
information from serialized snapshots but forgot to remove the
corresponding comment.
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBifOr7RS=jRe7YCavc646y9omChv6zkWXvJeZcjS9mXA@mail.gmail.com
| |
Fix the rules so that each rule is parallel safe, using the same
trickery that we use elsewhere in the tree for rules that produce more
than one output file. Refactor the whole makefile so that there is
less repetition.
Discussion: https://www.postgresql.org/message-id/18e34084-aab1-1b4c-edd1-c4f9fb04f714%40enterprisedb.com
| |
Remove accidentally duplicated words in code comments.
Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/87bl45t0co.fsf@wibble.ilmari.org
| |
A couple of newer ones are available. There are no functional
differences, but let's get them in anyway, so that there is no
surprise diff next time someone wants to do some actual work in this
area.
| |
While Xid is a known shortening of TransactionId, InvalidXid is not
defined in the code. Fix comments that mistakenly used the shorter
version.
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACUQzdigML868nV4cojfELPkEzNLNOk7b91Pho4JB90fng@mail.gmail.com
| |
Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.
Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.
As SharedRecoveryState is the point to take into account to know if it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so that we can never finish with empty snapshots.
This exists since the introduction of hot standby, so backpatch all the
way down. The window with incorrect snapshots is extremely small, but I
have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm
member fairywren. Thomas Munro also found it independently. Special
thanks to Andres Freund for taking the time to analyze this issue.
Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
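In code terms, the tail of StartupXLOG() now does roughly this (a sketch
assembled from the steps above):

    /* 1) Recover the state of 2PC transactions into memory. */
    RecoverPreparedTransactions();

    /* 3) ... now performed before 2): flip the tipping point first. */
    SpinLockAcquire(&XLogCtl->info_lck);
    XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
    SpinLockRelease(&XLogCtl->info_lck);

    /* 2) Only then discard recovery locks and KnownAssignedXids. */
    ShutdownRecoveryTransactionEnvironment();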
| |
Prior to v14, we insisted that the query in RETURN QUERY be of a type
that returns tuples. (For instance, INSERT RETURNING was allowed,
but not plain INSERT.) That happened indirectly because we opened a
cursor for the query, so spi.c checked SPI_is_cursor_plan(). As a
consequence, the error message wasn't terribly on-point, but at least
it was there.
Commit 2f48ede08 lost this detail. Instead, plain RETURN QUERY
insisted that the query be a SELECT (by checking for SPI_OK_SELECT)
while RETURN QUERY EXECUTE failed to check the query type at all.
Neither of these changes was intended.
The only convenient place to check this in the EXECUTE case is inside
_SPI_execute_plan, because we haven't done parse analysis until then.
So we need to pass down a flag saying whether to enforce that the
query returns tuples. Fortunately, we can squeeze another boolean
into struct SPIExecuteOptions without an ABI break, since there's
padding space there. (It's unlikely that any extensions would
already be using this new struct, but preserving ABI in v14 seems
like a smart idea anyway.)
Within spi.c, it seemed like _SPI_execute_plan's parameter list
was already ridiculously long, and I didn't want to make it longer.
So I thought of passing SPIExecuteOptions down as-is, allowing that
parameter list to become much shorter. This makes the patch a bit
more invasive than it might otherwise be, but it's all internal to
spi.c, so that seems fine.
Per report from Marc Bachmann. Back-patch to v14 where the
faulty code came in.
Discussion: https://postgr.es/m/1F2F75F0-27DF-406F-848D-8B50C7EEF06A@gmail.com
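For reference, the options struct with the new flag tucked into the
padding looks roughly like this (paraphrased from spi.h; field order
approximate):

    typedef struct SPIExecuteOptions
    {
        ParamListInfo params;
        bool        read_only;
        bool        allow_nonatomic;
        bool        must_return_tuples;     /* new: error if the query
                                             * cannot return tuples */
        uint64      tcount;
        DestReceiver *dest;
        ResourceOwner owner;
    } SPIExecuteOptions;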
| |
The "equality implies image equality" opclass infrastructure disallowed
deduplication in system catalog indexes and TOAST indexes before now.
That seemed like the right approach back when the infrastructure was
added by commit 612a1ab7, since ALTER INDEX cannot set deduplicate_items
to 'off' (due to an old implementation restriction). But that decision
now seems arbitrary at best. Remove special case handling implementing
this policy.
No catversion bump, since existing catalog indexes will still work.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=rYQHFaJ3WYBdK=xgwxKzaiGMSSrh-ZCREa-pS-7Zjew@mail.gmail.com
| |
Both bugs #16676[1] and #17141[2] illustrate that the combination of
SKIP LOCKED and FETCH FIRST WITH TIES breaks expectations when it comes
to rows returned to other sessions accessing the same row. Since this
situation is detectable from the syntax and hard to fix otherwise,
forbid it for now, with the potential to fix it in the future.
[1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org
[2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org
Backpatch-through: 13, where WITH TIES was introduced
Author: David Christensen <david.christensen@crunchydata.com>
Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com
| |
Commit ff9f111bce24 added some test code that's unportable and doesn't
add meaningful coverage. Remove it rather than try to get it to work
everywhere.
While at it, fix a typo in a log message added by the aforementioned
commit.
Backpatch to 14.
Discussion: https://postgr.es/m/3000074.1632947632@sss.pgh.pa.us
| |
get_variable_range() would incautiously believe that statistics
containing only an MCV list are sufficient to derive a range estimate.
That's okay for an enum-like column that contains only MCVs, but
otherwise the estimate could be pretty bad. Make it report that the
range is indeterminate unless the MCVs plus nullfrac account for
the whole table.
I don't think this needs a dedicated test case, since a quick code
coverage check verifies that the existing regression tests traverse
all the alternatives. There is room to doubt that a future-proof
test case could be built anyway, given that the submitted example
accidentally doesn't fail before v11.
Per bug #17207 from Simon Perepelitsa. Back-patch to v10.
In principle this has been broken all along, but I'm hesitant to
make such changes in 9.6, since if anyone is unhappy with 9.6.24's
behavior there will be no second chance to fix it.
Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org
| |
Commit 84f5c2908 forgot to consider the possibility that
EnsurePortalSnapshotExists could run inside a subtransaction with
lifespan shorter than the Portal's. In that case, the new active
snapshot would be popped at the end of the subtransaction, leaving
a dangling pointer in the Portal, with mayhem ensuing.
To fix, make sure the ActiveSnapshot stack entry is marked with
the same subtransaction nesting level as the associated Portal.
It's certainly safe to do so since we won't be here at all unless
the stack is empty; hence we can't create an out-of-order stack.
Let's also apply this logic in the case where PortalRunUtility
sets portalSnapshot, just to be sure that path can't cause similar
problems. It's slightly less clear that that path can't create
an out-of-order stack, so add an assertion guarding it.
Report and patch by Bertrand Drouvot (with kibitzing by me).
Back-patch to v11, like the previous commit.
Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com
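The core of the fix is a one-liner in spirit (a sketch; the helper and
the Portal field are the ones introduced for this fix, names recalled
from memory):

    /* In EnsurePortalSnapshotExists(): stamp the snapshot with the
     * Portal's own subtransaction nesting level, not the current one. */
    PushActiveSnapshotWithLevel(GetTransactionSnapshot(), portal->createLevel);
    portal->portalSnapshot = GetActiveSnapshot();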
| |
This field was recently added in db632fbca; however, that commit missed
one place where it should have initialized the new field to NULL. The
missed location is where the PartitionBoundInfo is created for
partition-wise join relations. Technically there could be interleaved
partitions in a partition-wise join relation, but currently the only
optimization that uses this field does so only for base rels and other
member rels. So just document that we don't populate this field for
join rels.
Reported-by: Amit Langote
Author: Amit Langote, David Rowley
Reviewed-by: Amit Langote, David Rowley
Discussion: https://postgr.es/m/CA+HiwqE76Rps24kwHsd2Cr82Ua07tJC9t9reG0c7ScX9n_xrEA@mail.gmail.com
| |
Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS' list of "errnos that
identify hard failure of a previously-established network connection".
While one could imagine that this is sometimes recoverable, the same
could be said of other entries such as ENETDOWN.
In support of this, handle ETIMEDOUT on par with other socket errors
in relevant infrastructure, such as TranslateSocketError().
(I made a couple of cosmetic adjustments in TranslateSocketError(),
too.) The code now assumes that ETIMEDOUT is defined everywhere,
which it should be given that POSIX has required it since SUSv2.
Perhaps this should be back-patched, but I'm hesitant to do so given
the lack of previous complaints, and the hazard that there's a small
ABI break on Windows from redefining the symbol. Even if we decide
to do that, it'd be prudent to let this bake awhile in HEAD first.
Jelte Fennema
Discussion: https://postgr.es/m/AM5PR83MB01782BFF2978505F6D6C559AF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com
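The macro is designed to expand into a set of case labels, so callers
can write (sketch):

    switch (errno)
    {
            /* expands to "case EPIPE: case ECONNRESET: ..." and now
             * covers ETIMEDOUT as well */
        case ALL_CONNECTION_FAILURE_ERRNOS:
            connection_lost = true;
            break;
        default:
            break;
    }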
| |
Physical replication always ships WAL segment files to replicas once
they are complete. This is a problem if one WAL record is split across
a segment boundary and the primary server crashes before writing down
the segment with the next portion of the WAL record: WAL writing after
crash recovery would happily resume at the point where the broken record
started, overwriting that record ... but any standby or backup may have
already received a copy of that segment, and they are not rewinding.
This causes standbys to stop following the primary after the latter
crashes:
LOG: invalid contrecord length 7262 at A8/D9FFFBC8
because the standby is still trying to read the continuation record
(contrecord) for the original long WAL record, but it is not there and
it will never be. A workaround is to stop the replica, delete the WAL
file, and restart it -- at which point a fresh copy is brought over from
the primary. But that's pretty labor intensive, and I bet many users
would just give up and re-clone the standby instead.
A fix for this problem was already attempted in commit 515e3d84a0b5, but
it only addressed the case for the scenario of WAL archiving, so
streaming replication would still be a problem (as well as other things
such as taking a filesystem-level backup while the server is down after
having crashed), and it had performance scalability problems too; so it
had to be reverted.
This commit fixes the problem using an approach suggested by Andres
Freund, whereby the initial portion(s) of the split-up WAL record are
kept, and a special type of WAL record is written where the contrecord
was lost, so that WAL replay in the replica knows to skip the broken
parts. With this approach, we can continue to stream/archive segment
files as soon as they are complete, and replay of the broken records
will proceed across the crash point without a hitch.
Because a new type of WAL record is added, users should be careful to
upgrade standbys first, primaries later. Otherwise they risk the standby
being unable to start if the primary happens to write such a record.
A new TAP test that exercises this is added, but the portability of it
is yet to be seen.
This has been wrong since the introduction of physical replication, so
backpatch all the way back. In stable branches, keep the new
XLogReaderState members at the end of the struct, to avoid an ABI
break.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql
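The payload of the new record type is deliberately tiny; schematically
(paraphrased from the patch):

    /* XLOG_OVERWRITE_CONTRECORD: tells replay that the contrecord for the
     * record starting at overwritten_lsn was never written. */
    typedef struct xl_overwrite_contrecord
    {
        XLogRecPtr  overwritten_lsn;
        TimestampTz overwrite_time;
    } xl_overwrite_contrecord;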
| |
The code inconsistently used "statistic object" or "statistics" where
the correct term, as discussed, is actually "statistics object". This
makes the code more consistent.
While at it, fix an incorrect error message introduced in a4d75c8. This
error should never happen, as the code states, but it would be
misleading if it did.
Author: Justin Pryzby
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
Backpatch-through: 14
| |
A forked logging collector in EXEC_BACKEND builds passes down file
descriptors (or HANDLEs in WIN32) through a command for files to be
reopened (for stderr and csvlog). Some of its logic was duplicated, and
this commit refactors the code with some wrapper routines for file
reopening after forking and fd grabbing when building the command for
the fork.
While at it, this simplifies a use of "long" in the code, introduced by
ab0ba6e to take care of a warning related to MinGW-W64 when mapping an
intptr_t to a printed value. "long" is 32 bits on Windows, and the
interoperability of Win32 and Win64 ensures that handles are always
32-bit significant, so we can just use "int" for the same result. This
also makes the new routines more symmetric.
This change makes it easier to introduce new log destinations in the
logging collector, and this is not the only piece of refactoring
planned. I have tested this change with EXEC_BACKEND on Linux, macOS,
and of course MSVC (both Win32 and Win64), but not MinGW, so the
buildfarm may have something to say here.
Author: Sehrope Sarkuni, Michael Paquier
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
| |
Several mistakes have piled up in the code comments over time,
including incorrect grammar, wrong function names, and simple typos. This
commit takes care of a portion of these.
No backpatch is done as this is only cosmetic.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20210924215827.GS831@telsasoft.com
| |
Discussing the low level issue of nbtree VACUUM and recovery conflicts
in btvacuumpage() now seems inappropriate. The same issue is discussed
in nbtxlog.h, as well as in a comment block above _bt_delitems_vacuum().
The comment block made more sense when it was part of a broader
discussion of nbtree VACUUM "pin scans". These were removed by commit
9f83468b.
| |
Only done on the master branch for now to fix build farm animal seawasp
(which tests bleeding edge PostgreSQL with bleeding edge LLVM). We can
back-patch a consolidated fix closer to LLVM 14's release, once its API
has stopped moving around.
Discussion: https://postgr.es/m/CA%2BhUKGL%3Dyg6qqgg6W6SAuvRQejditeoDNy-X3b9H_6Fnw8j5Wg%40mail.gmail.com
| |
Splitting the time field into days and microseconds is pretty
useless when we're just going to recombine those values.
It's unclear if anyone will notice the speedup in real-world
cases, but a cycle shaved is a cycle earned.
Discussion: https://postgr.es/m/2629129.1632675713@sss.pgh.pa.us
| |
_bt_delitems_delete() is no longer the high-level entry point used by
index tuple deletion driven by index tuples whose LP_DEAD bits are set
(now called "simple index tuple deletion"). It became a lower level
routine that's only called by _bt_delitems_delete_check() following
commit d168b66682.
| |
Remove obsolete reference to lazy_vacuum()'s onecall argument. The
function argument was removed by commit 3499df0dee.
Also remove adjoining comment block that introduces the wraparound
failsafe concept. Talking about the failsafe here no longer makes
sense, since lazy_vacuum() (and related functions) are no longer the
only place where the failsafe might be triggered. This has been the
case since commit c242baa4a8 taught VACUUM to consider triggering the
failsafe mechanism during its initial heap scan.
| |
Point out that index tuple deletion generally needs a latestRemovedXid
value for the deletion operation's WAL record. This is bound to be the
most expensive part of the whole deletion operation now that it takes
place up front, during original execution.
This was arguably an oversight in commit 558a9165e08, which moved the
work required to generate these values from index deletion REDO routines
to original execution of index deletion operations.
|