| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
entryGetItem()'s three code paths each contained bugs associated
with filtering the entries for gin_fuzzy_search_limit.
The posting-tree path failed to advance "advancePast" after having
decided to filter an item. If we ran out of items on the current
page and needed to advance to the next, what would actually happen
is that entryLoadMoreItems() would re-load the same page. Eventually,
the random dropItem() test would accept one of the same items it'd
previously rejected, and we'd move on --- but it could take awhile
with small gin_fuzzy_search_limit. To add insult to injury, this
case would inevitably cause entryLoadMoreItems() to decide it needed
to re-descend from the root, making things even slower.
The posting-list path failed to implement gin_fuzzy_search_limit
filtering at all, so that all entries in the posting list would
be returned.
The bitmap-result path used a "gotitem" variable that it failed to
update in the one place where it'd actually make a difference, ie
at the one "continue" statement. I think this was unreachable in
practice, because if we'd looped around then it shouldn't be the
case that the entries on the new page are before advancePast.
Still, the "gotitem" variable was contributing nothing to either
clarity or correctness, so get rid of it.
Refactor all three loops so that the termination conditions are
more alike and less unreadable.
The code coverage report showed that we had no coverage at all for
the re-descend-from-root code path in entryLoadMoreItems(), which
seems like a very bad thing, so add a test case that exercises it.
We also had exactly no coverage for gin_fuzzy_search_limit, so add a
simplistic test case that at least hits those code paths a little bit.
Back-patch to all supported branches.
Adé Heyward and Tom Lane
Discussion: https://postgr.es/m/CAEknJCdS-dE1Heddptm7ay2xTbSeADbkaQ8bU2AXRCVC2LdtKQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2aa6e33, that added a fast path to skip
anti-wraparound and non-aggressive autovacuum jobs (these have no sense
as anti-wraparound implies aggressive). With a cluster using a high
amount of relations with a portion of them being heavily updated, this
could cause autovacuum to lock down, with autovacuum workers attempting
repeatedly those jobs on the same relations for the same database, that
just kept being skipped. This lock down can be solved with a manual
VACUUM FREEZE.
Justin King has reported one environment where the issue happened, and
Julien Rouhaud and I have been able to reproduce it in a second
environment. With a very aggressive autovacuum_freeze_max_age,
triggering those jobs with pgbench is a matter of minutes, and hitting
the lock down is a lot harder (my local tests failed to do that).
Note that anti-wraparound and non-aggressive jobs can only be triggered
on a subset of shared catalogs:
- pg_auth_members
- pg_authid
- pg_database
- pg_replication_origin
- pg_shseclabel
- pg_subscription
- pg_tablespace
While the lock down was possible down to v12, the root cause of those
jobs is a much older issue, which needs more analysis.
Bonus thanks to Andres Freund for the discussion.
Reported-by: Justin King
Discussion: https://postgr.es/m/CAE39h22zPLrkH17GrkDgAYL3kbjvySYD1io+rtnAUFnaJJVS4g@mail.gmail.com
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
INCLUDE indexes failed to have their non-key attributes physically
truncated away in certain rare cases. This led to physically larger
pivot tuples that contained useless non-key attribute values. The
impact on users should be negligible, but this is still clearly a
regression (Postgres 11 supports INCLUDE indexes, and yet was not
affected).
The bug appeared in commit dd299df8, which introduced "true" suffix
truncation of key attributes.
Discussion: https://postgr.es/m/CAH2-Wz=E8pkV9ivRSFHtv812H5ckf8s1-yhx61_WrJbKccGcrQ@mail.gmail.com
Backpatch: 12-, where "true" suffix truncation was introduced.
|
|
|
|
|
|
|
|
| |
This reverts commit cb2fd7eac285b1b0a24eeb2b8ed4456b66c5a09f. Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64. Back-patch to 9.5 (all supported versions).
Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, only selected bulk operations (e.g. COPY) did this. If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY. See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules. Maintainers of table access
methods should examine that section.
To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL. A new GUC,
wal_skip_threshold, guides that choice. If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold. Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.
Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node. Make relcache.c retain entries for certain
dropped relations until end of transaction.
Back-patch to 9.5 (all supported versions). This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As
always, update standby systems before master systems. This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions. (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)
Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas. Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem. Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout.
Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
|
|
|
|
|
|
|
|
|
| |
The function assumed forkNum=MAIN_FORKNUM and page_std=true, ignoring
the actual arguments. Existing callers passed exactly those values, so
there's no live bug. Back-patch to v12, where the function first
appeared, because another fix needs this.
Discussion: https://postgr.es/m/20191118045434.GA1173436@rfd.leadboat.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
swap_relation_files() calls toast_get_valid_index() to find and lock
this index, just before swapping with the rebuilt TOAST index. The
latter function releases the lock before returning. Potential for
mischief is low; a concurrent session can issue ALTER INDEX ... SET
(fillfactor = ...), which is not alarming. Nonetheless, changing
pg_class.relfilenode without a lock is unconventional. Back-patch to
9.5 (all supported versions), because another fix needs this.
Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.
This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.
This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.
Back-patch to all supported branches.
Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VACUUM may truncate heap in several batches. The activity report
is logged for each batch, and contains the number of pages in the table
before and after the truncation, and also the elapsed time during
the truncation. Previously the elapsed time reported in each batch was
the total elapsed time since starting the truncation until finishing
each batch. For example, if the truncation was processed dividing into
three batches, the second batch reported the accumulated time elapsed
during both first and second batches. This is strange and confusing
because the number of pages in the table reported together is not
total. Instead, each batch should report the time elapsed during
only that batch.
The cause of this issue was that the resource usage snapshot was
initialized only at the beginning of the truncation and was never
reset later. This commit fixes the issue by changing VACUUM so that
the resource usage snapshot is reset at each batch.
Back-patch to all supported branches.
Reported-by: Tatsuhito Kasahara
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/CAP0=ZVJsf=NvQuy+QXQZ7B=ZVLoDV_JzsVC1FRsF1G18i3zMGg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit b10714080 moved the GinPageSetDeleteXid() call to a spot where
the "page" variable was pointing to the wrong page, causing the XID
to be inserted on a page that's not being deleted, thus allowing later
GinPageIsRecyclable tests to recycle the deleted page too soon.
It might be a good idea to stop using the single "page" variable for
multiple purposes in this function. But for the moment I just moved
the GinPageSetDeleteXid() call down beside the GinPageSetDeleted()
call, which seems like a more logical place for it anyway.
Back-patch to v11, as the faulty patch was. (Fortunately, the bug
hasn't made it into any release yet.)
Discussion: https://postgr.es/m/21620.1581098806@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tuple conversion incorrectly concluded that no conversion was needed
as long as all the attributes lined up. But if the source tuple has a
missing attribute (from addition of a column with default), then the
destination tupdesc might not reflect the same default. The typical
symptom was that the affected columns would be unexpectedly NULL.
Repair by always forcing conversion if the source has missing
attributes, which will be filled in by the deform operation. (In
theory we could optimize for when the destination has the same
default, but that seemed overkill.)
Backpatch to 11 where missing attributes were added.
Per bug #16242.
Vik Fearing (discovery, code, testing) and me (analysis, testcase).
Discussion: https://postgr.es/m/16242-d1c9fca28445966b@postgresql.org
|
|
|
|
|
|
|
|
|
|
| |
Commit 74618e77 added a new check intended to fix a bug, but put
it in the wrong place so that parallel btree build was always
disabled. Do the check after we've actually tried to create
a DSM segment. Back-patch to 11, like the earlier commit.
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-WzmDABkJzrNnvf%2BOULK-_A_j9gkYg_Dz-H62jzNv4eKQTw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If no DSM slots are available, a ParallelContext can still be
created, but its seg pointer is NULL. Teach parallel btree build
to cope with that by falling back to a regular non-parallel build,
to avoid crashing with a segmentation fault.
Back-patch to 11, where parallel CREATE INDEX landed.
Reported-by: Nicola Contu
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGJgJEBnkuODBVomyK3MWFvDBbMVj%3Dgdt6DnRPU-5sQ6UQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The BRIN add_value() and union() functions need to make a longer-lived
copy of the argument, if they want to store it in the BrinValues struct
also passed as argument. The functions for the "inclusion operator
classes" used with box, range and inet types didn't take into account
that the union helper function might return its argument as is, without
making a copy. Check for that case, and make a copy if necessary. That
case arises at least with the range_union() function, when one of the
arguments is an 'empty' range:
CREATE TABLE brintest (n numrange);
CREATE INDEX brinidx ON brintest USING brin (n);
INSERT INTO brintest VALUES ('empty');
INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric));
INSERT INTO brintest VALUES ('(-1, 0)');
SELECT brin_desummarize_range('brinidx', 0);
SELECT brin_summarize_range('brinidx', 0);
Backpatch down to 9.5, where BRIN was introduced.
Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi
Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the value null only at pg_stat_activity-output time, as suggested
by Tom Lane, instead of messing with the internal state. This should
appease buildfarm members with force_parallel_mode=regress, which are
running parallel queries on logical replication walsenders.
The fact that walsenders can run parallel queries should perhaps be
studied more carefully, but for the moment let's get rid of the red
blots in buildfarm.
Backpatch to pg10, like the previous commit.
Discussion: https://postgr.es/m/30804.1578438763@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Returning a non-NULL time is pointless, sinc a walsender is not a
process that would be running normal transactions anyway, but the code
was unintentionally exposing the process start time intermittently,
which was not only bogus but it also confused monitoring systems looking
for idle transactions. Fix by avoiding all updates in walsenders.
Backpatch to 11, where walsenders started appearing in pg_stat_activity.
Reported-by: Tomas Vondra
Discussion: https://postgr.es/m/20191209234409.exe7osmyalwkt5j4@development
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes the routines in charge of recycling WAL segments past the
last redo LSN to not use anymore "RedoRecPtr" as a local variable, which
is also available in the context of the session as a static declaration,
replacing it with "lastredoptr". This confusion has been introduced by
d9fadbf, so backpatch down to v11 like the other commit.
Thanks to Tom Lane, Robert Haas, Alvaro Herrera, Mark Dilger and Kyotaro
Horiguchi for the input provided.
Author: Ranier Vilela
Discussion: https://postgr.es/m/MN2PR18MB2927F7B5F690065E1194B258E35D0@MN2PR18MB2927.namprd18.prod.outlook.com
Backpatch-through: 11
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This Assert thought that an overflowed transaction can never get registered
for the group update. But that is not true, because even when the number
of children for a transaction got reduced, the overflow flag is not
changed. And, for group update, we only care about the current number of
children for a transaction that is being committed.
Based on comments by Andres Freund, remove a redundant Assert in
TransactionIdSetPageStatus as we already had a static Assert for the same
condition a few lines earlier.
Reported-by: Vignesh C
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/CAFiTN-s5=uJw-Z6JC9gcqtBSjXsrHnU63PXBrA=pnBjqnkm5UA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit a7ee7c8513 fixed a bug in GiST page split during index creation,
where we failed to re-find the position of a downlink after the page
containing it was split. However, that fix was incomplete; the other call
to gistinserttuples() in the same function needs to also clear
'downlinkoffnum'.
Fixes bug #16134 reported by Alexander Lakhin, for real this time. The
previous fix was enough to fix the crash with the reproducer script for
bug #16162, but the original script for #16134 was still crashing.
Backpatch to v12, like the previous incomplete fix.
Discussion: https://www.postgresql.org/message-id/d869f537-abe4-d2ea-0510-38cd053f5152%40gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug was similar to the one that was fixed in commit 22251686f0. When
we split page X and insert the downlink for the new page, the parent page
might also need to be split. When that happens, the downlink offset number
we remembered for X is no longer valid. We correctly called
gistFindCorrectParent() to re-find it, but gistFindCorrectParent() doesn't
do anything if the LSN of the page hasn't changed, and we stopped updating
LSNs during index build in commit 9155580fd5. The buggy codepath was taken
if the page was split into three or more pages, and inserting the downlink
caused the parent page to split. To fix, explicitly mark the downlink
offset number as invalid, to force gistFindCorrectParent() to re-find it.
Fixes bug #16134 reported by Alexander Lakhin, reported again as #16162 by
Andreas Kunert. Thanks to Jeff Janes, Tom Lane and Tomas Vondra for
debugging. Backpatch to v12, where we stopped WAL-logging during index
build.
Discussion: https://www.postgresql.org/message-id/16134-0423f729671dec64%40postgresql.org
Discussion: https://www.postgresql.org/message-id/16162-45d21b7b6c1a3105%40postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If LISTEN is the only action in a serializable-mode transaction,
and the session was not previously listening, and the notify queue
is not empty, predicate.c reported an assertion failure. That
happened because we'd acquire the transaction's initial snapshot
during PreCommit_Notify, which was called *after* predicate.c
expects any such snapshot to have been established.
To fix, just swap the order of the PreCommit_Notify and
PreCommit_CheckForSerializationFailure calls during CommitTransaction.
This will imply holding the notify-insertion lock slightly longer,
but the difference could only be meaningful in serializable mode,
which is an expensive option anyway.
It appears that this is just an assertion failure, with no
consequences in non-assert builds. A snapshot used only to scan
the notify queue could not have been involved in any serialization
conflicts, so there would be nothing for
PreCommit_CheckForSerializationFailure to do except assign it a
prepareSeqNo and set the SXACT_FLAG_PREPARED flag. And given no
conflicts, neither of those omissions affect the behavior of
ReleasePredicateLocks. This admittedly once-over-lightly analysis
is backed up by the lack of field reports of trouble.
Per report from Mark Dilger. The bug is old, so back-patch to all
supported branches; but the new test case only goes back to 9.6,
for lack of adequate isolationtester infrastructure before that.
Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com
Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
|
|
|
|
|
|
|
|
| |
By oversight 52ac6cd2d0 makes ginDeletePage() sets pd_prune_xid of page to be
deleted before entering the critical section. It appears that only versions 11
and later were affected by this oversight.
Backpatch-through: 11
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We find GIN concurrency bugs from time to time. One of the problems here is
that concurrency of GIN isn't well-documented in README. So, it might be even
hard to distinguish design bugs from implementation bugs.
This commit revised concurrency section in GIN README providing more details.
Some examples are illustrated in ASCII art.
Also, this commit add the explanation of how is tuple layout in internal GIN
B-tree page different in comparison with nbtree.
Discussion: https://postgr.es/m/CAPpHfduXR_ywyaVN4%2BOYEGaw%3DcPLzWX6RxYLBncKw8de9vOkqw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current GIN code appears to don't handle traversing to the deleted page via
downlink. This commit fixes that by stepping right from the delete page like
we do in nbtree.
This commit also fixes setting 'deleted' flag to the GIN pages. Now other page
flags are not erased once page is deleted. That helps to keep our assertions
true if we arrive deleted page via downlink.
Discussion: https://postgr.es/m/CAPpHfdvMvsw-NcE5bRS7R1BbvA4BxoDnVVjkXC5W0Czvy9LVrg%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ginDeletePage() is about to delete page it locks its left sibling to revise
the rightlink. So, it locks pages in right to left manner. Int he same time
ginStepRight() locks pages in left to right manner, and that could cause a
deadlock.
This commit makes ginScanToDelete() keep exclusive lock on left siblings of
currently investigated path. That elimites need to relock left sibling in
ginDeletePage(). Thus, deadlock with ginStepRight() can't happen anymore.
Reported-by: Chen Huajun
Discussion: https://postgr.es/m/5c332bd1.87b6.16d7c17aa98.Coremail.chjischj%40163.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 10
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there is the WAL page that the continuation WAL record just fits within
(i.e., the continuation record ends just at the end of the page) and
the LSN in such page is specified with -s option, previously pg_waldump
caused an assertion failure. The cause of this assertion failure was that
XLogFindNextRecord() that pg_waldump -s calls mistakenly handled
such special WAL page.
This commit changes XLogFindNextRecord() so that it can handle
such WAL page correctly.
Back-patch to all supported versions.
Author: Andrey Lepikhov
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
9155580 has changed the value of the first fake LSN for unlogged
relations from 1 to FirstNormalUnloggedLSN (aka 1000), GiST requiring a
non-zero LSN on some pages to allow an interlocking logic to work, but
its value was still initialized to 1 at the beginning of recovery or
after running pg_resetwal. This fixes the initialization for both code
paths.
Author: Takayuki Tsunakawa
Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We memorize all internal and empty leaf pages in the 1st vacuum stage for
gist indexes. They are used in the 2nd stage, to delete all the empty
pages. There was a memory context page_set_context for this purpose, but
we never used it.
Reported-by: Amit Kapila
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it got introduced
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
recovery_min_apply_delay parameter is intended for use with streaming
replication deployments. However, the document clearly explains that
the parameter will be honored in all cases if it's specified. So it should
take effect even if in archive recovery. But, previously, archive recovery
with recovery_min_apply_delay enabled always failed, and caused assertion
failure if --enable-caasert is enabled.
The cause of this problem is that; the ownership of recoveryWakeupLatch
that recovery_min_apply_delay uses was taken only when standby mode
is requested. So unowned latch could be used in archive recovery, and
which caused the failure.
This commit changes recovery code so that the ownership of
recoveryWakeupLatch is taken even in archive recovery. Which prevents
archive recovery with recovery_min_apply_delay from failing.
Back-patch to v9.4 where recovery_min_apply_delay was added.
Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In v11 or before, this setting could not take effect in crash recovery
because it's specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf and which unexpectedly allowed
this setting to take effect even in crash recovery. This is definitely
not good behavior.
To fix the issue, this commit makes crash recovery always ignore
recovery_min_apply_delay setting.
Back-patch to v12 where the issue was added.
Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In v11 or before, those settings could not take effect in crash recovery
because they are specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf and which unexpectedly allowed
those settings to take effect even in crash recovery. This is definitely
not good behavior.
To fix the issue, this commit makes crash recovery always ignore
restore_command and recovery_end_command settings.
Back-patch to v12 where the issue was added.
Author: Fujii Masao
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The file descriptor was opened with read-only to fsync a regular file,
which would cause EBADFD errors on some platforms.
This is similar to the recent fix done by a586cc4b (which was broken by
me with 82a5649), except that I noticed this issue while monitoring the
backend code for similar mistakes. Backpatch to 9.4, as this has been
introduced since logical decoding exists as of b89e151.
Author: Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cbc55da has reworked the order of some actions at the end of archive
recovery. Unfortunately this overlooked the fact that the startup
process needs to remove RECOVERYXLOG (for temporary WAL segment newly
recovered from archives) and RECOVERYHISTORY (for temporary history
file) at this step, leaving the files around even after recovery ended.
Backpatch to 9.5, like the previous commit.
Author: Sawada Masahiko
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com
Backpatch-through: 9.5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In v11 or before, recovery target settings could not take effect in
crash recovery because they are specified in recovery.conf and
crash recovery always starts without recovery.conf. But commit
2dedf4d9a8 integrated recovery.conf into postgresql.conf and
which unexpectedly allowed recovery target settings to take effect
even in crash recovery. This is definitely not good behavior.
To fix the issue, this commit makes crash recovery always ignore
recovery target settings.
Back-patch to v12.
Author: Peter Eisentraut
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In-core relation options can use a custom lock mode since 47167b7, that
has lowered the lock available for some autovacuum parameters. However
it forgot to consider custom relation options. This causes failures
with ALTER TABLE SET when changing a custom relation option, as its lock
is not defined. The existing APIs to define a custom reloption does not
allow to define a custom lock mode, so enforce its initialization to
AccessExclusiveMode which should be safe enough in all cases. An
upcoming patch will extend the existing APIs to allow a custom lock mode
to be defined.
The problem can be reproduced with bloom indexes, so add a test there.
Reported-by: Nikolay Sharplov
Analyzed-by: Thomas Munro, Michael Paquier
Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
Backpatch-through: 9.6
|
|
|
|
|
|
|
|
|
| |
Our item contains only so->numberOfNonNullOrderBys of distances. Reflect that
in the loop upper bound.
Discussion: https://postgr.es/m/53536807-784c-e029-6e92-6da802ab8d60%40postgrespro.ru
Author: Nikita Glukhov
Backpatch-through: 12
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
6cae9d2c10 has added an error in freeing old values in
index_store_float8_orderby_distances() function. It looks for old value in
scan->xs_orderbynulls[i] after setting a new value there.
This commit fixes that. Also it removes short-circuit in handling
distances == NULL situation. Now distances == NULL will be treated the same
way as array with all null distances. That is, previous values will be freed
if any.
Reported-by: Tom Lane, Nikita Glukhov
Discussion: https://postgr.es/m/CAPpHfdu2wcoAVAm3Ek66rP%3Duo_C-D84%2B%2Buf1VEcbyi_caBXWCA%40mail.gmail.com
Discussion: https://postgr.es/m/426580d3-a668-b9d1-7b8e-f74d1a6524e0%40postgrespro.ru
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit improves subject in two ways:
* It removes ugliness of 02f90879e7, which stores distance values and null
flags in two separate arrays after GISTSearchItem struct. Instead we pack
both distance value and null flag in IndexOrderByDistance struct. Alignment
overhead should be negligible, because we typically deal with at most few
"col op const" expressions in ORDER BY clause.
* It fixes handling of "col op NULL" expression in KNN-SP-GiST. Now, these
expression are not passed to support functions, which can't deal with them.
Instead, NULL result is implicitly assumed. It future we may decide to
teach support functions to deal with NULL arguments, but current solution is
bugfix suitable for backpatch.
Reported-by: Nikita Glukhov
Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include newitemoff in rmgr desc output for nbtree page split records.
In passing, correct an obsolete comment that claimed that newitemoff is
only logged for _L variant nbtree page split WAL records.
Both issues were oversights in commit 2c03216d831, which revamped the
WAL format.
Author: Peter Geoghegan
Backpatch: 9.5-, where the WAL format was revamped.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to implement NULL LAST semantic GiST previously assumed distance to
the NULL value to be Inf. However, our distance functions can return Inf and
NaN for non-null values. In such cases, NULL LAST semantic appears to be
broken. This commit fixes that by introducing separate array of null flags for
distances.
Backpatch to all supported versions.
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously plain float comparison was used in GiST pairing heap. Such
comparison doesn't provide proper ordering for value sets containing Inf and Nan
values. This commit fixes that by usage of float8_cmp_internal(). Note, there
is remaining problem with NULL distances, which are represented as Inf in
pairing heap. It would be fixes in subsequent commit.
Backpatch to all supported versions.
Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit
transaction block, the previous implementation would leave a
transaction block active in the ROLLBACK case but not the COMMIT case.
To fix for now, error out when using these commands not in an explicit
transaction block. This restriction could be lifted if a sensible
definition and implementation is found.
Bug: #15977
Author: fn ln <emuser20140816@gmail.com>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously even if postmaster died and WaitLatch() woke up with that event
while pg_promote() was waiting for the standby promotion to finish,
pg_promote() did nothing special and kept waiting until timeout occurred.
This could cause a busy loop.
This patch make pg_promote() return false immediately when postmaster
dies, to avoid such a busy loop.
Back-patch to v12 where pg_promote() was added.
Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids getting a
Could not read from file ...: Success.
for a short read or write (since errno is not set in that case).
Instead, report a more specific error messages.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it. Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.
The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do. This is more consistent and likely a bit
faster, since it avoids a cache lookup. Hence, change to doing it
that way.
While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null. Possibly those should
be actual test-and-elog, but I'll leave it like this for now.
In principle, this is a bug that's been there since this code was
introduced (in 9.4). In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely. Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.
Per report from Hadi Moshayedi.
Discussion: https://postgr.es/m/CAK=1=Wrek44Ese1V7LjKiQS-Nd-5LgLi_5_CskGbpggKEf3tKQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.
The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.
Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.
Backpatch to all supported versions.
This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.
Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
|
|
|
|
|
| |
Author: Alexander Lakhin
Discussion: https://postgr.es/m/20190819072244.GE18166@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.
The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. Wising up from that, add a regression test
to cover this, so that it doesn't get reintroduced again. Also, move the
code that sets 't_self', so that it happens at the same time that the
other HeapTuple fields are set, to make it more clear that all the code in
the loop operate on the "current" tuple in the chain, not the root tuple.
Bug spotted by Andres Freund, analysis and original fix by Thomas Munro,
test case and some additional changes to the fix by Heikki Linnakangas.
Backpatch to all supported versions (9.4).
Discussion: https://www.postgresql.org/message-id/20190731210630.nqhszuktygwftjty%40alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, after a deleted page gets even older, it becomes unrecyclable
again. B-tree has the same problem, and has had since time immemorial,
but let's at least fix this in GiST, where this is new.
Backpatch to v12, where GiST page deletion was introduced.
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
|