| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
Since e27f4ee0a701, fastgetattr() and heap_getattr() are not macros, but
inlined functions.
Author: Junwang Zhao
Reviewed-by: Stepan Neretin
Discussion: https://postgr.es/m/CAEG8a3JS-JKWWyOcM7BU=vPqFXa3W7mZSHnvc3CBqx=tC+3SCA@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Make the isolation harness recognize injection_points wait events as a
type of blocked state. Test an extant inplace-update bug.
Reviewed by Robert Haas and Michael Paquier.
Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We did not recover the subtransaction IDs of prepared transactions
when starting a hot standby from a shutdown checkpoint. As a result,
such subtransactions were considered as aborted, rather than
in-progress. That would lead to hint bits being set incorrectly, and
the subtransactions suddenly becoming visible to old snapshots when
the prepared transaction was committed.
To fix, update pg_subtrans with prepared transactions's subxids when
starting hot standby from a shutdown checkpoint. The snapshots taken
from that state need to be marked as "suboverflowed", so that we also
check the pg_subtrans.
Backport to all supported versions.
Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. TruncateMultiXact() performs the SLRU truncations in a critical
section. Deleting the SLRU segments calls ForwardSyncRequest(), which
will try to compact the request queue if it's full
(CompactCheckpointerRequestQueue()). That in turn allocates memory,
which is not allowed in a critical section. Backtrace:
TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981
postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e]
postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d]
postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e]
postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb]
postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a]
postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1]
postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b]
postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3]
postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66]
postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d]
postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead]
postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e]
postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb]
postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e]
/lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45]
postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31]
To fix, bail out in CompactCheckpointerRequestQueue() without doing
anything, if it's called in a critical section. That covers the above
call path, as well as any other similar cases where
RegisterSyncRequest might be called in a critical section.
2. After fixing that, another problem became apparent: Autovacuum
process doing that truncation can deadlock with the checkpointer
process. TruncateMultiXact() sets "MyProc->delayChkptFlags |=
DELAY_CHKPT_START". If the sync request queue is full and cannot be
compacted, the process will repeatedly sleep and retry, until there is
room in the queue. However, if the checkpointer is trying to start a
checkpoint at the same time, and is waiting for the DELAY_CHKPT_START
processes to finish, the queue will never shrink.
More concretely, the autovacuum process is stuck here:
#0 0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570
#2 WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1,
wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516
#3 0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949)
at ../src/backend/storage/ipc/latch.c:538
#4 0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614
#5 0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495
#6 0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566
#7 0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006
#8 TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201
#9 0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917
#10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760
#11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550
#12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569
and the checkpointer is stuck here:
#0 0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50
#3 0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098
#4 0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464
To fix, add AbsorbSyncRequests() to the loops where the checkpointer
waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to
finish.
Backpatch to v14. Before that, SLRU deletion didn't call
RegisterSyncRequest, which avoided this failure. I'm not sure if there
are other similar scenarios on older versions, but we haven't had
any such reports.
Discussion: https://www.postgresql.org/message-id/ccc66933-31c1-4f6a-bf4b-45fef0d4f22e@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_bt_advance_array_keys didn't take sufficient care at the point where it
decides whether to start a new primitive index scan based on a call to
_bt_check_compare against finaltup (a call with the scan direction
flipped around). The final decision was conditioned on rules about how
the scan key offset sktrig that initially triggered array advancement
(passed to _bt_advance_array_keys from its _bt_checkkeys caller)
compared to the offset set by its own _bt_check_compare finaltup call.
This approach was faulty, in that it allowed _bt_advance_array_keys to
incorrectly start a new primitive index scan, that landed on the same
leaf page (on assert-enabled builds it led to an assertion failure).
In general, scans with array keys are expected to never have to read the
same leaf page more than once (barring cases involving cursors, and
cases where the scan restores a marked position for the inner side of a
merge join). This principle was established by commit 5bf748b8.
To fix, make the final decision based on whether the scan key offset set
by the _bt_check_compare finaltup call is an offset to an inequality
strategy scan key. An unsatisfied required inequality strategy scan key
indicates that all of the scan's required equality strategy scan keys
must also be satisfied by finaltup (not just by caller's tuple), and
that there is a decent chance that _bt_first will be able to reposition
the scan to a position many leaf pages ahead of the current leaf page.
Oversight in commit 5bf748b8.
Discussion: https://postgr.es/m/CAH2-Wz=DyHbcg7o6zXqzyiin8WE8vzk4tvU8Lrnh-a=EAvO0TQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may
insert some REDIRECT tuples. This will typically happen in
a transaction that lacks an XID, which leads either to assertion
failure in spgFormDeadTuple or to insertion of a REDIRECT tuple
with zero xid. The latter's not good either, since eventually
VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid,
resulting in either an assertion failure or a garbage answer.
In practice, since REINDEX CONCURRENTLY locks out index scans
till it's done, it doesn't matter whether it inserts REDIRECTs
or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM
reduces such a REDIRECT to a PLACEHOLDER. So in non-assert builds
there's no observable problem here, other than perhaps a little
index bloat. But it's not behaving as intended.
To fix, remove the failing Assert in spgFormDeadTuple, acknowledging
that we might sometimes insert a zero XID; and guard VACUUM's
GlobalVisTestIsRemovableXid() call with a test for valid XID,
ensuring that we'll reduce such a REDIRECT the first time VACUUM
sees it. (Versions before v14 use TransactionIdPrecedes here,
which won't fail on zero xid, so they really have no bug at all
in non-assert builds.)
Another solution could be to not create REDIRECTs at all during
REINDEX CONCURRENTLY, making the relevant code paths treat that
case like index build (which likewise knows that no concurrent
index scans can be happening). That would allow restoring the
Assert in spgFormDeadTuple, but we'd still need the VACUUM change
because redirection tuples with zero xid may be out there already.
But there doesn't seem to be a nice way for spginsert() to tell that
it's being called in REINDEX CONCURRENTLY without some API changes,
so we'll leave that as a possible future improvement.
In HEAD, also rename the SpGistState.myXid field to redirectXid,
which seems less misleading (since it might not in fact be our
transaction's XID) and is certainly less uninformatively generic.
Per bug #18499 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The macros were confused about the argument data types. All the
arguments were called 'xid', and some of the macros included casts to
TransactionId, even though the arguments were actually either
MultiXactIds or MultiXactOffsets. It compiles to the same thing,
because TransactionId, MultiXactId and MultiXactOffset are all
typedefs of uint32, but it was highly misleading.
Author: Maxim Orlov <orlovmg@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACG%3DezbLUG-OD1osAW3OchOMxZtdxHh2itYR9Zhh-a13wEBEQw%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/ff143b24-a093-40da-9833-d36b83726bdf%40iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 667e65aac3 changed both num_dead_tuples and max_dead_tuples
columns to dead_tuple_bytes and max_dead_tuple_bytes columns,
respectively. But as per discussion, the number of dead tuples
collected still provides meaningful insights for users.
This commit reintroduces the column for the count of dead tuples,
renamed as num_dead_item_ids. It avoids confusion with the number of
dead tuples removed by VACUUM, which includes dead heap-only tuples
but excludes any pre-existing LP_DEAD items left behind by
opportunistic pruning.
Bump catalog version.
Reviewed-by: Peter Geoghegan, Álvaro Herrera, Andrey Borodin
Discussion: https://postgr.es/m/CAD21AoBL5sJE9TRWPyv%2Bw7k5Ee5QAJqDJEDJBUdAaCzGWAdvZw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of the function is to reduce the effective
autovacuum_multixact_freeze_max_age if the multixact members SLRU is
approaching wraparound, to make multixid freezing more aggressive.
The returned value should therefore never be greater than plain
autovacuum_multixact_freeze_max_age.
Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/85fb354c-f89f-4d47-b3a2-3cbd461c90a3@iki.fi
Backpatch-through: 12, all supported versions
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced during Postgres 17 development.
pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.
This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ReorderBufferImmediateInvalidation() executes invalidation messages in
an aborted transaction. However, RelationFlushRelation sometimes
required a valid resource owner, to temporarily increment the refcount
of the relache entry. Commit b8bff07daa worked around that in the main
subtransaction abort function, AbortSubTransaction(), but missed this
similar case in ReorderBufferImmediateInvalidation().
To fix, introduce a separate function to invalidate a relcache
entry. It does the same thing as RelationClearRelation(rebuild==true)
does when outside a transaction, but can be called without
incrementing the refcount.
Add regression test. Before this fix, it failed with:
ERROR: ResourceOwnerEnlarge called after release started
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/e56be7d9-14b1-664d-0bfc-00ce9772721c@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After further review, we want to move in the direction of always
quoting GUC names in error messages, rather than the previous (PG16)
wildly mixed practice or the intermittent (mid-PG17) idea of doing
this depending on how possibly confusing the GUC name is.
This commit applies appropriate quotes to (almost?) all mentions of
GUC names in error messages. It partially supersedes a243569bf65 and
8d9978a7176, which had moved things a bit in the opposite direction
but which then were abandoned in a partial state.
Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
04e72ed617be pushed the skip fetch optimization (allowing bitmap heap
scans to operate like index-only scans if none of the underlying data is
needed) into heap AM implementations of bitmap table scan callbacks.
04e72ed617be added an assert that all tuples in blocks eligible for the
optimization had been NULL-filled and emitted by the end of the scan.
This assert is incorrect when not all tuples need be scanned to execute
the query; for example: a join in which not all inner tuples need to be
scanned before skipping to the next outer tuple.
Remove the assert and reset the field on which it previously asserted to
avoid incorrectly emitting NULL-filled tuples from a previous scan on
rescan.
Author: Melanie Plageman
Reviewed-by: Tomas Vondra, Michael Paquier, Alvaro Herrera
Reported-by: Melanie Plageman
Reproduced-by: Tomas Vondra, Richard Guo
Discussion: https://postgr.es/m/CAMbWs48orzZVXa7-vP9Nt7vQWLTE04Qy4PePaLQYsVNQgo6qRg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This feature set did not handle empty ranges correctly, and it's now
too late for PostgreSQL 17 to fix it.
The following commits are reverted:
6db4598fcb8 Add stratnum GiST support function
46a0cd4cefb Add temporal PRIMARY KEY and UNIQUE constraints
86232a49a43 Fix comment on gist_stratnum_btree
030e10ff1a3 Rename pg_constraint.conwithoutoverlaps to conperiod
a88c800deb6 Use daterange and YMD in without_overlaps tests instead of tsrange.
5577a71fb0c Use half-open interval notation in without_overlaps tests
34768ee3616 Add temporal FOREIGN KEY contraints
482e108cd38 Add test for REPLICA IDENTITY with a temporal key
c3db1f30cba doc: clarify PERIOD and WITHOUT OVERLAPS in CREATE TABLE
144c2ce0cc7 Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes
Discussion: https://www.postgresql.org/message-id/d0b64a7a-dfe4-4b84-a906-c7dedfa40a3e@eisentraut.org
|
|
|
|
|
| |
Author: Alexander Lakhin
Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A parallel worker's buffer usage is accumulated to its pgBufferUsage
and then is accumulated into the leader's one at the end of the
parallel vacuum. However, since the leader process used to use
dedicated VacuumPage{Hit, Miss, Dirty} globals for the buffer usage
reporting, the worker's buffer usage was not included, leading to an
incorrect buffer usage report.
To fix the problem, this commit makes vacuum use pgBufferUsage
instruments for buffer usage reporting instead of VacuumPage{Hit,
Miss, Dirty} globals. These global variables are still used by ANALYZE
command and autoanalyze.
This also fixes the buffer usage report of vacuuming on temporary
tables, since the buffers dirtied by MarkLocalBufferDirty() were not
tracked by the VacuumPageDirty variable.
Parallel vacuum was introduced in 13, but the buffer usage reporting
for VACUUM command with the VERBOSE option was implemented in
15. So backpatch to 15.
Reported-by: Anthonin Bonnefoy
Author: Anthonin Bonnefoy
Reviewed-by: Alena Rybakina, Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_XqrQk+QZQcYs_C6nk0cMfHuUWk85vT9CrcA1NffFbAVE2A@mail.gmail.com
Backpatch-through: 15
|
|
|
|
|
|
|
|
|
| |
Repeating loads of inplace-updated fields tends to cause bugs like the
one from the previous commit. While there's no bug to fix in these code
sites, adopt the load-once style. This improves the chance of future
copy/paste finding the safe style.
Discussion: https://postgr.es/m/20240423003956.e7.nmisch@google.com
|
|
|
|
|
|
|
| |
Also, fix a comment incorrectly referencing the "streaming read API".
This was renamed to "read stream" shortly before being committed.
Discussion: https://postgr.es/m/CAApHDvq-2Zdqytm_Hf3RmVf0qg5PS9jTFAJ5QTc9xH9pwvwDTA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Certain cases involving the use of cursors had assertion failures within
_bt_preprocess_keys's recently added no-op return path. The assertion
in question made the faulty assumption that a second or third call to
_bt_preprocess_keys (within the same btrescan) could only happen when
another scheduled primitive index scan was just about to begin.
It would be possible to address the problem by only allowing scans that
have array keys to take the new no-op path, forcing affected cases to
perform redundant preprocessing work. It seems simpler to just remove
the assertion, and reframe the no-op path as a more general mechanism.
Take this simpler approach.
The important underlying principle is that we only need to perform
preprocessing once per btrescan (at most). This is expected regardless
of whether or not the scan happens to have array keys.
Oversight in commit 1b134ca5, which enhanced nbtree ScalarArrayOp
execution.
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/ef0f7c8b-a6fa-362e-6fd6-054950f947ca@gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
This led to spurious assertion failures in certain scenarios involving
pseudo types.
Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
Reported-By: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs48f5rDOwxaT76Zd40m7n9iGZQcjEk7vG_5p3YWNh6oPfA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimization for inserts into BRIN indexes added by c1ec02be1d79
relies on a cache that needs to be explicitly released after calling
index_insert(). The commit however failed to invoke the cleanup in
validate_index(), which calls index_insert() indirectly through
table_index_validate_scan().
After inspecting index_insert() callers, it seems unique_key_recheck()
is missing the call too.
Fixed by adding the two missing index_insert_cleanup() calls.
The commit does two additional improvements. The aminsertcleanup()
signature is modified to have the index as the first argument, to make
it more like the other AM callbacks. And the aminsertcleanup() callback
is invoked even if the ii_AmCache is NULL, so that it can decide if the
cleanup is necessary.
Author: Alvaro Herrera, Tomas Vondra
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/202401091043.e3nrqiad6gb7@alvherre.pgsql
|
|
|
|
|
|
|
|
| |
Typos introduced by commits c1ec02be1d79, b43757171470 and dae761a87eda.
Author: Alvaro Herrera
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/202401091043.e3nrqiad6gb7@alvherre.pgsql
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes various typos, duplicated words, and tiny bits of whitespace
mainly in code comments but also in docs.
Author: Daniel Gustafsson <daniel@yesql.se>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Preprocessing for nbtree index scans allowed array "input" scan keys
already marked eliminated during array-specific preprocessing to be
"fixed up" during preprocessing proper. This allowed eliminated scan
keys on DESC index columns to spurious have their strategy commuted,
causing assertion failures.
To fix, teach _bt_fix_scankey_strategy to ignore these scan keys. This
brings it in line with its only caller, _bt_preprocess_keys.
Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
Reported-By: Donghang Lin <donghanglin@gmail.com>
Discussion: https://postgr.es/m/CAA=D8a2sHK6CAzZ=0CeafC-Y-MFXbYxnRSHvZTi=+JHu6kAa8Q@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fefd9a3fed turned tail recursion of CommitTransactionCommand() and
AbortCurrentTransaction() into iteration. However, it splits the handling of
cases between different functions.
This commit puts the handling of all the cases into
AbortCurrentTransactionInternal() and CommitTransactionCommandInternal().
Now CommitTransactionCommand() and AbortCurrentTransaction() are just doing
the repeated calls of internal functions.
Reported-by: Andres Freund
Discussion: https://postgr.es/m/20240415224834.w6piwtefskoh32mv%40awork3.anarazel.de
Author: Andres Freund
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit b43757171470 added support for parallel builds of BRIN indexes,
using code similar to BTREE. But there were to be a couple unnecessary
differences, particularly in how the leader waits for the workers, and
merges the results. So remove these, to make the code more similar.
The leader never waited on the workersdonecv condition variable, but
simply called WaitForParallelWorkersToFinish() in _brin_end_parallel()
and then merged the per-worker results. This worked correctly, but it
seems better to do the wait and merge before _brin_end_parallel().
This commit moves the relevant code to _brin_parallel_heapscan/merge(),
which means _brin_end_parallel() remains responsible only for exiting
the parallel mode and accumulating WAL usage data.
Discussion: https://postgr.es/m/3733d042-71e1-6ae6-5fac-00c12db62db6@enterprisedb.com
|
|
|
|
| |
Oversight in commit c9c0589fda.
|
|
|
|
|
|
| |
This commit reverts 27bc1772fc and dd1f6b0c17. Per review by Andres Freund.
Discussion: https://postgr.es/m/20240415201057.khoyxbwwxfgzomeo%40awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 6bcda4a721 replaced PG_DETOAST_DATUM with PG_DETOAST_DATUM_PACKED
in two BRIN output functions, for minmax-multi and bloom opclasses. But
this is incorrect - the code is accessing the data through structs that
already include a 4B header, so the detoast needs to match that. But the
PACKED macro may keep the 1B header, which means the struct fields will
point to incorrect data.
Backpatch-through: 16
Discussion: https://postgr.es/m/1df00a66-db5a-4e66-809a-99b386a06d86%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Properly update the number of bits set in the bitmap after merging the
filters in brin_bloom_union.
This is mostly harmless, as the counter is used only in the output
function, which means pageinspect may show incorrect information about
the BRIN summary. The counter does not affect correctness.
Discovered while adding a regression test comparing indexes built with
and without parallelism. The parallel index builds exercise the union
procedure when merging results from workers, which is otherwise very
hard to do in a test. Which is why this went unnoticed until now.
Backpatch through 14, where the BRIN bloom opclasses were introduced.
Backpatch-through: 14
Discussion: https://postgr.es/m/1df00a66-db5a-4e66-809a-99b386a06d86%40enterprisedb.com
|
|
|
|
|
|
|
| |
This commit reverts 06c418e163, e37662f221, bf1e650806, 25f42429e2,
ee79928441, and 74eaf66f98 per review by Heikki Linnakangas.
Discussion: https://postgr.es/m/b155606b-e744-4218-bda5-29379779da1a%40iki.fi
|
|
|
|
|
|
| |
This commit reverts 02eb07ea89 per review by Andres Freund.
Discussion: https://postgr.es/m/20240410165236.rwyrny7ihi4ddxw4%40awork3.anarazel.de
|
|
|
|
|
|
| |
This commit reverts c35a3fb5e0 per review by Andres Freund.
Discussion: https://postgr.es/m/20240410165236.rwyrny7ihi4ddxw4%40awork3.anarazel.de
|
|
|
|
|
|
| |
This commit reverts 87985cc925 and 818861eb57 per review by Andres Freund.
Discussion: https://postgr.es/m/20240410165236.rwyrny7ihi4ddxw4%40awork3.anarazel.de
|
|
|
|
|
|
| |
This commit reverts b1484a3f19 per review by Andres Freund.
Discussion: https://postgr.es/m/20240410165236.rwyrny7ihi4ddxw4%40awork3.anarazel.de
|
|
|
|
|
|
| |
This commit reverts 9bd99f4c26 and 422041542f per review by Andres Freund.
Discussion: https://postgr.es/m/20240410165236.rwyrny7ihi4ddxw4%40awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
aa5edbe379d6 has tweaked _hash_freeovflpage() so as the write buffer's
LSN is updated only when necessary, when REGBUF_NO_CHANGE is not used.
The replay code was not consistent with that, causing the write buffer's
LSN to be updated and its page to be marked as dirty even if the buffer
was registered in a "clean" state. This was possible for the case of a
squeeze record when there are no tuples to add to the write buffer, for
(is_prim_bucket_same_wrt && !is_prev_bucket_same_wrt).
I have performed some validation of this commit with
wal_consistency_checking and a change in WAL that logs REGBUF_NO_CHANGE
to a new BKPIMAGE_*. Thanks to that, it is possible to know at replay
if a buffer was clean when it was registered, then cross-checked the LSN
of the "clean" page copy coming from WAL with the LSN of the block once
the record has been replayed. This eats one bit in bimg_info, which is
not acceptable to be integrated as-is, but it could become handy in the
future. I didn't spot other areas than the one fixed by this commit at
the extent of what the main regression test suite covers.
As this is an oversight in aa5edbe379d6, no backpatch is required.
Reported-by: Zubeyr Eryilmaz
Author: Hayato Kuroda
Reviewed-by: Amit Kapila, Michael Paquier
Discussion: https://postgr.es/m/ZbyVVG_7eW3YD5-A@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a C11 feature, and we require C99. While at it, go the further
step and get rid of the surrounding union (with uintptr_t) entirely,
as there is currently no use case for this file to access the header of
BlocktableEntry as a uintptr_t, and there are no additional alignment
requirements. The least invasive way seems to be to transfer the old
union name to this struct.
Reported by Pavel Borisov and Andres Freund, per buildfarm member mylodon
Reviewed by Pavel Borisov
Discussion: https://postgr.es/m/CALT9ZEH11NYV8AOzKb1bWhCf6J0H=H31f0MgT9xX+HdqvcA1rw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the decision to store values in leaves or within the child
pointer was made at compile time, with variable length values using
leaves by necessity. This commit allows introspecting the length of
variable length values at runtime for that decision. This requires
the ability to tell whether the last-level child pointer is actually
a value, so we use a pointer tag in the lowest level bit.
Use this in TID store. This entails adding a byte to the header to
reserve space for the tag. Commit f35bd9bf3 stores up to three offsets
within the header with no bitmap, and now the header can be embedded
as above. This reduces worst-case memory usage when TIDs are sparse.
Reviewed (in an earlier version) by Masahiko Sawada
Discussion: https://postgr.es/m/CANWCAZYw+_KAaUNruhJfE=h6WgtBKeDG32St8vBJBEY82bGVRQ@mail.gmail.com
Discussion: https://postgr.es/m/CAD21AoBci3Hujzijubomo1tdwH3XtQ9F89cTNQ4bsQijOmqnEw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
The header portion of BlocktableEntry has enough padding space for
an array of 3 offsets (1 on 32-bit platforms). Use this space instead
of having a sparse bitmap array. This will take up a constant amount
of space no matter what the offsets are.
Reviewed (in an earlier version) by Masahiko Sawada
Discussion: https://postgr.es/m/CANWCAZYw+_KAaUNruhJfE=h6WgtBKeDG32St8vBJBEY82bGVRQ@mail.gmail.com
Discussion: https://postgr.es/m/CAD21AoBci3Hujzijubomo1tdwH3XtQ9F89cTNQ4bsQijOmqnEw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
While keeping API the same, this commit provides a way for block-level table
AMs to re-use existing acquire_sample_rows() by providing custom callbacks
for getting the next block and the next tuple.
Reported-by: Andres Freund
Discussion: https://postgr.es/m/20240407214001.jgpg5q3yv33ve6y3%40awork3.anarazel.de
Reviewed-by: Pavel Borisov
|
|
|
|
|
|
| |
Reported-by: Thomas Munro
Reported-by: Pavel Borisov
Discussion: https://postgr.es/m/CA%2BhUKGLZzLR50RBvuqOO3MZ%3DF54ETz-rTp1PDX9uDGP_GqyYqA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let table AM define custom reloptions for its tables. This allows specifying
AM-specific parameters by the WITH clause when creating a table.
The reloptions, which could be used outside of table AM, are now extracted
into the CommonRdOptions data structure. These options could be by decision
of table AM directly specified by a user or calculated in some way.
The new test module test_tam_options evaluates the ability to set up custom
reloptions and calculate fields of CommonRdOptions on their base.
The code may use some parts from prior work by Hao Wu.
Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com
Discussion: https://postgr.es/m/AMUA1wBBBxfc3tKRLLdU64rb.1.1683276279979.Hmail.wuhao%40hashdata.cn
Reviewed-by: Reviewed-by: Pavel Borisov, Matthias van de Meent, Jess Davis
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vacuum does not pfree individual entries, and only frees the entire
storage space when finished with it. This allows using a bump context,
eliminating the chunk header in each leaf allocation. Most leaf
allocations will be 16 to 32 bytes, so that's a significant savings.
TidStoreCreateLocal gets a boolean parameter to indicate that the
created store is insert-only.
This requires a separate tree context for iteration, since we free
the iteration state after iteration completes.
Discussion: https://postgr.es/m/CANWCAZac%3DpBePg3rhX8nXkUuaLoiAJJLtmnCfZsPEAS4EtJ%3Dkg%40mail.gmail.com
Discussion: https://postgr.es/m/CANWCAZZQFfxvzO8yZHFWtQV+Z2gAMv1ku16Vu7KWmb5kZQyd1w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
In a97bbe1f1df I accidentally referenced heapgetpage(), both in a function
name and a comment. But since 44086b09753 the relevant function is named
heap_prepare_pagescan(). Rename the new function to page_collect_tuples().
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20240407172615.cocrsvboqm3ttqe4@awork3.anarazel.de
Discussion: https://postgr.es/m/CAApHDvp4SniHopTrVeKWcEvNXFtdki0utAvO=5R7H6TNhtULRQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
Declare with union similar to PGAlignedBlock.
Report and fix by Andres Freund
Discussion: https://postgr.es/m/20240407190731.izm3mdazednrsiqk%40awork3.anarazel.de
|
|
|
|
|
|
|
|
| |
One of the assertions was the subject of a false positive complaint from
Coverity, but none of the assertions added much, so get rid of them.
Reported-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3000247.1712537309@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ANALYZE command prefetches and reads sample blocks chosen by a
BlockSampler algorithm. Instead of calling [Prefetch|Read]Buffer() for
each block, ANALYZE now uses the streaming API introduced in b5a9b18cd0.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/flat/CAN55FZ0UhXqk9v3y-zW_fp4-WCp43V8y0A72xPmLkOM%2B6M%2BmJg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In one multixact.c edge case, we need a mechanism to wait for one
multixact offset to be written before being allowed to read the next
one. We used to handle this case by sleeping for one millisecond and
retrying, but such sleeps have been reported as problematic in
production cases. We can avoid the problem by using a condition
variable: readers sleep on it and then every creator of multixacts
broadcasts into the CV when creation is sufficiently far along.
Author: Kyotaro Horiguchi <horikyotajntt@gmail.com>
Reviewed-by: Andrey Borodin <amborodin@acm.org>
Discussion: https://postgr.es/m/47A598F4-B4E7-4029-8FEC-A06A6C3CB4B5@yandex-team.ru
Discussion: https://postgr.es/m/20200515.090333.24867479329066911.horikyota.ntt
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nbtree index scans with SAOP inequalities (but no SAOP equalities)
performed extra ORDER proc lookups for any remaining equality strategy
scan keys. This could waste cycles, and caused assertion failures.
Keeping around a separate ORDER proc is only necessary for a scan's
non-array/non-SAOP equality scan keys when the scan has at least one
other SAOP equality strategy key (a SAOP inequality shouldn't count).
To fix, replace _bt_preprocess_array_keys_final's assertion with a test
that makes the function return early when the scan has no SAOP equality
scan keys.
Oversight in commit 1b134ca5, which enhanced nbtree ScalarArrayOp
execution.
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/0539d3d3-a402-0a49-ed5e-26429dffc4bd@gmail.com
|