aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Fix GIN's shimTriConsistentFn to not corrupt its input.Tom Lane2025-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0f21db36d made an assumption that GIN triConsistentFns would not modify their input entryRes[] arrays. But in fact, the "shim" triConsistentFn that we use for opclasses that don't supply their own did exactly that, potentially leading to wrong answers from a GIN index search. Through bad luck, none of the test cases that we have for such opclasses exposed the bug. One response to this could be that the assumption of consistency check functions not modifying entryRes[] arrays is a bad one, but it still seems reasonable to me. Notably, shimTriConsistentFn is itself assuming that with respect to the underlying boolean consistentFn, so it's sure being self-centered in supposing that it gets to do so. Fortunately, it's quite simple to fix shimTriConsistentFn to restore the entry-time state of entryRes[], so let's do that instead. This issue doesn't affect any core GIN opclasses, since they all supply their own triConsistentFns. It does affect contrib modules btree_gin, hstore, and intarray. Along the way, I (tgl) noticed that shimTriConsistentFn failed to pick up on a "recheck" flag returned by its first call to the boolean consistentFn. This may be only a latent problem, since it would be unlikely for a consistentFn to set recheck for the all-false case and not any other cases. (Indeed, none of our contrib modules do that.) Nonetheless, it's formally wrong. Reported-by: Vinod Sridharan <vsridh90@gmail.com> Author: Vinod Sridharan <vsridh90@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com Backpatch-through: 13
* Reset InstallXLogFileSegmentActive after walreceiver self-initiated exit.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | | | | | | After commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 added this flag, failure to reset it caused assertion failures. In non-assert builds, it made the system fail to achieve the objectives listed in that commit; chiefly, we might emit a spurious log message. Back-patch to v15, where that commit first appeared. Bharath Rupireddy and Kyotaro Horiguchi. Reviewed by Dilip Kumar, Nathan Bossart and Michael Paquier. Reported by Dilip Kumar. This commit has been applied as of b4f584f9d2a1 in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me. Tests have been conducted by the both of us. Discussion: https://postgr.es/m/CAFiTN-sE3ry=ycMPVtC+Djw4Fd7gbUGVv_qqw6qfzp=JLvqT3g@mail.gmail.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Skip WAL recycling and preallocation during archive recovery.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous commit addressed the chief consequences of a race condition between InstallXLogFileSegment() and KeepFileRestoredFromArchive(). Fix three lesser consequences. A spurious durable_rename_excl() LOG message remained possible. KeepFileRestoredFromArchive() wasted the proceeds of WAL recycling and preallocation. Finally, XLogFileInitInternal() could return a descriptor for a file that KeepFileRestoredFromArchive() had already unlinked. That felt like a recipe for future bugs. This commit has been applied as of cc2c7d65fc27 in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me, Noah Misch is the author. Tests have been conducted by the both of us. Note that this commit is known to have introduced a regression of its own. This is fixed by the commit following this one, and not grouped in a single commit to keep the commit history consistent across all branches. Reported-by: Arun Thirupathi Author: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Don't ERROR on PreallocXlogFiles() race condition.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | | | | | | | Before a restartpoint finishes PreallocXlogFiles(), a startup process KeepFileRestoredFromArchive() call can unlink the preallocated segment. If a CHECKPOINT sql command had elicited the restartpoint experiencing the race condition, that sql command failed. Moreover, the restartpoint omitted its log_checkpoints message and some inessential resource reclamation. Prevent the ERROR by skipping open() of the segment. Since these consequences are so minor, no back-patch. This commit has been applied as of 2b3e4672f760 in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me, Noah Misch is the author. Tests have been conducted by the both of us. Reported-by: Arun Thirupathi Author: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Revert "Add HINT for restartpoint race with KeepFileRestoredFromArchive()."Michael Paquier2025-04-06
| | | | | | | | | | | | | | | | This reverts commit 8ad6c5dbbe5a, which was a commit specific to v14 and older branches as the race condition between restartpoints and KeepFileRestoredFromArchive() still existed. 1f95181b44c8 has worsened the situation on these two branches, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The same logic as v15 and newer versions will be applied in some follow-up commits to close this problem, making this HINT not necessary anymore. Reported-by: Arun Thirupathi Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Remove XLogFileInit() ability to unlink a pre-existing file.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | | | Only initdb used it. initdb refuses to operate on a non-empty directory and generally does not cope with pre-existing files of other kinds. Hence, use the opportunity to simplify. This commit has been applied as of 421484f79c0b in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me, Noah Misch is the author. Tests have been conducted by the both of us. Reported-by: Arun Thirupathi Author: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* In XLogFileInit(), fix *use_existent postcondition to suit callers.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | | | Infrequently, the mismatch caused log_checkpoints messages and TRACE_POSTGRESQL_CHECKPOINT_DONE() to witness an "added" count too high by one. Since that consequence is so minor, no back-patch. This commit has been applied as of 85656bc3050f in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me, Noah Misch is the author. Tests have been conducted by the both of us. Reported-by: Arun Thirupathi Author: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Remove XLogFileInit() ability to skip ControlFileLock.Noah Misch2025-04-06
| | | | | | | | | | | | | | | | | | Cold paths, initdb and end-of-recovery, used it. Don't optimize them. This commit has been applied as of c53c6b98d38a in v15 and newer versions. This is required on stable branches of v13 and v14 to fix a regression reported by Noah Misch, introduced by 1f95181b44c8, causing spurious failures in archive recovery (neither streaming nor archive recovery) with concurrent restartpoints. The backpatched versions of the patches have been aligned on these branches by me, Noah Misch is the author. Tests have been conducted by the both of us. Reported-by: Arun Thirupathi Author: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13
* Relax assertion in finding correct GiST parentHeikki Linnakangas2025-04-04
| | | | | | | | | | | | | | | | | | Commit 28d3c2ddcf introduced an assertion that if the memorized downlink location in the insertion stack isn't valid, the parent's LSN should've changed too. Turns out that was too strict. In gistFindCorrectParent(), if we walk right, we update the parent's block number and clear its memorized 'downlinkoffnum'. That triggered the assertion on next call to gistFindCorrectParent(), if the parent needed to be split too. Relax the assertion, so that it's OK if downlinkOffnum is InvalidOffsetNumber. Backpatch to v13-, all supported versions. The assertion was added in commit 28d3c2ddcf in v12. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://www.postgresql.org/message-id/18396-03cac9beb2f7aac3@postgresql.org
* BRIN: be more strict about required support procsÁlvaro Herrera2025-03-11
| | | | | | | | | | | | | | | | | | | With improperly defined operator classes, it's possible to get a Postgres crash because we'd try to invoke a procedure that doesn't exist. This is because the code is being a bit too trusting that the opclass is correctly defined. Add some ereport(ERROR)s for cases where mandatory support procedures are not defined, transforming the crashes into errors. The particular case that was reported is an incomplete opclass in PostGIS. Backpatch all the way down to 13. Reported-by: Tobias Wendorff <tobias.wendorff@tu-dortmund.de> Diagnosed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/fb6d9a35-6c8e-4869-af80-0a4944a793a4@tu-dortmund.de
* Fix some performance issues in GIN query startup.Tom Lane2025-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a GIN index search had a lot of search keys (for example, "jsonbcol ?| array[]" with tens of thousands of array elements), both ginFillScanKey() and startScanKey() took O(N^2) time. Worse, those loops were uncancelable for lack of CHECK_FOR_INTERRUPTS. The problem in ginFillScanKey() is the brute-force search key de-duplication done in ginFillScanEntry(). The most expedient solution seems to be to just stop trying to de-duplicate once there are "too many" search keys. We could imagine working harder, say by using a sort-and-unique algorithm instead of brute force compare-all-the-keys. But it seems unlikely to be worth the trouble. There is no correctness issue here, since the code already allowed duplicate keys if any extra_data is present. The problem in startScanKey() is the loop that attempts to identify the first non-required search key. In the submitted test case, that vainly tests all the key positions, and each iteration takes O(N) time. One part of that is that it's reinitializing the entryRes[] array from scratch each time, which is entirely unnecessary given that the triConsistentFn isn't supposed to scribble on its input. We can easily adjust the array contents incrementally instead. The other part of it is that the triConsistentFn may itself take O(N) time (and does in this test case). This is all extremely brute force: in simple cases with AND or OR semantics, we could know without any looping whatever that all or none of the keys are required. But GIN opclasses don't have any API for exposing that knowledge, so at least in the short run there is little to be done about that. Put in a CHECK_FOR_INTERRUPTS so that at least the loop is cancelable. These two changes together resolve the primary complaint that the test query doesn't respond promptly to cancel interrupts. Also, while they don't completely eliminate the O(N^2) behavior, they do provide quite a nice speedup for mid-sized examples. Bug: #18831 Reported-by: Niek <niek.brasa@hitachienergy.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18831-e845ac44ebc5dd36@postgresql.org Backpatch-through: 13
* At update of non-LP_NORMAL TID, fail instead of corrupting page header.Noah Misch2025-01-25
| | | | | | | | | | | | | | | | The right mix of DDL and VACUUM could corrupt a catalog page header such that PageIsVerified() durably fails, requiring a restore from backup. This affects only catalogs that both have a syscache and have DDL code that uses syscache tuples to construct updates. One of the test permutations shows a variant not yet fixed. This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with TM_Deleted. I think core and PGXN are indifferent to that. Per bug #17821 from Alexander Lakhin. Back-patch to v13 (all supported versions). The test case is v17+, since it uses INJECTION_POINT. Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
* Fix header check for continuation records where standbys could be stuckMichael Paquier2025-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XLogPageRead() checks immediately for an invalid WAL record header on a standby, to be able to handle the case of continuation records that need to be read across two different sources. As written, the check was too generic, applying to any target LSN. Based on an analysis by Kyotaro Horiguchi, what really matters is to make sure that the page header is checked when attempting to read a LSN at the boundary of a segment, to handle the case of a continuation record that spawns across multiple pages when dealing with multiple segments, as WAL receivers are spawned they request WAL from the beginning of a segment. This fix has been proposed by Kyotaro Horiguchi. This could cause standbys to loop infinitely when dealing with a continuation record during a timeline jump, in the case where the contents of the record in the follow-up page are invalid. Some regression tests are added to check such scenarios, able to reproduce the original problem. In the test, the contents of a continuation record are overwritten with junk zeros on its follow-up page, and replayed on standbys. This is inspired by 039_end_of_wal.pl, and is enough to show how standbys should react on promotion by not being stuck. Without the fix, the test would fail with a timeout. The test to reproduce the problem has been written by Alexander Kukushkin. The original check has been introduced in 066871980183, for a similar problem. Author: Kyotaro Horiguchi, Alexander Kukushkin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com Backpatch-through: 13
* Revert recent changes related to handling of 2PC files at recoveryMichael Paquier2025-01-17
| | | | | | | | | | | | | | | | | | | | This commit reverts 8f67f994e8ea (down to v13) and c3de0f9eed38 (down to v17), as these are proving to not be completely correct regarding two aspects: - In v17 and newer branches, c3de0f9eed38's check for epoch handling is incorrect, and does not correctly handle frozen epochs. A logic closer to widen_snapshot_xid() should be used. The 2PC code should try to integrate deeper with FullTransactionIds, 5a1dfde8334b being not enough. - In v13 and newer branches, 8f67f994e8ea is a workaround for the real issue, which is that we should not attempt CLOG lookups without reaching consistency. This exists since 728bd991c3c4, and this is reachable with ProcessTwoPhaseBuffer() called by restoreTwoPhaseData() at the beginning of recovery. Per discussion with Noah Misch. Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com Backpatch-through: 13
* Fix potential integer overflow in bringetbitmap()Michael Paquier2025-01-14
| | | | | | | | | | | | | | | | This function expects an "int64" as result and stores the number of pages to add to the index scan bitmap as an "int", multiplying its final result by 10. For a relation large enough, this can theoretically overflow if counting more than (INT32_MAX / 10) pages, knowing that the number of pages is upper-bounded by MaxBlockNumber. To avoid the overflow, this commit redefines "totalpages", used to calculate the result, to be an "int64" rather than an "int". Reported-by: Evgeniy Gorbanyov Author: James Hunter Discussion: https://www.postgresql.org/message-id/07704817-6fa0-460c-b1cf-cd18f7647041@basealt.ru Backpatch-through: 13
* Back-patch b1ffe3ff into REL_13_STABLE.Thomas Munro2025-01-09
| | | | | | | | | | | | | | | | | | | | | This is a cherry pick of 4c8e00ae from the 14 branch into the 13 branch. It avoids an assertion failure when ForwardSyncRequest() tries to allocate memory while trying to compact the queue, if run in a critical section. RelationTruncate() gained a critical section in 38c579b0, and could fail in that way in the 13 branch. b1ffe3ff originally fixed the same problem with TruncateMultiXact(), but for that case it only needed to go back as far as 14, where SLRUs started using the sync request queue. It also fixed a related deadlock risk that doesn't apply in this case (this case doesn't wait), but it might exist in theory and it doesn't hurt to keep the code the same as later branches. Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> (original commit) Reviewed-by: Michael Paquier <michael@paquier.xyz> (in this new context) Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru> Discussion: https://postgr.es/m/f98aaa79-80b5-47c9-832a-31416a3a528b%40postgrespro.ru
* Fix handling of orphaned 2PC files in the future at recoveryMichael Paquier2024-12-30
| | | | | | | | | | | | | | | | | | | | | | | | | | Before 728bd991c3c4, that has improved the support for 2PC files during recovery, the initial logic scanning files in pg_twophase was done so as files in the future of the transaction ID horizon were checked first, followed by a check if a transaction ID is aborted or committed which could involve a pg_xact lookup. After this commit, these checks have been done in reverse order. Files detected as in the future do not have a state that can be checked in pg_xact, hence this caused recovery to fail abruptly should an orphaned 2PC file in the future of the transaction ID horizon exist in pg_twophase at the beginning of recovery. A test is added to check for this scenario, using an empty 2PC with a transaction ID large enough to be in the future when running the test. This test is added in 16 and older versions for now. 17 and newer versions are impacted by a second bug caused by the addition of the epoch in the 2PC file names. An equivalent test will be added in these branches in a follow-up commit, once the second set of issues reported are fixed. Author: Vitaly Davydov, Michael Paquier Discussion: https://postgr.es/m/11e597-676ab680-8d-374f23c0@145466129 Backpatch-through: 13
* Exclude parallel workers from connection privilege/limit checks.Tom Lane2024-12-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cause parallel workers to not check datallowconn, rolcanlogin, and ACL_CONNECT privileges. The leader already checked these things (except for rolcanlogin which might have been checked for a different role). Re-checking can accomplish little except to induce unexpected failures in applications that might not even be aware that their query has been parallelized. We already had the principle that parallel workers rely on their leader to pass a valid set of authorization information, so this change just extends that a bit further. Also, modify the ReservedConnections, datconnlimit and rolconnlimit logic so that these limits are only enforced against regular backends, and only regular backends are counted while checking if the limits were already reached. Previously, background processes that had an assigned database or role were subject to these limits (with rather random exclusions for autovac workers and walsenders), and the set of existing processes that counted against each limit was quite haphazard as well. The point of these limits, AFAICS, is to ensure the availability of PGPROC slots for regular backends. Since all other types of processes have their own separate pools of PGPROC slots, it makes no sense either to enforce these limits against them or to count them while enforcing the limit. While edge-case failures of these sorts have been possible for a long time, the problem got a good deal worse with commit 5a2fed911 (CVE-2024-10978), which caused parallel workers to make some of these checks using the leader's current role where before we had used its AuthenticatedUserId, thus allowing parallel queries to fail after SET ROLE. The previous behavior was fairly accidental and I have no desire to return to it. This patch includes reverting 73c9f91a1, which was an emergency hack to suppress these same checks in some cases. It wasn't complete, as shown by a recent bug report from Laurenz Albe. We can also revert fd4d93d26 and 492217301, which hacked around the same problems in one regression test. In passing, remove the special case for autovac workers in CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's needed. Like 5a2fed911, back-patch to supported branches (which sadly no longer includes v12). Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
* Replace durable_rename_excl() by durable_rename(), take twoMichael Paquier2024-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | durable_rename_excl() attempts to avoid overwriting any existing files by using link() and unlink(), and it falls back to rename() on some platforms (aka WIN32), which offers no such overwrite protection. Most callers use durable_rename_excl() just in case there is an existing file, but in practice there shouldn't be one (see below for more details). Furthermore, failures during durable_rename_excl() can result in multiple hard links to the same file. As per Nathan's tests, it is possible to end up with two links to the same file in pg_wal after a crash just before unlink() during WAL recycling. Specifically, the test produced links to the same file for the current WAL file and the next one because the half-recycled WAL file was re-recycled upon restarting, leading to WAL corruption. This change replaces all the calls of durable_rename_excl() to durable_rename(). This removes the protection against accidentally overwriting an existing file, but some platforms are already living without it and ordinarily there shouldn't be one. The function itself is left around in case any extensions are using it. It will be removed on HEAD via a follow-up commit. Here is a summary of the existing callers of durable_rename_excl() (see second discussion link at the bottom), replaced by this commit. First, basic_archive used it to avoid overwriting an archive concurrently created by another server, but as mentioned above, it will still overwrite files on some platforms. Second, xlog.c uses it to recycle past WAL segments, where an overwrite should not happen (origin of the change at f0e37a8) because there are protections about the WAL segment to select when recycling an entry. The third and last area is related to the write of timeline history files. writeTimeLineHistory() will write a new timeline history file at the end of recovery on promotion, so there should be no such files for the same timeline. What remains is writeTimeLineHistoryFile(), that can be used in parallel by a WAL receiver and the startup process, and some digging of the buildfarm shows that EEXIST from a WAL receiver can happen with an error of "could not link file \"pg_wal/xlogtemp.NN\" to \"pg_wal/MM.history\", which would cause an automatic restart of the WAL receiver as it is promoted to FATAL, hence this should improve the stability of the WAL receiver as rename() would overwrite an existing TLI history file already fetched by the startup process at recovery. This is the second time this change is attempted, ccfbd92 being the first one, but this time no assertions are added for the case of a TLI history file written concurrently by the WAL receiver or the startup process because we can expect one to exist (some of the TAP tests are able to trigger with a proper timing). This commit has been originally applied on v16~ as of dac1ff30906b, and we have received more reports of this issue, where clusters can become corrupted at replay in older stable branches with multiple links pointing to the same physical WAL segment file. This backpatch addresses the problem for the v13~v15 range. Author: Nathan Bossart Reviewed-by: Robert Haas, Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/20220407182954.GA1231544@nathanxps13 Discussion: https://postgr.es/m/Ym6GZbqQdlalSKSG@paquier.xyz Discussion: https://postgr.es/m/CAJhEC04tBkYPF4q2uS_rCytauvNEVqdBAzasBEokfceFhF=KDQ@mail.gmail.com
* Revert "Handle better implicit transaction state of pipeline mode"Michael Paquier2024-11-28
| | | | | | | | | This reverts commit d77f91214fb7 on all stable branches, due to concerns regarding the compatility side effects this could create in a minor release. The change still exists on HEAD. Discussion: https://postgr.es/m/CA+TgmoZqRgeFTg4+Yf_CMRRXiHuNz1u6ZC4FvVk+rxw0RmOPnw@mail.gmail.com Backpatch-through: 13
* Handle better implicit transaction state of pipeline modeMichael Paquier2024-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using a pipeline, a transaction starts from the first command and is committed with a Sync message or when the pipeline ends. Functions like IsInTransactionBlock() or PreventInTransactionBlock() were already able to understand a pipeline as being in a transaction block, but it was not the case of CheckTransactionBlock(). This function is called for example to generate a WARNING for SET LOCAL, complaining that it is used outside of a transaction block. The current state of the code caused multiple problems, like: - SET LOCAL executed at any stage of a pipeline issued a WARNING, even if the command was at least second in line where the pipeline is in a transaction state. - LOCK TABLE failed when invoked at any step of a pipeline, even if it should be able to work within a transaction block. The pipeline protocol assumes that the first command of a pipeline is not part of a transaction block, and that any follow-up commands is considered as within a transaction block. This commit changes the backend so as an implicit transaction block is started each time the first Execute message of a pipeline has finished processing, with this implicit transaction block ended once a sync is processed. The checks based on XACT_FLAGS_PIPELINING in the routines checking if we are in a transaction block are not necessary: it is enough to rely on the existing ones. Some tests are added to pgbench, that can be backpatched down to v17 when \syncpipeline is involved and down to v14 where \startpipeline and \endpipeline are available. This is unfortunately limited regarding the error patterns that can be checked, but it provides coverage for various pipeline combinations to check if these succeed or fail. These tests are able to capture the case of SET LOCAL's WARNING. The author has proposed a different feature to improve the coverage by adding similar meta-commands to psql where error messages could be checked, something more useful for the cases where commands cannot be used in transaction blocks, like REINDEX CONCURRENTLY or VACUUM. This is considered as future work for v18~. Author: Anthonin Bonnefoy Reviewed-by: Jelte Fennema-Nio, Michael Paquier Discussion: https://postgr.es/m/CAO6_XqrWO8uNBQrSu5r6jh+vTGi5Oiyk4y8yXDORdE2jbzw8xw@mail.gmail.com Backpatch-through: 13
* Fix outdated bit in README.tuplockÁlvaro Herrera2024-11-21
| | | | | | | | | | | | Apparently this information has been outdated since first committed, because we adopted a different implementation during development per reviews and this detail was not updated in the README. This has been wrong since commit 0ac5ad5134f2 introduced the file in 2013. Backpatch to all live branches. Reported-by: Will Mortensen <will@extrahop.com> Discussion: https://postgr.es/m/CAMpnoC6yEQ=c0Rdq-J7uRedrP7Zo9UMp6VZyP23QMT68n06cvA@mail.gmail.com
* Fix improper interactions between session_authorization and role.Tom Lane2024-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SQL spec mandates that SET SESSION AUTHORIZATION implies SET ROLE NONE. We tried to implement that within the lowest-level functions that manipulate these settings, but that was a bad idea. In particular, guc.c assumes that it doesn't matter in what order it applies GUC variable updates, but that was not the case for these two variables. This problem, compounded by some hackish attempts to work around it, led to some security-grade issues: * Rolling back a transaction that had done SET SESSION AUTHORIZATION would revert to SET ROLE NONE, even if that had not been the previous state, so that the effective user ID might now be different from what it had been. * The same for SET SESSION AUTHORIZATION in a function SET clause. * If a parallel worker inspected current_setting('role'), it saw "none" even when it should see something else. Also, although the parallel worker startup code intended to cope with the current role's pg_authid row having disappeared, its implementation of that was incomplete so it would still fail. Fix by fully separating the miscinit.c functions that assign session_authorization from those that assign role. To implement the spec's requirement, teach set_config_option itself to perform "SET ROLE NONE" when it sets session_authorization. (This is undoubtedly ugly, but the alternatives seem worse. In particular, there's no way to do it within assign_session_authorization without incompatible changes in the API for GUC assign hooks.) Also, improve ParallelWorkerMain to directly set all the relevant user-ID variables instead of relying on some of them to get set indirectly. That allows us to survive not finding the pg_authid row during worker startup. In v16 and earlier, this includes back-patching 9987a7bf3 which fixed a violation of GUC coding rules: SetSessionAuthorization is not an appropriate place to be throwing errors from. Security: CVE-2024-10978
* Improve fix for not entering parallel mode when holding interrupts.Tom Lane2024-11-08
| | | | | | | | | | | | | | | | | | | | | | | Commit ac04aa84a put the shutoff for this into the planner, which is not ideal because it doesn't prevent us from re-using a previously made parallel plan. Revert the planner change and instead put the shutoff into InitializeParallelDSM, modeling it on the existing code there for recovering from failure to allocate a DSM segment. However, that code path is mostly untested, and testing a bit harder showed there's at least one bug: ExecHashJoinReInitializeDSM is not prepared for us to have skipped doing parallel DSM setup. I also thought the Assert in ReinitializeParallelWorkers is pretty ill-advised, and replaced it with a silent Min() operation. The existing test case added by ac04aa84a serves fine to test this version of the fix, so no change needed there. Patch by me, but thanks to Noah Misch for the core idea that we could shut off worker creation when !INTERRUPTS_CAN_BE_PROCESSED. Back-patch to v12, as ac04aa84a was. Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
* Revert "For inplace update, send nontransactional invalidations."Noah Misch2024-11-02
| | | | | | | | | | | | This reverts commit 95c5acb3fc261067ab65ddc0b2dca8e162f09442 (v17) and counterparts in each other non-master branch. If released, that commit would have caused a worst-in-years minor release regression, via undetected LWLock self-deadlock. This commit and its self-deadlock fix warrant more bake time in the master branch. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
* Revert "WAL-log inplace update before revealing it to other sessions."Noah Misch2024-11-02
| | | | | | | | This reverts commit bfd5c6e279c8e1702eea882439dc7ebdf4d4b3a5 (v17) and counterparts in each other non-master branch. This unblocks reverting a commit on which it depends. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
* Unpin buffer before inplace update waits for an XID to end.Noah Misch2024-10-29
| | | | | | | | | | | | Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. By keeping the pin during that wait, a sequence of autovacuum workers and an uncommitted GRANT starved one foreground LockBufferForCleanup() for six minutes, on buildfarm member sarus. Prevent, at the cost of a bit of complexity. Back-patch to v12, like the earlier commit. That commit and heap_inplace_lock() have not yet appeared in any release. Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
* WAL-log inplace update before revealing it to other sessions.Noah Misch2024-10-25
| | | | | | | | | | | | A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v12 (all supported versions). In v14 and earlier, this also back-patches the assertion removal from commit 7fcf2faf9c7dd473208fd6d5565f88d7f733782b. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com
* For inplace update, send nontransactional invalidations.Noah Misch2024-10-25
| | | | | | | | | | | | | | | | The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v12 (all supported versions). The back branch versions don't change WAL, because those branches just added end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Reviewed by Nitin Motiani and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
* At end of recovery, reset all sinval-managed caches.Noah Misch2024-10-25
| | | | | | | | | | | | | | | | An inplace update's invalidation messages are part of its transaction's commit record. However, the update survives even if its transaction aborts or we stop recovery before replaying its transaction commit. After recovery, a backend that started in recovery could update the row without incorporating the inplace update. That could result in a table with an index, yet relhasindex=f. That is a source of index corruption. This bulk invalidation avoids the functional consequences. A future change can fix the !RecoveryInProgress() scenario without changing the WAL format. Back-patch to v17 - v12 (all supported versions). v18 will instead add invalidations to WAL. Discussion: https://postgr.es/m/20240618152349.7f.nmisch@google.com
* Stop reading uninitialized memory in heap_inplace_lock().Noah Misch2024-10-24
| | | | | | | | | | Stop computing a never-used value. This removes the read; the read had no functional implications. Back-patch to v12, like commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com
* Fix race condition in COMMIT PREPARED causing orphaned 2PC filesMichael Paquier2024-10-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | COMMIT PREPARED removes on-disk 2PC files near its end, but the state checked if a file is on-disk or not gets read from shared memory while not holding the two-phase state lock. Because of that, there was a small window where a second backend doing a PREPARE TRANSACTION could reuse the GlobalTransaction put back into the 2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read afterwards by the COMMIT PREPARED to decide if its on-disk two-phase state file should be removed, preventing the file deletion. This commit fixes this issue so as the "ondisk" flag in the GlobalTransaction is read while holding the two-phase state lock, not from shared memory after its entry has been added to the free list. Orphaned two-phase state files flushed to disk after a checkpoint are discarded at the beginning of recovery. However, a truncation of pg_xact/ would make the startup process issue a FATAL when it cannot read the SLRU page holding the state of the transaction whose 2PC file was orphaned, which is a necessary step to decide if the 2PC file should be removed or not. Removing manually the file would be necessary in this case. Issue introduced by effe7d9552dd, so backpatch all the way down. Mea culpa. Author: wuchengwen Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com Backpatch-through: 12
* For inplace update durability, make heap_update() callers wait.Noah Misch2024-09-24
| | | | | | | | | | | | | | | | | | | The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
* Fix data loss at inplace update after heap_update().Noah Misch2024-09-24
| | | | | | | | | | | | | | | | | | | | | | As previously-added tests demonstrated, heap_inplace_update() could instead update an unrelated tuple of the same catalog. It could lose the update. Losing relhasindex=t was a source of index corruption. Inplace-updating commands like VACUUM will now wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a long-running GRANT already hurts VACUUM progress more just by keeping an XID running. The VACUUM will behave like a DELETE or UPDATE waiting for the uncommitted change. For implementation details, start at the systable_inplace_update_begin() header comment and README.tuplock. Back-patch to v12 (all supported versions). In back branches, retain a deprecated heap_inplace_update(), for extensions. Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier versions) Heikki Linnakangas, and (in earlier versions) Alexander Lakhin. Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
* Allow _h_indexbuild() to be interrupted.Tom Lane2024-09-13
| | | | | | | | | | | | | | When we are building a hash index that is large enough to need pre-sorting (larger than either maintenance_work_mem or NBuffers), the initial sorting phase is interruptible, but the insertion phase wasn't. Add the missing CHECK_FOR_INTERRUPTS(). Per bug #18616 from Alexander Lakhin. Back-patch to all supported branches. Pavel Borisov Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org
* Revert "Allow parallel workers to cope with a newly-created session user ID."Tom Lane2024-07-31
| | | | | | | | | | | | | | | | This reverts commit 216201027d90e99a0a2b2d2efba85dc0aac94c62. Some buildfarm animals are failing with "cannot change "client_encoding" during a parallel operation". It looks like assign_client_encoding is unhappy at being asked to roll back a client_encoding setting after a parallel worker encounters a failure. There must be more to it though: why didn't I see this during local testing? In any case, it's clear that moving the RestoreGUCState() call is not as side-effect-free as I thought. Given that the bug f5f30c22e intended to fix has gone unreported for years, it's not something that's urgent to fix; I'm not willing to risk messing with it further with only days to our next release wrap.
* Allow parallel workers to cope with a newly-created session user ID.Tom Lane2024-07-31
| | | | | | | | | | | | | | | | | | | | | | | | Parallel workers failed after a sequence like BEGIN; CREATE USER foo; SET SESSION AUTHORIZATION foo; because check_session_authorization could not see the uncommitted pg_authid row for "foo". This is because we ran RestoreGUCState() in a separate transaction using an ordinary just-created snapshot. The same disease afflicts any other GUC that requires catalog lookups and isn't forgiving about the lookups failing. To fix, postpone RestoreGUCState() into the worker's main transaction after we've set up a snapshot duplicating the leader's. This affects check_transaction_isolation and check_transaction_deferrable, which think they should only run during transaction start. Make them act like check_transaction_read_only, which already knows it should silently accept the value when InitializingParallelWorker. Per bug #18545 from Andrey Rachitskiy. Back-patch to all supported branches, because this has been wrong for awhile. Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
* Fix MVCC bug with prepared xact with subxacts on standbyHeikki Linnakangas2024-06-27
| | | | | | | | | | | | | | | | | | We did not recover the subtransaction IDs of prepared transactions when starting a hot standby from a shutdown checkpoint. As a result, such subtransactions were considered as aborted, rather than in-progress. That would lead to hint bits being set incorrectly, and the subtransactions suddenly becoming visible to old snapshots when the prepared transaction was committed. To fix, update pg_subtrans with prepared transactions's subxids when starting hot standby from a shutdown checkpoint. The snapshots taken from that state need to be marked as "suboverflowed", so that we also check the pg_subtrans. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
* Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY.Tom Lane2024-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may insert some REDIRECT tuples. This will typically happen in a transaction that lacks an XID, which leads either to assertion failure in spgFormDeadTuple or to insertion of a REDIRECT tuple with zero xid. The latter's not good either, since eventually VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid, resulting in either an assertion failure or a garbage answer. In practice, since REINDEX CONCURRENTLY locks out index scans till it's done, it doesn't matter whether it inserts REDIRECTs or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM reduces such a REDIRECT to a PLACEHOLDER. So in non-assert builds there's no observable problem here, other than perhaps a little index bloat. But it's not behaving as intended. To fix, remove the failing Assert in spgFormDeadTuple, acknowledging that we might sometimes insert a zero XID; and guard VACUUM's GlobalVisTestIsRemovableXid() call with a test for valid XID, ensuring that we'll reduce such a REDIRECT the first time VACUUM sees it. (Versions before v14 use TransactionIdPrecedes here, which won't fail on zero xid, so they really have no bug at all in non-assert builds.) Another solution could be to not create REDIRECTs at all during REINDEX CONCURRENTLY, making the relevant code paths treat that case like index build (which likewise knows that no concurrent index scans can be happening). That would allow restoring the Assert in spgFormDeadTuple, but we'd still need the VACUUM change because redirection tuples with zero xid may be out there already. But there doesn't seem to be a nice way for spginsert() to tell that it's being called in REINDEX CONCURRENTLY without some API changes, so we'll leave that as a possible future improvement. In HEAD, also rename the SpGistState.myXid field to redirectXid, which seems less misleading (since it might not in fact be our transaction's XID) and is certainly less uninformatively generic. Per bug #18499 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org
* Clamp result of MultiXactMemberFreezeThresholdHeikki Linnakangas2024-06-13
| | | | | | | | | | | | The purpose of the function is to reduce the effective autovacuum_multixact_freeze_max_age if the multixact members SLRU is approaching wraparound, to make multixid freezing more aggressive. The returned value should therefore never be greater than plain autovacuum_multixact_freeze_max_age. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/85fb354c-f89f-4d47-b3a2-3cbd461c90a3@iki.fi Backpatch-through: 12, all supported versions
* Promote assertion about !ReindexIsProcessingIndex to runtime error.Tom Lane2024-02-25
| | | | | | | | | | | | | | | | | | | | | When this assertion was installed (in commit d2f60a3ab), I thought it was only for catching server logic errors that caused accesses to catalogs that were undergoing index rebuilds. However, it will also fire in case of a user-defined index expression that attempts to access its own table. We occasionally see reports of people trying to do that, and typically getting unintelligible low-level errors as a result. We can provide a more on-point message by making this a regular runtime check. While at it, adjust the similar error check in systable_beginscan_ordered to use the same message text. That one is (probably) not reachable without a coding bug, but we might as well use a translatable message if we have one. Per bug #18363 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18363-e3598a5a572d0699@postgresql.org
* Fix locking when fixing an incomplete split of a GIN internal pageHeikki Linnakangas2024-01-29
| | | | | | | | | | | | | | | | | ginFinishSplit() expects the caller to hold an exclusive lock on the buffer, but when finishing an earlier "leftover" incomplete split of an internal page, the caller held a shared lock. That caused an assertion failure in MarkBufferDirty(). Without assertions, it could lead to corruption if two backends tried to complete the split at the same time. On master, add a test case using the new injection point facility. Report and analysis by Fei Changhong. Backpatch the fix to all supported versions. Reviewed-by: Fei Changhong, Michael Paquier Discussion: https://www.postgresql.org/message-id/tencent_A3CE810F59132D8E230475A5F0F7A08C8307@qq.com
* lwlock: Fix quadratic behavior with very long wait listsAndres Freund2024-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now LWLockDequeueSelf() sequentially searched the list of waiters to see if the current proc is still is on the list of waiters, or has already been removed. In extreme workloads, where the wait lists are very long, this leads to a quadratic behavior. #backends iterating over a list #backends long. Additionally, the likelihood of needing to call LWLockDequeueSelf() in the first place also increases with the increased length of the wait queue, as it becomes more likely that a lock is released while waiting for the wait list lock, which is held for longer during lock release. Due to the exponential back-off in perform_spin_delay() this is surprisingly hard to detect. We should make that easier, e.g. by adding a wait event around the pg_usleep() - but that's a separate patch. The fix is simple - track whether a proc is currently waiting in the wait list or already removed but waiting to be woken up in PGPROC->lwWaiting. In some workloads with a lot of clients contending for a small number of lwlocks (e.g. WALWriteLock), the fix can substantially increase throughput. This has been originally fixed for 16~ with a4adc31f6902 without a backpatch, and we have heard complaints from users impacted by this quadratic behavior in older versions as well. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/20221027165914.2hofzp4cvutj6gin@awork3.anarazel.de Discussion: https://postgr.es/m/CALj2ACXktNbG=K8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag@mail.gmail.com Backpatch-through: 12
* Avoid trying to fetch metapage of an SPGist partitioned index.Tom Lane2023-12-21
| | | | | | | | | | | | | | | | | | | This is necessary when spgcanreturn() is invoked on a partitioned index, and the failure might be reachable in other scenarios as well. The rest of what spgGetCache() does is perfectly sensible for a partitioned index, so we should allow it to go through. I think the main takeaway from this is that we lack sufficient test coverage for non-btree partitioned indexes. Therefore, I added simple test cases for brin and gin as well as spgist (hash and gist AMs were covered already in indexing.sql). Per bug #18256 from Alexander Lakhin. Although the known test case only fails since v16 (3c569049b), I've got no faith at all that there aren't other ways to reach this problem; so back-patch to all supported branches. Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
* Prevent tuples to be marked as dead in subtransactions on standbysMichael Paquier2023-12-12
| | | | | | | | | | | | | | | | | | | | | | | Dead tuples are ignored and are not marked as dead during recovery, as it can lead to MVCC issues on a standby because its xmin may not match with the primary. This information is tracked by a field called "xactStartedInRecovery" in the transaction state data, switched on when starting a transaction in recovery. Unfortunately, this information was not correctly tracked when starting a subtransaction, because the transaction state used for the subtransaction did not update "xactStartedInRecovery" based on the state of its parent. This would cause index scans done in subtransactions to return inconsistent data, depending on how the xmin of the primary and/or the standby evolved. This is broken since the introduction of hot standby in efc16ea52067, so backpatch all the way down. Author: Fei Changhong Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com Backpatch-through: 12
* Fix compilation on Windows with WAL_DEBUGMichael Paquier2023-12-06
| | | | | | | | | | This has been broken since b060dbe0001a that has reworked the callback mechanism of XLogReader, most likely unnoticed because any form of development involving WAL happens on platforms where this compiles fine. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com Backpatch-through: 13
* Fix assertions with RI triggers in heap_update and heap_delete.Heikki Linnakangas2023-11-28
| | | | | | | | | | | | If the tuple being updated is not visible to the crosscheck snapshot, we return TM_Updated but the assertions would not hold in that case. Move them to before the cross-check. Fixes bug #17893. Backpatch to all supported versions. Author: Alexander Lakhin Backpatch-through: 12 Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org
* Don't release index root page pin in ginFindParents().Tom Lane2023-11-13
| | | | | | | | | | | | | | | | | | | | | It's clearly stated in the comments that ginFindParents() must keep the pin on the index's root page that's associated with the topmost GinBtreeStack item. However, the code path for the case that the desired downlink has been pushed down to the next index level ignored this proviso, and would release the pin anyway if we were still examining the root level. That led to an assertion failure or "buffer NNNN is not owned by resource owner" error later, when we try to release the pin again at the end of the insertion. This is quite hard to reproduce, since it can only happen if an index root page split occurs concurrently with our own insertion. Thanks to Jeff Janes for finding a test case that triggers it often enough to allow investigation. This has been there since the beginning of GIN, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAMkU=1yCAKtv86dMrD__Ja-7KzjE=uMeKX8y__cx5W-OEWy2ow@mail.gmail.com
* Ensure we use the correct spelling of "ensure"David Rowley2023-11-10
| | | | | | | | | We seem to have accidentally used "insure" in a few places. Correct that. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv0biqrhA3pMhu40aDsj343mTsD75khKnHsLqR8P04f=Q@mail.gmail.com Backpatch-through: 12, oldest supported version
* doc: 1-byte varlena headers can be used for user PLAIN storageBruce Momjian2023-10-31
| | | | | | | | | | | | This also updates some C comments. Reported-by: suchithjn22@gmail.com Discussion: https://postgr.es/m/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org Author: Laurenz Albe (doc patch) Backpatch-through: 11