postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Fix race condition in COMMIT PREPARED causing orphaned 2PC files	Michael Paquier	2024-10-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	COMMIT PREPARED removes on-disk 2PC files near its end, but the state checked if a file is on-disk or not gets read from shared memory while not holding the two-phase state lock. Because of that, there was a small window where a second backend doing a PREPARE TRANSACTION could reuse the GlobalTransaction put back into the 2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read afterwards by the COMMIT PREPARED to decide if its on-disk two-phase state file should be removed, preventing the file deletion. This commit fixes this issue so as the "ondisk" flag in the GlobalTransaction is read while holding the two-phase state lock, not from shared memory after its entry has been added to the free list. Orphaned two-phase state files flushed to disk after a checkpoint are discarded at the beginning of recovery. However, a truncation of pg_xact/ would make the startup process issue a FATAL when it cannot read the SLRU page holding the state of the transaction whose 2PC file was orphaned, which is a necessary step to decide if the 2PC file should be removed or not. Removing manually the file would be necessary in this case. Issue introduced by effe7d9552dd, so backpatch all the way down. Mea culpa. Author: wuchengwen Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com Backpatch-through: 12
*	Add num_done counter to the pg_stat_checkpointer view.	Fujii Masao	2024-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Checkpoints can be skipped when the server is idle. The existing num_timed and num_requested counters in pg_stat_checkpointer track both completed and skipped checkpoints, but there was no way to count only the completed ones. This commit introduces the num_done counter, which tracks only completed checkpoints, making it easier to see how many were actually performed. Bump catalog version. Author: Anton A. Melnikov Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
*	Set query ID in parallel workers for vacuum, BRIN and btree	Michael Paquier	2024-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All these code paths use their own entry point when starting parallel workers, but failed to set a query ID, even if they set a text query. Hence, this data would be missed in pg_stat_activity for the worker processes. The main entry point for parallel query processing, ParallelQueryMain(), is already doing that by saving its query ID in a dummy PlannedStmt, but not the others. The code is changed so as the query ID of these queries is set in their shared state, and reported back once the parallel workers start. Some tests are added to show how the failures can happen for btree and BRIN with a parallel build enforced, which are able to trigger a failure in an assertion added by 24f520594809 in the recovery TAP test 027_stream_regress.pl where pg_stat_statements is always loaded. In this case, the executor path was taken because the index expression needs to be flattened when building its IndexInfo. Alexander Lakhin has noticed the problem in btree, and I have noticed that the issue was more spread. This is arguably a bug, but nobody has complained about that until now, so no backpatch is done out of caution. If folks would like to see a backpatch, well, let me know. Reported-by: Alexander Lakhin Reviewed-by: Sami Imseih Discussion: https://postgr.es/m/cf3547c1-498a-6a61-7b01-819f902a251f@gmail.com
*	For inplace update durability, make heap_update() callers wait.	Noah Misch	2024-09-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
*	Fix data loss at inplace update after heap_update().	Noah Misch	2024-09-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As previously-added tests demonstrated, heap_inplace_update() could instead update an unrelated tuple of the same catalog. It could lose the update. Losing relhasindex=t was a source of index corruption. Inplace-updating commands like VACUUM will now wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a long-running GRANT already hurts VACUUM progress more just by keeping an XID running. The VACUUM will behave like a DELETE or UPDATE waiting for the uncommitted change. For implementation details, start at the systable_inplace_update_begin() header comment and README.tuplock. Back-patch to v12 (all supported versions). In back branches, retain a deprecated heap_inplace_update(), for extensions. Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier versions) Heikki Linnakangas, and (in earlier versions) Alexander Lakhin. Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
*	Update obsolete nbtree array preprocessing comments.	Peter Geoghegan	2024-09-24
\| \| \| \| \| \| \| \| \| \| \| \|	The array->scan_key references fixed up at the end of preprocessing start out as offsets into the arrayKeyData[] array (the array returned by _bt_preprocess_array_keys at the start of preprocessing that involves array scan keys). Offsets into the arrayKeyData[] array are no longer guaranteed to be valid offsets into our original scan->keyData[] input scan key array, but comments describing the array->scan_key references still talked about scan->keyData[]. Update those comments. Oversight in commit b5249741.
*	Refactor handling of nbtree array redundancies.	Peter Geoghegan	2024-09-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Teach _bt_preprocess_array_keys to eliminate redundant array equality scan keys directly, rather than just marking them as redundant. Its _bt_preprocess_keys caller is no longer required to ignore input scan keys that were marked redundant in this way. Oversights like the one fixed by commit f22e17f7 are no longer possible. The new scheme also makes it easier for _bt_preprocess_keys to output a so.keyData[] scan key array with _more_ scan keys than it was passed in its scan.keyData[] input scan key array. An upcoming patch that adds skip scan optimizations to nbtree will take advantage of this. In passing, remove and rename certain _bt_preprocess_keys variables to make the difference between our input scan key array and our output scan key array clearer. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com
*	Improve Asserts checking relation matching in parallel scans.	Tom Lane	2024-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	table_beginscan_parallel and index_beginscan_parallel contain Asserts checking that the relation a worker will use in a parallel scan is the same one the leader intended. However, they were checking for relation OID match, which was not strong enough to detect the mismatch problem fixed in 126ec0bc7. What would be strong enough is to compare relfilenodes instead. Arguably, that's a saner definition anyway, since a scan surely operates on a physical relation not a logical one. Hence, store and compare RelFileLocators not relation OIDs. Also ensure that index_beginscan_parallel checks the index identity not just the table identity. Discussion: https://postgr.es/m/2127254.1726789524@sss.pgh.pa.us
*	Fix nbtree pgstats accounting with parallel scans.	Peter Geoghegan	2024-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made parallel index scans work with the new design for arrays via explicit scheduling of primitive index scans. Under this scheme a parallel index scan with array keys will perform the same number of index descents as an equivalent serial index scan (barring corner cases where an individual parallel worker discovers that it can advance the scan's array keys without anybody needing to perform another descent of the index to get to the relevant page on the leaf level). Despite all this, the pgstats accounting wasn't updated; it continued to increment the total number of index scans for the rel once per _bt_first call, no matter the details. As a result, the number of (primitive) index scans could be over-counted during parallel scans. To fix, delay incrementing the count of index scans until after we've established that another descent of the index (using either _bt_search or _bt_endpoint) is required. That way pg_stat_user_tables.idx_scan always advances in the same way, regardless of whether or not the scan makes use of parallelism. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=E7XrkvscBN0U6V81NK3Q-dQOmivvbEsjG-zwEfDdFpg@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com Backpatch: 17-, where nbtree SAOP execution was enhanced.
*	Restore relmapper state early enough in parallel workers.	Tom Lane	2024-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to do RestoreRelationMap before loading catalog-derived state, else the worker may end up with catalog relcache entries containing stale relfilenode data. Move up RestoreReindexState too, on the principle that that should also happen before we do much of any catalog access. I think ideally these things would happen even before InitPostgres, but there are various problems standing in the way of that, notably that the relmapper thinks "active" mappings should be discarded at transaction end. The implication of this is that InitPostgres and RestoreLibraryState will see the same catalog state as an independent backend would see, which is probably fine; at least, it's been like that all along. Per report from Justin Pryzby. There is a case to be made that this should be back-patched. But given the lack of complaints before 6e086fa2e and the short amount of time remaining before 17.0 wraps, I'll just put it in HEAD for now. Discussion: https://postgr.es/m/ZuoU_8EbSTE14o1U@pryzbyj2023
*	Move pg_wal_replay_wait() to xlogfuncs.c	Alexander Korotkov	2024-09-19
\| \| \| \| \| \| \| \| \| \| \|	This commit moves pg_wal_replay_wait() procedure to be a neighbor of WAL-related functions in xlogfuncs.c. The implementation of LSN waiting continues to reside in the same place. By proposal from Michael Paquier. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
*	Extend PgStat_HashKey.objid from 4 to 8 bytes	Michael Paquier	2024-09-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This opens the possibility to define keys for more types of statistics kinds in PgStat_HashKey, the first case being 8-byte query IDs for statistics like pg_stat_statements. This increases the size of PgStat_HashKey from 12 to 16 bytes, while PgStatShared_HashEntry, entry stored in the dshash for pgstats, keeps the same size due to alignment. xl_xact_stats_item, that tracks the stats items to drop in commit WAL records, is increased from 12 to 16 bytes. Note that individual chunks in commit WAL records should be multiples of sizeof(int), hence 8-byte object IDs are stored as two uint32, based on a suggestion from Heikki Linnakangas. While on it, the field of PgStat_HashKey is renamed from "objoid" to "objid", as for some stats kinds this field does not refer to OIDs but just IDs, like for replication slot stats. This commit bumps the following format variables: - PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file for non-serialized stats kinds in the dshash table. - XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item. - Catalog version, for the SQL function pg_stat_have_stats(). Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/ZsvTS9EW79Up8I62@paquier.xyz
*	Minor cleanup related to pg_wal_replay_wait() procedure	Alexander Korotkov	2024-09-17
\| \| \| \| \| \| \| \| \| \| \| \|	* Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl as there is only one standby. * Remove useless debug printing in 043_wal_replay_wait.pl. * Fix typo in one check description in 043_wal_replay_wait.pl. * Fix some wording in comments and documentation. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com Reviewed-by: Alexander Lakhin
*	Avoid parallel nbtree index scan hangs with SAOPs.	Peter Geoghegan	2024-09-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made parallel index scans work with the new design for arrays via explicit scheduling of primitive index scans. A backend that successfully scheduled the scan's next primitive index scan saved its backend local array keys in shared memory. Any backend could pick up the scheduled primitive scan within _bt_first. This scheme decouples scheduling a primitive scan from starting the scan (by performing another descent of the index via a _bt_search call from _bt_first) to make things robust. The scheme had a deadlock hazard, at least when the leader process participated in the scan. _bt_parallel_seize had a code path that made backends that were not in an immediate position to start a scheduled primitive index scan wait for some other backend to do so instead. Under the right circumstances, the leader process could wait here forever: the leader would wait for any other backend to start the primitive scan, while every worker was busy waiting on the leader to consume tuples from the scan's tuple queue. To fix, don't wait for a scheduled primitive index scan to be started by some other eligible backend from within _bt_parallel_seize (when the calling backend isn't in a position to do so itself). Return false instead, while recording that the scan has a scheduled primitive index scan in backend local state. This leaves the backend in the same state as the existing case where a backend schedules (or tries to schedule) another primitive index scan from within _bt_advance_array_keys, before calling _bt_parallel_seize. _bt_parallel_seize already handles that case by returning false without waiting, and without unsetting the backend local state. Leaving the backend in this state enables it to start a previously scheduled primitive index scan once it gets back to _bt_first. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Matthias van de Meent, with tweaks by me. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reported-By: Tomas Vondra <tomas@vondra.me> Reviewed-By: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WzmMGaPa32u9x_FvEbPTUkP5e95i=QxR8054nvCRydP-sw@mail.gmail.com Backpatch: 17-, where nbtree SAOP execution was enhanced.
*	Add temporal PRIMARY KEY and UNIQUE constraints	Peter Eisentraut	2024-09-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints. These are backed by GiST indexes instead of B-tree indexes, since they are essentially exclusion constraints with = for the scalar parts of the key and && for the temporal part. (previously committed as 46a0cd4cefb, reverted by 46a0cd4cefb; the new part is this:) Because 'empty' && 'empty' is false, the temporal PK/UQ constraint allowed duplicates, which is confusing to users and breaks internal expectations. For instance, when GROUP BY checks functional dependencies on the PK, it allows selecting other columns from the table, but in the presence of duplicate keys you could get the value from any of their rows. So we need to forbid empties. This all means that at the moment we can only support ranges and multiranges for temporal PK/UQs, unlike the original patch (above). Documentation and tests for this are added. But this could conceivably be extended by introducing some more general support for the notion of "empty" for other types. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
*	Add stratnum GiST support function	Peter Eisentraut	2024-09-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is support function 12 for the GiST AM and translates "well-known" RT*StrategyNumber values into whatever strategy number is used by the opclass (since no particular numbers are actually required). We will use this to support temporal PRIMARY KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality. This commit adds two implementations, one for internal GiST opclasses (just an identity function) and another for btree_gist opclasses. It updates btree_gist from 1.7 to 1.8, adding the support function for all its opclasses. (previously committed as 6db4598fcb8, reverted by 8aee330af55; this is essentially unchanged from those) Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
*	Allow _h_indexbuild() to be interrupted.	Tom Lane	2024-09-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we are building a hash index that is large enough to need pre-sorting (larger than either maintenance_work_mem or NBuffers), the initial sorting phase is interruptible, but the insertion phase wasn't. Add the missing CHECK_FOR_INTERRUPTS(). Per bug #18616 from Alexander Lakhin. Back-patch to all supported branches. Pavel Borisov Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org
*	Simplify checks for deterministic collations.	Jeff Davis	2024-09-12
\| \| \| \| \| \| \| \| \| \| \| \|	Remove redundant checks for locale->collate_is_c now that we always have a valid pg_locale_t. Also, remove pg_locale_deterministic() wrapper, which is no longer useful after commit e9931bfb75. Just check the field directly, consistent with other fields in pg_locale_t. Author: Andreas Karlsson Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
*	Remove hardcoded hash opclass function signature exceptions	Peter Eisentraut	2024-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	hashvalidate(), which validates the signatures of support functions for the hash AM, contained several hardcoded exceptions. For example, hash/date_ops support function 1 was hashint4(), which would ordinarily fail validation because the function argument is int4, not date. But this works internally because int4 and date are of the same size. There are several more exceptions like this that happen to work and were allowed historically but would now fail the function signature validation. This patch removes those exceptions by providing new support functions that have the proper declared signatures. They internally share most of the code with the "wrong" functions they replace, so the behavior is still the same. With the exceptions gone, hashvalidate() is now simplified and relies fully on check_amproc_signature(). hashvarlena() and hashvarlenaextended() are kept in pg_proc.dat because some extensions currently use them to build hash functions for their own types, and we need to keep exposing these functions as "LANGUAGE internal" functions for that to continue to work. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/29c3b746-69e7-482a-b37c-dbbf7e5b009b@eisentraut.org
*	Don't overwrite scan key in systable_beginscan()	Peter Eisentraut	2024-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When systable_beginscan() and systable_beginscan_ordered() choose an index scan, they remap the attribute numbers in the passed-in scan keys to the attribute numbers of the index, and then write those remapped attribute numbers back into the scan key passed by the caller. This second part is surprising and gratuitous. It means that a scan key cannot safely be used more than once (but it might sometimes work, depending on circumstances). Also, there is no value in providing these remapped attribute numbers back to the caller, since they can't do anything with that. Fix that by making a copy of the scan keys passed by the caller and make the modifications there. Also, some code that had to work around the previous situation is simplified. Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
*	Move logic related to WAL replay of Heap/Heap2 into its own file	Michael Paquier	2024-09-12
\| \| \| \| \| \| \| \| \| \| \| \| \|	This brings more clarity to heapam.c, by cleanly separating all the logic related to WAL replay and the rest of Heap and Heap2, similarly to other RMGRs like hash, btree, etc. The header reorganization is also nice in heapam.c, cutting half of the headers required. Author: Li Yong Reviewed-by: Sutou Kouhei, Michael Paquier Discussion: https://postgr.es/m/EFE55E65-D7BD-4C6A-B630-91F43FD0771B@ebay.com
*	Replace gratuitous memmove() with memcpy()	Peter Eisentraut	2024-09-11
\| \| \| \| \| \| \| \| \| \|	The index access methods all had similar code that copied the passed-in scan keys to local storage. They all used memmove() for that, which is not wrong, but it seems confusing not to use memcpy() when that would work. Presumably, this was all once copied from ancient code and never adjusted. Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
*	Add amgettreeheight index AM API routine	Peter Eisentraut	2024-09-10
\| \| \| \| \| \| \| \| \| \| \|	The only current implementation is for btree where it calls _bt_getrootheight(). Other index types can now also use this to pass information to their amcostestimate routine. Previously, btree was hardcoded and other index types could not hook into the optimizer at this point. Author: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
*	Remove emode argument from XLogFileRead() and XLogFileReadAnyTLI()	Michael Paquier	2024-09-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change makes the code slightly easier to reason about, because there is actually no need to know if a specific caller of one of these routines should fail hard on a PANIC, or just let it go through with a DEBUG2. The only caller of XLogFileReadAnyTLI() used DEBUG2, and XLogFileRead() has never used its emode. This can be simplified since 1bb2558046cc that has introduced XLogFileReadAnyTLI(), splitting both. Author: Yugo Nagata Discussion: https://postgr.es/m/20240906201043.a640f3b44e755d4db2b6943e@sraoss.co.jp
*	Remove lc_collate_is_c().	Jeff Davis	2024-09-04
\| \| \| \| \| \| \|	Instead just look up the collation and check collate_is_c field. Author: Andreas Karlsson Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
*	Unify some error messages to ease work of translators	Michael Paquier	2024-09-04
\| \| \| \| \| \| \| \| \|	This commit updates a couple of error messages around control file data, GUCs and server settings, unifying to the same message where possible. This reduces the translation burden a bit. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
*	Apply more quoting to GUC names in messages	Michael Paquier	2024-09-04
\| \| \| \| \| \| \| \| \|	This is a continuation of 17974ec25946. More quotes are applied to GUC names in error messages and hints, taking care of what seems to be all the remaining holes currently in the tree for the GUCs. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
*	Standardize "read-ahead advice" terminology.	Thomas Munro	2024-09-04
\| \| \| \| \| \| \| \| \|	Commit 6654bb920 added macOS's equivalent of POSIX_FADV_WILLNEED, and changed some explicit references to posix_fadvise to use this more general name for the concept. Update some remaining references. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
*	Add const qualifiers to XLogRegister*() functions	Peter Eisentraut	2024-09-03
\| \| \| \| \| \| \| \|	Add const qualifiers to XLogRegisterData() and XLogRegisterBufData(). Several unconstify() calls can be removed. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/dd889784-9ce7-436a-b4f1-52e4a5e577bd@eisentraut.org
*	Fix typos and grammar in code comments and docs	Michael Paquier	2024-09-03
\| \| \| \| \|	Author: Alexander Lakhin Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
*	Define PG_TBLSPC_DIR for path pg_tblspc/ in data folder	Michael Paquier	2024-09-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similarly to 2065ddf5e34c, this introduces a define for "pg_tblspc". This makes the style more consistent with the existing PG_STAT_TMP_DIR, for example. There is a difference with the other cases with the introduction of PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups. Author: Bertrand Drouvot Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael Paquier Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
*	Define PG_LOGICAL_DIR for path pg_logical/ in data folder	Michael Paquier	2024-08-30
\| \| \| \| \| \| \| \| \| \|	This is similar to 2065ddf5e34c, but this time for pg_logical/ itself and its contents, like the paths for snapshots, mappings or origin checkpoints. Author: Bertrand Drouvot Reviewed-by: Ashutosh Bapat, Yugo Nagata, Michael Paquier Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
*	Message style improvements	Peter Eisentraut	2024-08-29
\|
*	Fix nbtree lookahead overflow bug.	Peter Geoghegan	2024-08-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add bounds checking to nbtree's lookahead/skip-within-a-page mechanism. Otherwise it's possible for cases with lots of before-array-keys tuples to overflow an int16 variable, causing the mechanism to generate an out of bounds page offset number. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Reported-By: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/6c68ac42-bbb5-8b24-103e-af0e279c536f@gmail.com Backpatch: 17-, where nbtree SAOP execution was enhanced.
*	Rework new SLRU test with injection points	Michael Paquier	2024-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than the SQL injection_points_load(), this commit changes the injection point test introduced in 768a9fd5535f to rely on the two macros INJECTION_POINT_LOAD() and INJECTION_POINT_CACHED(), that have been originally introduced for the sake of this test. This runs the test as a two-step process: load the injection point, then run its callback directly from the local cache loaded. What the test did originally was also fine, but the point here is to have an example in core of how to use these new macros. While on it, fix the header ordering in multixact.c, as pointed out by Alexander Korotkov. This was an oversight in 768a9fd5535f. Per discussion with Álvaro Herrera. Author: Michael Paquier Discussion: https://postgr.es/m/ZsUnJUlSOBNAzwW1@paquier.xyz Discussion: https://postgr.es/m/CAPpHfduzaBz7KMhwuVOZMTpG=JniPG4aUosXPZCxZydmzq_oEQ@mail.gmail.com
*	Add injection-point test for new multixact CV usage	Alvaro Herrera	2024-08-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before commit a0e0fb1ba56f, multixact.c contained a case in the multixact-read path where it would loop sleeping 1ms each time until another multixact-create path completed, which was uncovered by any tests. That commit changed the code to rely on a condition variable instead. Add a test now, which relies on injection points and "loading" thereof (because of it being in a critical section), per commit 4b211003ecc2. Author: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Michaël Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru
*	Log the conflicts while applying changes in logical replication.	Amit Kapila	2024-08-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch provides the additional logging information in the following conflict scenarios while applying changes: insert_exists: Inserting a row that violates a NOT DEFERRABLE unique constraint. update_differ: Updating a row that was previously modified by another origin. update_exists: The updated row value violates a NOT DEFERRABLE unique constraint. update_missing: The tuple to be updated is missing. delete_differ: Deleting a row that was previously modified by another origin. delete_missing: The tuple to be deleted is missing. For insert_exists and update_exists conflicts, the log can include the origin and commit timestamp details of the conflicting key with track_commit_timestamp enabled. update_differ and delete_differ conflicts can only be detected when track_commit_timestamp is enabled on the subscriber. We do not offer additional logging for exclusion constraint violations because these constraints can specify rules that are more complex than simple equality checks. Resolving such conflicts won't be straightforward. This area can be further enhanced if required. Author: Hou Zhijie Reviewed-by: Shveta Malik, Amit Kapila, Nisha Moond, Hayato Kuroda, Dilip Kumar Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
*	Fix more holes with SLRU code in need of int64 for segment numbers	Michael Paquier	2024-08-19
\| \| \| \| \| \| \| \| \| \| \| \| \|	This is a continuation of c9e24573905b, containing changes included into the proposed patch that have been missed in the actual commit. I have managed to miss these diffs while doing a rebase of the original patch. Thanks to Noah Misch, Peter Eisentraut and Alexander Korotkov for the pokes. Discussion: https://postgr.es/m/92fe572d-638e-4162-aef6-1c42a2936f25@eisentraut.org Discussion: https://postgr.es/m/20240810175055.cd.nmisch@google.com Backpatch-through: 17
*	Search for SLRU page only in its own bank	Alvaro Herrera	2024-08-18
\| \| \| \| \| \| \| \| \| \| \|	One of the two slot scans in SlruSelectLRUPage was not walking only the slots in the specific bank where the buffer could be; change it to do that. Oversight in 53c2a97a9266. Author: Sergey Sargsyan <sergey.sargsyan.2001@gmail.com> Discussion: https://postgr.es/m/18582-5f301dd30ba91a38@postgresql.org
*	Fix comments on wal_level=minimal, CREATE TABLESPACE and CREATE DATABASE.	Noah Misch	2024-08-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 97ddda8a82ac470ae581d0eb485b6577707678bc removed the rmtree() behavior from XLOG_TBLSPC_CREATE, obsoleting that part of the comment. The comment's point about XLOG_DBASE_CREATE was wrong when commit fa0f466d5329e10b16f3b38c8eaf5306f7e234e8 introduced the point. (It would have been accurate if that commit had predated commit fbcbc5d06f53aea412130deb52e216aa3883fb8d introducing the second checkpoint of CREATE DATABASE.) Nothing can skip log_smgrcreate() on the basis of wal_level=minimal, so don't comment on that. Commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 expanded WAL skipping from five specific operations to relfilenodes generally, hence the CreateDatabaseUsingFileCopy() comment change. Discussion: https://postgr.es/m/20231008022204.cc@rfd.leadboat.com
*	Use pgBufferUsage for buffer usage tracking in analyze.	Masahiko Sawada	2024-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, (auto)analyze used global variables VacuumPageHit, VacuumPageMiss, and VacuumPageDirty to track buffer usage. However, pgBufferUsage provides a more generic way to track buffer usage with support functions. This change replaces those global variables with pgBufferUsage in analyze. Since analyze was the sole user of those variables, it removes their declarations. Vacuum previously used those variables but replaced them with pgBufferUsage as part of a bug fix, commit 5cd72cc0c. Additionally, it adjusts the buffer usage message in both vacuum and analyze for better consistency. Author: Anthonin Bonnefoy Reviewed-by: Masahiko Sawada, Michael Paquier Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
*	Give nbtree move right function internal linkage.	Peter Geoghegan	2024-08-12
\| \| \| \| \| \| \| \|	Declare _bt_moveright() static. This is a minor modularity win; the routine was already private to nbtsearch.c for all practical purposes. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2WgWVzCNEXQB_op5MMZMDgJ3fg3AhVm6bq2iZPpJNXGhWw@mail.gmail.com
*	Fix a series of typos and outdated references	David Rowley	2024-08-12
\| \| \| \| \|	Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
*	Fix bad indentation introduced in commit f011e82c2c	Heikki Linnakangas	2024-08-12
\|
*	Avoid unneeded nbtree backwards scan buffer locks.	Peter Geoghegan	2024-08-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Teach nbtree backwards scans to avoid relocking a just-read leaf page to read its current left sibling link when it isn't truly necessary. This happened inside _bt_readnextpage whenever _bt_readpage had already determined that there'll be no further matches to the left (or at least none for the current primitive index scan, for a scan with array keys). A new precheck inside _bt_readnextpage is all that we need to avoid these useless lock acquisitions. Arguably, using a precheck like this was a missed opportunity for commit 2ed5b87f96, which taught nbtree to drop leaf page pins early to avoid blocking cleanup by VACUUM. Forwards scans already managed to avoid relocking the page like this. The optimization added by this commit is particularly helpful with backwards scans that use array keys where the scan must perform multiple primitive index scans. Such backwards scans will now avoid a useless leaf page re-lock at the end of each primitive index scan. Note that this commit does not attempt to avoid needlessly re-locking a leaf page that was just read when the scan must follow the leaf page's left link. That more ambitious optimization could work by stashing the left link when the page is first read by a backwards scan, allowing the subsequent _bt_readnextpage call to optimistically skip re-reading the original page just to get a new copy of its left link. For now we only address cases where we don't care about our original page's left link. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=xgs7PojG=EUvhgadwENzu_mY_riNh-w9wFPsaS717ew@mail.gmail.com
*	Initialize HASHCTL differently, to suppress Coverity warning	Heikki Linnakangas	2024-08-11
\| \| \| \| \| \| \| \|	Coverity complained that the hash_create() call might access hash_table_ctl->hctl. That's a false alarm, hash_create() only accesses that field when passed the HASH_SHARED_MEM flag. Try to silence it by using a plain local variable instead of a const. That's how the HASHCTL is initialized in all the other hash_create() calls.
*	Mark misc static global variables as const	Heikki Linnakangas	2024-08-06
\| \| \| \| \|	Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
*	Turn a few 'validnsps' static variables into locals	Heikki Linnakangas	2024-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \|	There was no need for these to be static buffers, local variables work just as well. I think they were marked as 'static' to imply that they are read-only, but 'const' is more appropriate for that, so change them to const. To make it possible to mark the variables as 'const', also add 'const' decorations to the transformRelOptions() signature. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
*	Allow parallel workers to cope with a newly-created session user ID.	Tom Lane	2024-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Parallel workers failed after a sequence like BEGIN; CREATE USER foo; SET SESSION AUTHORIZATION foo; because check_session_authorization could not see the uncommitted pg_authid row for "foo". This is because we ran RestoreGUCState() in a separate transaction using an ordinary just-created snapshot. The same disease afflicts any other GUC that requires catalog lookups and isn't forgiving about the lookups failing. To fix, postpone RestoreGUCState() into the worker's main transaction after we've set up a snapshot duplicating the leader's. This affects check_transaction_isolation and check_transaction_deferrable, which think they should only run during transaction start. Make them act like check_transaction_read_only, which already knows it should silently accept the value when InitializingParallelWorker. This un-reverts commit f5f30c22e. The original plan was to back-patch that, but the fact that 0ae5b763e proved to be a pre-requisite shows that the subtle API change for GUC hooks might actually break some of them. The problem we're trying to fix seems not worth taking such a risk for in stable branches. Per bug #18545 from Andrey Rachitskiy. Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
*	Clean up handling of client_encoding GUC in parallel workers.	Tom Lane	2024-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous coding here threw an error from assign_client_encoding if it was invoked in a parallel worker. That's a very fundamental violation of the GUC hook API: assign hooks must not throw errors. The place to complain is in the check hook, so move the test to there, and use the regular check-hook API (ie return false) to report it. The reason this coding is a problem is that it breaks GUC rollback, which may occur after we leave InitializingParallelWorker state. That case seems not actually reachable before now, but commit f5f30c22e made it reachable, so we need to fix this before that can be un-reverted. In passing, improve the commentary in ParallelWorkerMain, and add a check for failure of SetClientEncoding. That's another case that can't happen now but might become possible after foreseeable code rearrangements (notably, if the shortcut of skipping PrepareClientEncoding stops being OK). Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org