postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Fix typo in comment of gistdoinsert().	Masahiko Sawada	2024-11-04
\| \| \| \| \| \|	Author: Tender Wang Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAHewXN%3D3sH2sNw4nC3QGCEVw1Lftmw9m5y1Xje0bXK6ApDrsPQ%40mail.gmail.com
*	Fix obsolete _bt_first comments.	Peter Geoghegan	2024-11-04
\| \| \| \| \| \| \|	_bt_first doesn't necessarily hold onto a buffer pin on success exit. Fix header comments that claimed that we'll always hold onto a pin. Oversight in commit 2ed5b87f96.
*	nbtree: Remove useless 'strat' local variable.	Peter Geoghegan	2024-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove a local variable that was used to avoid overwriting strat_total with the = operator strategy when a >= operator strategy key was already included in the initial positioning/insertion scan keys by _bt_first (for backwards scans it would have to be a <= key that was included). _bt_first's strat_total local variable now simply tracks the operator strategy of the final scan key that was included in the scan's insertion scan key (barring the case where the !used_all_subkeys row compare path adjusts strat_total in its own way). _bt_first already treated >= keys (or <= keys) as = keys for initial positioning purposes. There is no good reason to remember that that was what happened; no later _bt_first step cares about the distinction. Note, in particular, that the insertion scan key's 'nextkey' and 'backward' fields will be initialized the same way regardless. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
*	Split ProcSleep function into JoinWaitQueue and ProcSleep	Heikki Linnakangas	2024-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Split ProcSleep into two functions: JoinWaitQueue and ProcSleep. JoinWaitQueue is called while holding the partition lock, and inserts the current process to the wait queue, while ProcSleep() does the actual sleeping. ProcSleep() is now called without holding the partition lock, and it no longer re-acquires the partition lock before returning. That makes the wakeup a little cheaper. Once upon a time, re-acquiring the partition lock was needed to prevent a signal handler from longjmping out at a bad time, but these days our signal handlers just set flags, and longjmping can only happen at points where we explicitly run CHECK_FOR_INTERRUPTS(). If JoinWaitQueue detects an "early deadlock" before even joining the wait queue, it returns without changing the shared lock entry, leaving the cleanup of the shared lock entry to the caller. This makes the handling of an early deadlock the same as the dontWait=true case. One small user-visible side-effect of this refactoring is that we now only set the 'ps' title to say "waiting" when we actually enter the sleep, not when the lock is skipped because dontWait=true, or when a deadlock is detected early before entering the sleep. This eliminates the 'lockAwaited' global variable in proc.c, which was largely redundant with 'awaitedLock' in lock.c Note: Updating the local lock table is now the caller's responsibility. JoinWaitQueue and ProcSleep are now only responsible for modifying the shared state. Seems a little nicer that way. Based on Thomas Munro's earlier patch and observation that ProcSleep doesn't really need to re-acquire the partition lock. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
*	Move TRACE calls into WaitOnLock()	Heikki Linnakangas	2024-11-04
\| \| \| \| \| \| \| \|	LockAcquire is a long and complex function. Pushing more stuff to its subroutines makes it a little more manageable. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
*	Set MyProc->heldLocks in ProcSleep	Heikki Linnakangas	2024-11-04
\| \| \| \| \| \| \| \| \| \| \| \|	Previously, ProcSleep()'s caller was responsible for setting MyProc->heldLocks, and we had comments to remind about that. But it seems simpler to make ProcSleep() itself responsible for it. ProcSleep() already set the other info about the lock its waiting for (waitLock, waitProcLock and waitLockMode), so it is natural for it to set heldLocks too. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
*	Clarify nbtree parallel scan _bt_endpoint contract.	Peter Geoghegan	2024-11-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	_bt_endpoint is a helper function for _bt_first that's called whenever no useful insertion scan key can be used, and we need to lock and read either the leftmost or rightmost leaf page in the index. Simplify and document its preconditions, relieving its _bt_first caller from having to end the parallel scan when it returns false. Also stop unnecessarily invalidating the current scan position in nearby code in both _bt_first and _bt_endpoint. This seems to have been copy-pasted from _bt_readnextpage, where invalidating the scan's current position really is necessary. Follow-up to the refactoring work in commit 1bd4bc85.
*	Fix comment in LockReleaseAll() on when locallock->nLock can be zero	Heikki Linnakangas	2024-11-04
\| \| \| \| \| \| \| \|	We reach this case also e.g. when a deadlock is detected, not only when we run out of memory. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
*	Suppress new "may be used uninitialized" warning.	Noah Misch	2024-11-02
\| \| \| \| \| \|	Buildfarm member mamba fails to deduce that the function never uses this variable without initializing it. Back-patch to v12, like commit b412f402d1e020c5dac94f3bf4a005db69519b99.
*	Fix inplace update buffer self-deadlock.	Noah Misch	2024-11-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A CacheInvalidateHeapTuple* callee might call CatalogCacheInitializeCache(), which needs a relcache entry. Acquiring a valid relcache entry might scan pg_class. Hence, to prevent undetected LWLock self-deadlock, CacheInvalidateHeapTuple* callers must not hold BUFFER_LOCK_EXCLUSIVE on buffers of pg_class. Move the CacheInvalidateHeapTupleInplace() before the BUFFER_LOCK_EXCLUSIVE. No back-patch, since I've reverted commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 from non-master branches. Reported by Alexander Lakhin. Reviewed by Alexander Lakhin. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
*	Move I/O before the index_update_stats() buffer lock region.	Noah Misch	2024-11-02
\| \| \| \| \| \| \| \| \| \| \|	Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 enlarged the work done here under the pg_class heap buffer lock. Two preexisting actions are best done before holding that lock. Both RelationGetNumberOfBlocks() and visibilitymap_count() do I/O, and the latter might exclusive-lock a visibility map buffer. Moving these reduces contention and risk of undetected LWLock deadlock. Back-patch to v12, like that commit. Discussion: https://postgr.es/m/20241031200139.b4@rfd.leadboat.com
*	Clarify nbtree array preprocessing comment.	Peter Geoghegan	2024-11-01
\| \| \| \|	Oversight in commit 5bf748b8.
*	Rename two functions that wake up other processes	Heikki Linnakangas	2024-11-01
\| \| \| \| \| \| \| \| \| \| \|	Instead of talking about setting latches, which is a pretty low-level mechanism, emphasize that they wake up other processes. This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
*	Use ProcNumbers instead of direct Latch pointers to address other procs	Heikki Linnakangas	2024-11-01
\| \| \| \| \| \| \| \|	This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
*	Remove use of pg_memory_is_all_zeros() in bufpage.c	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \|	After a closer lookup, this makes the all-zero check of the page more expensive, so let's remove the new function call in bufpage.c. The maths of the check were also incorrect, checking that the page was full of zeros only for the first 1kB. This brings back the code to the state it was at 49d6c7d8daba. Per discussion with David Rowley and Bertrand Drouvot. Discussion: https://postgr.es/m/CAApHDvrXzPAr3FxoBuB7b3D-okNoNA2jxLun1rW8Yw5wkbqusw@mail.gmail.com
*	Add pg_memory_is_all_zeros() in memutils.h	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new function tests if a memory region starting at a given location for a defined length is made only of zeroes. This unifies in a single path the all-zero checks that were happening in a couple of places of the backend code: - For pgstats entries of relation, checkpointer and bgwriter, where some "all_zeroes" variables were previously used with memcpy(). - For all-zero buffer pages in PageIsVerifiedExtended(). This new function uses the same forward scan as the check for all-zero buffer pages, applying it to the three pgstats paths mentioned above. Author: Bertrand Drouvot Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith Discussion: https://postgr.es/m/ZupUDDyf1hHI4ibn@ip-10-97-1-34.eu-west-3.compute.internal
*	Add SQL function array_reverse()	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function takes in input an array, and reverses the position of all its elements. This operation only affects the first dimension of the array, like array_shuffle(). The implementation structure is inspired by array_shuffle(), with a subroutine called array_reverse_n() that may come in handy in the future, should more functions able to reverse portions of arrays be introduced. Bump catalog version. Author: Aleksander Alekseev Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
*	Make all ereport() calls within gram.y provide error locations.	Tom Lane	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch responds to a comment that I (tgl) made in the discussion leading up to 774171c4f, that really all errors occurring during raw parsing should provide error cursors. Syntax errors reported by Bison will have one, and most of the handwritten ereport's in gram.y already provide one, but there were a few stragglers. (It is not claimed that this handles every failure reachable during raw parsing --- out-of-memory is an obvious exception. But this makes a good start on cases that are likely to occur.) While we're at it, clean up the reported positions for errors associated with LIMIT/OFFSET clauses. Previously we were relying on applying exprLocation() to the contained expressions, but that leads to slightly odd cursor placement, e.g. regression=# (select * from foo limit 10) limit 10; ERROR: multiple LIMIT clauses not allowed LINE 1: (select * from foo limit 10) limit 10; ^ We can afford to keep a little more state in the transient SelectLimit structs in order to make that better. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
*	Add a parse location field to struct FunctionParameter.	Tom Lane	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows an error cursor to be supplied for a bunch of bad-function-definition errors that previously lacked one, or that cheated a bit by pointing at the contained type name when the error isn't really about that. Bump catversion from an abundance of caution --- I don't think this node type can actually appear in stored views/rules, but better safe than sorry. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
*	Fix refreshing physical relfilenumber on shared index	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Buildfarm member 'prion', which is configured with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, failed with errors like this: ERROR: could not read blocks 0..0 in file "global/2672": read only 0 of 8192 bytes while running a parallel test group that includes VACUUM FULL on some catalog tables among other things. I was not able to reproduce that just by running the tests with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, even though 'prion' hit it on first run after commit 2b9b8ebbf8, so there might be something else that makes it more susceptible to the race. However, I was able to reproduce it by adding another test to the same test group that runs "vacuum full pg_database" repeatedly. The problem is that RelationReloadIndexInfo() no longer calls RelationInitPhysicalAddr() on a nailed, shared index, when an invalidation happens early during backend startup, before the critical relcaches have been built. Before commit 2b9b8ebbf8, that was done by RelationReloadNailed(), but it went missing from that path. Add it back as an explicit step. Broken by commit 2b9b8ebbf8, which refactored these functions. Discussion: https://www.postgresql.org/message-id/db876575-8f5b-4193-a538-df7e1f92d47a%40iki.fi
*	Remove duplicate words in comments	Daniel Gustafsson	2024-10-31
\| \| \| \| \| \| \| \|	A few comments contained duplicate "the" in sentences, fix by removing one occurrence. Author: Vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
*	Split RelationClearRelation into three different functions	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old RelationClearRelation function did different things depending on the arguments and circumstances. It could: a) remove the relation completely from relcache (rebuild == false), b) mark the entry as invalid (rebuild == true, but not in xact), or c) rebuild the entry (rebuild == true). Different callers used it for different purposes, and often assumed a particular behavior, which was confusing. Split it into three different functions, one for each of the above actions (one of them, RelationInvalidateRelation, was already added in commit e6cd857726). Move the responsibility of choosing the action and calling the right function to the callers. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
*	Simplify call to rebuild relcache entry for indexes	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RelationClearRelation(rebuild == true) calls RelationReloadIndexInfo() for indexes. We can rely on that in RelationIdGetRelation(), instead of calling RelationReloadIndexInfo() directly. That simplifies the code a little. In the passing, add a comment in RelationBuildLocalRelation() explaining why it doesn't call RelationInitIndexAccessInfo(). It's because at index creation, it's called before the pg_index row has been created. That's also the reason that RelationClearRelation() still needs a special case to go through the full-blown rebuild if the index support information in the relcache entry hasn't been populated yet. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
*	Remove unused field from SubPlanState struct	David Rowley	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bf6c614a2 did some conversion work to use ExprState instead of manually calling equality functions to check if one set of values is not distinct from another set. That patch removed many of the fields that became redundant as a result of that change, but it forgot to remove SubPlanState.tab_eq_funcs. Fix that. In passing, fix the header comment for TupleHashEntryData to correctly spell the field name it's talking about. Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CA+FpmFeycdombFzrjZw7Rmc29CVm4OOzCWwu=dVBQ6q=PX8SvQ@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvrWR2jYVhec=COyF2g2BE_ns91NDsCHAMFiXbyhEujKdQ@mail.gmail.com
*	nbtree: assert no scheduled primscan between pages.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \|	Follow-up to bugfix commit 763d65ae. Technically this new assertion is redundant with the assertion recently added to _bt_readpage by that same commit, but it seems like a good idea to have both. The new assertion makes it clear that we expect to call _bt_readnextpage when there's another primitive index scan scheduled, though only when needed as the final step of ending the current primitive scan.
*	Clarify nbtree array exhaustion comments.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Strictly speaking, we only need to make sure to leave the scan's array keys in their final positions (final for the current scan direction) to handle SAOP array exhaustion because btgettuple might only return a subset of the items for the final page (final for the current scan direction), before the scan changes direction. While it's typical for so->currPos to be invalidated shortly after the scan's arrays are first exhausted, and while so->currPos invalidation does obviate the need to leave the scan's arrays in any particular state, we can't rely on any of that actually happening when handling array exhaustion. Adjust comments to make all of that a lot clearer. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution.
*	Fix bug in nbtree array primitive scan scheduling.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A bug in nbtree's handling of primitive index scan scheduling could lead to wrong answers when a scrollable cursor was used with an index scan that had a SAOP index qual. Wrong answers were only possible when the scan direction changed after a primitive scan was scheduled, but before _bt_next was asked to fetch the next tuple in line (i.e. for things to break, _bt_next had to be denied the opportunity to step off the page in the same direction as the one used when the primscan was scheduled). Furthermore, the issue only occurred when the page in question happened to be the first page to be visited by the entire top-level scan; the issue hinged upon the cursor backing up to the absolute beginning of the key space that it returns tuples from (fetching in the opposite scan direction across a "primitive scan boundary" always worked correctly). To fix, make _bt_next unset the "needs primitive index scan" flag when it detects that the current scan direction is not the one that was used by _bt_readpage back when the primitive scan in question was scheduled. This fixes the cases that are known to be faulty, and also seems like a good idea on general robustness grounds. Affected scrollable cursor cases now avoid a spurious primitive index scan when they fetch backwards to the absolute start of the key space to be visited by their cursor. Fetching backwards now only returns those tuples at the start of the scan, as expected. It'll also be okay to once again fetch forwards from the start at that point, since the scan will be left in a state that's exactly consistent with the state it was in before any tuples were ever fetched, as expected. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wznv49bFsE2jkt4GuZ0tU2C91dEST=50egzjY2FeOcHL4Q@mail.gmail.com Backpatch: 17-, where commit 5bf748b8 first appears.
*	Fix some more bugs in foreign keys connecting partitioned tables	Álvaro Herrera	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* In DetachPartitionFinalize() we were applying a tuple conversion map to tuples that didn't need one, which can lead to erratic behavior if a partitioned table has a partition with a different column order, as reported by Alexander Lakhin. This was introduced by 53af9491a043. Don't do that. Also, modify a recently added test case to exercise this. * The same function as well as CloneFkReferenced() were acquiring AccessShareLock on a partition, only to have CreateTrigger() later acquire ShareRowExclusiveLock on it. This can lead to deadlock by lock escalation, unnecessarily. Avoid that by acquiring the stronger lock to begin with. This probably dates back to branch 12, but I have never seen a report of this being a problem in the field. * Innocuous but wasteful: also introduced by 53af9491a043, we were reading a pg_constraint tuple from syscache that we don't need, as reported by Tender Wang. Don't. Backpatch to 15. Discussion: https://postgr.es/m/461e9c26-2076-8224-e119-84998b6a784e@gmail.com
*	Replicate generated columns when specified in the column list.	Amit Kapila	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit allows logical replication to publish and replicate generated columns when explicitly listed in the column list. We also ensured that the generated columns were copied during the initial tablesync when they were published. We will allow to replicate generated columns even when they are not specified in the column list (via a new publication option) in a separate commit. The motivation of this work is to allow replication for cases where the client doesn't have generated columns. For example, the case where one is trying to replicate data from Postgres to the non-Postgres database. Author: Shubham Khanna, Vignesh C, Hou Zhijie Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
*	Add missing CommandCounterIncrement() in stats import functions.	Jeff Davis	2024-10-29
\| \| \| \| \|	Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/98b2fcf0-f701-369e-d63d-6be9739ce17c@gmail.com
*	Unpin buffer before inplace update waits for an XID to end.	Noah Misch	2024-10-29
\| \| \| \| \| \| \| \| \| \| \| \|	Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. By keeping the pin during that wait, a sequence of autovacuum workers and an uncommitted GRANT starved one foreground LockBufferForCleanup() for six minutes, on buildfarm member sarus. Prevent, at the cost of a bit of complexity. Back-patch to v12, like the earlier commit. That commit and heap_inplace_lock() have not yet appeared in any release. Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
*	Reduce variable scope and possibly useless palloc	David Rowley	2024-10-30
\| \| \| \| \| \| \| \|	Move the CreateStmt down to the branch that it's used in, thus preventing the makeNode() call in cases where the CreateStmt isn't used. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAq=06YPWPhS+yyTbCwv5JLKRz8rm3dWx6JR5Uj_d_fQDA@mail.gmail.com
*	Fix dependency of partitioned table and table AM with CREATE TABLE .. USING	Michael Paquier	2024-10-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A pg_depend entry between a partitioned table and its table access method was missing when using CREATE TABLE .. USING with an unpinned access method. DROP ACCESS METHOD could be used, while it should be blocked if CASCADE is not specified, even if there was a partitioned table that depends on the table access method. pg_class.relam would then hold an orphaned OID value still pointing to the AM dropped. The problem is fixed by adding a dependency between the partitioned table and its table access method if set when the relation is created. A test checking the contents of pg_depend in this case is added. Issue introduced in 374c7a229042, that has added support for CREATE TABLE .. USING for partitioned tables. Reviewed-by: Alexander Lakhin Discussion: https://postgr.es/m/18674-1ef01eceec278fab@postgresql.org Backpatch-through: 17
*	Ensure we have a snapshot when updating pg_index in index_drop().	Nathan Bossart	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I assumed that all index_drop() callers set an active snapshot beforehand, but that is evidently not true. One counterexample is autovacuum, which doesn't set an active snapshot when cleaning up orphan temp indexes. To fix, unconditionally push an active snapshot before updating pg_index in index_drop(). Oversight in commit b52adbad46. Reported-by: Masahiko Sawada Reviewed-by: Stepan Neretin, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBgF9etQrXbN9or_YHsmBRJHHNUEkhHp9rGK9CyQv5aTQ%40mail.gmail.com
*	Strip Windows newlines from extension script files manually.	Tom Lane	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revert commit 924e03917 in favor of adding code to convert \r\n to \n explicitly, on Windows only. The idea of letting text mode do the work fails for a couple of reasons: * Per Microsoft documentation, text mode also causes control-Z to be interpreted as end-of-file. While it may be unlikely that extension scripts contain control-Z, we've historically allowed it, and breaking the case doesn't seem wise. * Apparently, on some Windows configurations, "r" mode is interpreted as binary not text mode. We could force it with "rt" but that would be inconsistent with our code elsewhere, and it would still require Windows-specific coding. Thanks to Alexander Lakhin for investigation. Discussion: https://postgr.es/m/79284195-4993-7b00-f6df-8db28ca60fa3@gmail.com
*	Fix WAL_DEBUG build	Peter Eisentraut	2024-10-28
\| \| \| \| \| \|	broken by commit e18512c000e Reported-by: Peter Geoghegan <pg@bowt.ie>
*	nbtree: Minor sibling link traversal tweaks.	Peter Geoghegan	2024-10-28
\| \| \| \| \| \| \|	Tweak some code comments for clarity, and relocate some local variable declarations to the scope where they're actually used. Follow-up to recent commit 1bd4bc85.
*	Change the default value of the streaming option to 'parallel'.	Amit Kapila	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the default value of streaming option for a subscription was 'off'. The parallel option indicates that the changes in large transactions (greater than logical_decoding_work_mem) are to be applied directly via one of the parallel apply workers, if available. The parallel mode was introduced in 16, but we refrain from enabling it by default to avoid seeing any unpleasant behavior in the existing applications. However we haven't found any such report yet, so this is a good time to enable it by default. Reported-by: Vignesh C Author: Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CALDaNm1=MedhW23NuoePJTmonwsMSp80ddsw+sEJs0GUMC_kqQ@mail.gmail.com
*	Set query ID for inner queries of CREATE TABLE AS and DECLARE	Michael Paquier	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some utility statements contain queries that can be planned and executed: CREATE TABLE AS and DECLARE CURSOR. This commit adds query ID computation for the inner queries executed by these two utility commands, with and without EXPLAIN. This change leads to four new callers of JumbleQuery() and post_parse_analyze_hook() so as extensions can decide what to do with this new data. Previously, extensions relying on the query ID, like pg_stat_statements, were not able to track these nested queries as the query_id was 0. For pg_stat_statements, this commit leads to additions under !toplevel when pg_stat_statements.track is set to "all", as shown in its regression tests. The output of EXPLAIN for these two utilities gains a "Query Identifier" if compute_query_id is enabled. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
*	Fix obsolete nbtree split buffer comment.	Peter Geoghegan	2024-10-27
\| \| \| \|	Oversight in commit d088ba5a.
*	Remove unused #include's from backend .c files	Peter Eisentraut	2024-10-27
\| \| \| \| \| \| \| \|	as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
*	Refactor the code to create a pg_locale_t into new function.	Jeff Davis	2024-10-25
\| \| \| \| \|	Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/59da7ee4-5e1a-4727-b464-a603c6ed84cd@proxel.se
*	Read extension script files in text not binary mode.	Tom Lane	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change affects only Windows, where it should cause DOS-style newlines (\r\n) to be converted to plain \n during script loading. This eliminates one potential discrepancy in the behavior of extension script files between Windows and non-Windows. While there's a small chance that this might cause undesirable behavior changes for some extensions, it can also be argued that this may remove behavioral surprises for others. An example is that in the buildfarm, we are getting different results for the tests added by commit 774171c4f depending on whether our git tree has been checked out with Unix or DOS newlines. The choice to use binary mode goes all the way back to our invention of extensions in commit d9572c4e3. However, I suspect it was not thought through carefully but was just a side-effect of the ready availability of an almost-suitable function read_binary_file(). On balance, changing to text mode seems like a better answer than other ways in which we might fix the inconsistent test results. Discussion: https://postgr.es/m/2480333.1729784872@sss.pgh.pa.us
*	Make table_scan_bitmap_next_block() async-friendly	Melanie Plageman	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all responsibility for indicating a block is exhuasted into table_scan_bitmap_next_tuple() and advance the main iterator in heap-specific code. This flow control makes more sense and is a step toward using the read stream API for bitmap heap scans. Previously, table_scan_bitmap_next_block() returned false to indicate table_scan_bitmap_next_tuple() should not be called for the tuples on the page. This happened both when 1) there were no visible tuples on the page and 2) when the block returned by the iterator was past the end of the table. BitmapHeapNext() (generic bitmap table scan code) handled the case when the bitmap was exhausted. It makes more sense for table_scan_bitmap_next_tuple() to return false when there are no visible tuples on the page and table_scan_bitmap_next_block() to return false when the bitmap is exhausted or there are no more blocks in the table. As part of this new design, TBMIterateResults are no longer used as a flow control mechanism in BitmapHeapNext(), so we removed table_scan_bitmap_next_tuple's TBMIterateResult parameter. Note that the prefetch iterator is still saved in the BitmapHeapScanState node and advanced in generic bitmap table scan code. This is because 1) it was not necessary to change the prefetch iterator location to change the flow control in BitmapHeapNext() 2) modifying prefetch iterator management requires several more steps better split over multiple commits and 3) the prefetch iterator will be removed once the read stream API is used. Author: Melanie Plageman Reviewed-by: Tomas Vondra, Andres Freund, Heikki Linnakangas, Mark Dilger Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
*	Move EXPLAIN counter increment to heapam_scan_bitmap_next_block	Melanie Plageman	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Increment the lossy and exact page counters for EXPLAIN of bitmap heap scans in heapam_scan_bitmap_next_block(). Note that other table AMs will need to do this as well Pushing the counters into heapam_scan_bitmap_next_block() is required to be able to use the read stream API for bitmap heap scans. The bitmap iterator must be advanced from inside the read stream callback, so TBMIterateResults cannot be used as a flow control mechanism in BitmapHeapNext(). Author: Melanie Plageman Reviewed-by: Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
*	WAL-log inplace update before revealing it to other sessions.	Noah Misch	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \|	A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v12 (all supported versions). In v14 and earlier, this also back-patches the assertion removal from commit 7fcf2faf9c7dd473208fd6d5565f88d7f733782b. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com
*	For inplace update, send nontransactional invalidations.	Noah Misch	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v12 (all supported versions). The back branch versions don't change WAL, because those branches just added end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Reviewed by Nitin Motiani and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
*	Refactor code converting a publication name List to a StringInfo	Michael Paquier	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing get_publications_str() is renamed to GetPublicationsStr() and is moved to pg_subscription.c, so as it is possible to reuse it at two locations of the tablesync code where the same logic was duplicated. fetch_remote_table_info() was doing two List->StringInfo conversions when dealing with a server of version 15 or newer. The conversion happens only once now. This refactoring leads to less code overall. Author: Peter Smith Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/CAHut+PtJMk4bKXqtpvqVy9ckknCgK9P6=FeG8zHF=6+Em_Snpw@mail.gmail.com
*	Remove the RTE_GROUP RTE if we drop the groupClause	Richard Guo	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For an EXISTS subquery, the only thing that matters is whether it returns zero or more than zero rows. Therefore, we remove certain SQL features that won't affect that, among them the GROUP BY clauses. After we drop the groupClause, we'd better remove the RTE_GROUP RTE and clear the hasGroupRTE flag, as they depend on the groupClause. Failing to do so could result in a bogus RTE_GROUP entry in the parent query, leading to an assertion failure on the hasGroupRTE flag. Reported-by: David Rowley Author: Richard Guo Discussion: https://postgr.es/m/CAApHDvp2_yht8uPLyWO-kVGWZhYvx5zjGfSrg4fBQ9fsC13V0g@mail.gmail.com
*	Add functions pg_restore_relation_stats(), pg_restore_attribute_stats().	Jeff Davis	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to the pg_set_*_stats() functions, except with a variadic signature that's designed to be more future-proof. Additionally, most problems are reported as WARNINGs rather than ERRORs, allowing most stats to be restored even if some cannot. These functions are intended to be called from pg_dump to avoid the need to run ANALYZE after an upgrade. Author: Corey Huinker Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com