aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Remove an obsolete comment in gistinsert()Michael Paquier2024-11-07
| | | | | | | | | This is inconsistent since 1f7ef548ec2e where the definition of gistFormTuple() has changed. Author: Tender Wang Reviewed-by: Aleksander Alekseev Discussion: https://postgr.es/m/CAHewXNkjU95_HdioDVU=5yBq_Xt=GfBv=Od-0oKtiA006pWW7Q@mail.gmail.com
* Replicate generated columns when 'publish_generated_columns' is set.Amit Kapila2024-11-07
| | | | | | | | | | | | | | | | | | | | This patch builds on the work done in commit 745217a051 by enabling the replication of generated columns alongside regular column changes through a new publication parameter: publish_generated_columns. Example usage: CREATE PUBLICATION pub1 FOR TABLE tab_gencol WITH (publish_generated_columns = true); The column list takes precedence. If the generated columns are specified in the column list, they will be replicated even if 'publish_generated_columns' is set to false. Conversely, if generated columns are not included in the column list (assuming the user specifies a column list), they will not be replicated even if 'publish_generated_columns' is true. Author: Vignesh C, Shubham Khanna Reviewed-by: Peter Smith, Amit Kapila, Hayato Kuroda, Shlok Kyal, Ajin Cherian, Hou Zhijie, Masahiko Sawada Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
* Remove unused variableDaniel Gustafsson2024-11-06
| | | | | | | | | | | The low variable has not been used since it was added in d168b666823 and can be safely removed. The variable is present in the Sedgewick paper "Analysis of Shellsort and Related Algorithms" as a parameter to the shellsort function, but our implementation does not use it. Remove to improve readability of the code. Author: Koki Nakamura <btnakamurakoukil@oss.nttdata.com> Discussion: https://postgr.es/m/8aeb7b3eda53ca4c65fbacf8f43628fb@oss.nttdata.com
* doc: Remove event trigger firing matrixPeter Eisentraut2024-11-06
| | | | | | | | | | | | | | | | This is difficult to maintain accurately, and it was probably already somewhat incorrect, especially in the sql_drop and table_rewrite categories. The prior section already documented which DDL commands are *not* supported (which was also slightly outdated), so let's expand that a bit and just rely on that instead of listing out each command in full detail. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxE_UAuxcM08BW5oVsg34v0cFWoEt8yBa5xSAoKLmL6LTQ%40mail.gmail.com
* Monkey-patch LLVM code to fix ARM relocation bug.Thomas Munro2024-11-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Supply a new memory manager for RuntimeDyld, to avoid crashes in generated code caused by memory placement that can overflow a 32 bit data type. This is a drop-in replacement for the llvm::SectionMemoryManager class in the LLVM library, with Michael Smith's proposed fix from https://www.github.com/llvm/llvm-project/pull/71968. We hereby slurp it into our own source tree, after moving into a new namespace llvm::backport and making some minor adjustments so that it can be compiled with older LLVM versions as far back as 12. It's harder to make it work on even older LLVM versions, but it doesn't seem likely that people are really using them so that is not investigated for now. The problem could also be addressed by switching to JITLink instead of RuntimeDyld, and that is the LLVM project's recommended solution as the latter is about to be deprecated. We'll have to do that soon enough anyway, and then when the LLVM version support window advances far enough in a few years we'll be able to delete this code. Unfortunately that wouldn't be enough for PostgreSQL today: in most relevant versions of LLVM, JITLink is missing or incomplete. Several other projects have already back-ported this fix into their fork of LLVM, which is a vote of confidence despite the lack of commit into LLVM as of today. We don't have our own copy of LLVM so we can't do exactly what they've done; instead we have a copy of the whole patched class so we can pass an instance of it to RuntimeDyld. The LLVM project hasn't chosen to commit the fix yet, and even if it did, it wouldn't be back-ported into the releases of LLVM that most of our users care about, so there is not much point in waiting any longer for that. If they make further changes and commit it to LLVM 19 or 20, we'll still need this for older versions, but we may want to resynchronize our copy and update some comments. The changes that we've had to make to our copy can be seen by diffing our SectionMemoryManager.{h,cpp} files against the ones in the tree of the pull request. Per the LLVM project's license requirements, a copy is in SectionMemoryManager.LICENSE. This should fix the spate of crash reports we've been receiving lately from users on large memory ARM systems. Back-patch to all supported releases. Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> (license aspects) Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
* Fix hypothetical bug in ExprState building for hashingDavid Rowley2024-11-06
| | | | | | | | | | | | | | | | | | | adf97c156 gave ExprStates the ability to hash expressions and return a single hash value. That commit supports seeding the hash value with an initial value to have that blended into the final hash value. Here we fix a hypothetical bug where if there are zero expressions to hash, the initial value is stored in the wrong location. The existing code stored the initial value in an intermediate location expecting that when the expressions were hashed that those steps would store the final hash value in the ExprState.resvalue field. However, that wouldn't happen when there are zero expressions to hash. The correct thing to do instead is to have a special case for zero expressions and when we hit that case, store the initial value directly in the ExprState.resvalue. The reason that this is a hypothetical bug is that no code currently calls ExecBuildHash32Expr passing a non-zero initial value. Discussion: https://postgr.es/m/CAApHDvpMAL_zxbMRr1LOex3O7Y7R7ZN2i8iUFLQhqQiJMAg3qw@mail.gmail.com
* Clear padding of PgStat_HashKey when handling pgstats entriesMichael Paquier2024-11-05
| | | | | | | | | | | | | | | | | | PgStat_HashKey is currently initialized in a way that could result in random data if the structure has any padding bytes. The structure has no padding bytes currently, fortunately, but it could become a problem should the structure change at some point in the future. The code is changed to use some memset(0) so as any padding would be handled properly, as it would be surprising to see random failures in the pgstats entry lookups. PgStat_HashKey is a structure internal to pgstats, and an ABI change could be possible in the scope of a bug fix, so backpatch down to 15 where this has been introduced. Author: Bertrand Drouvot Reviewed-by: Jelte Fennema-Nio, Michael Paquier Discussion: https://postgr.es/m/Zyb7RW1y9dVfO0UH@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 15
* Revert pg_wal_replay_wait() stored procedureAlexander Korotkov2024-11-04
| | | | | | | | | | | | | | | | This commit reverts 3c5db1d6b0, and subsequent improvements and fixes including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a, 2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b, 73da6b8d1b, and e546989a26. The reason for reverting is a set of remaining issues. Most notably, the stored procedure appears to need more effort than the utility statement to turn the backend into a "snapshot-less" state. This makes an approach to use stored procedures questionable. Catversion is bumped. Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
* Fix typo in comment of gistdoinsert().Masahiko Sawada2024-11-04
| | | | | | Author: Tender Wang Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/CAHewXN%3D3sH2sNw4nC3QGCEVw1Lftmw9m5y1Xje0bXK6ApDrsPQ%40mail.gmail.com
* Fix obsolete _bt_first comments.Peter Geoghegan2024-11-04
| | | | | | | _bt_first doesn't necessarily hold onto a buffer pin on success exit. Fix header comments that claimed that we'll always hold onto a pin. Oversight in commit 2ed5b87f96.
* nbtree: Remove useless 'strat' local variable.Peter Geoghegan2024-11-04
| | | | | | | | | | | | | | | | | | | | | Remove a local variable that was used to avoid overwriting strat_total with the = operator strategy when a >= operator strategy key was already included in the initial positioning/insertion scan keys by _bt_first (for backwards scans it would have to be a <= key that was included). _bt_first's strat_total local variable now simply tracks the operator strategy of the final scan key that was included in the scan's insertion scan key (barring the case where the !used_all_subkeys row compare path adjusts strat_total in its own way). _bt_first already treated >= keys (or <= keys) as = keys for initial positioning purposes. There is no good reason to remember that that was what happened; no later _bt_first step cares about the distinction. Note, in particular, that the insertion scan key's 'nextkey' and 'backward' fields will be initialized the same way regardless. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
* Split ProcSleep function into JoinWaitQueue and ProcSleepHeikki Linnakangas2024-11-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split ProcSleep into two functions: JoinWaitQueue and ProcSleep. JoinWaitQueue is called while holding the partition lock, and inserts the current process to the wait queue, while ProcSleep() does the actual sleeping. ProcSleep() is now called without holding the partition lock, and it no longer re-acquires the partition lock before returning. That makes the wakeup a little cheaper. Once upon a time, re-acquiring the partition lock was needed to prevent a signal handler from longjmping out at a bad time, but these days our signal handlers just set flags, and longjmping can only happen at points where we explicitly run CHECK_FOR_INTERRUPTS(). If JoinWaitQueue detects an "early deadlock" before even joining the wait queue, it returns without changing the shared lock entry, leaving the cleanup of the shared lock entry to the caller. This makes the handling of an early deadlock the same as the dontWait=true case. One small user-visible side-effect of this refactoring is that we now only set the 'ps' title to say "waiting" when we actually enter the sleep, not when the lock is skipped because dontWait=true, or when a deadlock is detected early before entering the sleep. This eliminates the 'lockAwaited' global variable in proc.c, which was largely redundant with 'awaitedLock' in lock.c Note: Updating the local lock table is now the caller's responsibility. JoinWaitQueue and ProcSleep are now only responsible for modifying the shared state. Seems a little nicer that way. Based on Thomas Munro's earlier patch and observation that ProcSleep doesn't really need to re-acquire the partition lock. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
* Move TRACE calls into WaitOnLock()Heikki Linnakangas2024-11-04
| | | | | | | | LockAcquire is a long and complex function. Pushing more stuff to its subroutines makes it a little more manageable. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
* Set MyProc->heldLocks in ProcSleepHeikki Linnakangas2024-11-04
| | | | | | | | | | | | Previously, ProcSleep()'s caller was responsible for setting MyProc->heldLocks, and we had comments to remind about that. But it seems simpler to make ProcSleep() itself responsible for it. ProcSleep() already set the other info about the lock its waiting for (waitLock, waitProcLock and waitLockMode), so it is natural for it to set heldLocks too. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
* Clarify nbtree parallel scan _bt_endpoint contract.Peter Geoghegan2024-11-04
| | | | | | | | | | | | | | | _bt_endpoint is a helper function for _bt_first that's called whenever no useful insertion scan key can be used, and we need to lock and read either the leftmost or rightmost leaf page in the index. Simplify and document its preconditions, relieving its _bt_first caller from having to end the parallel scan when it returns false. Also stop unnecessarily invalidating the current scan position in nearby code in both _bt_first and _bt_endpoint. This seems to have been copy-pasted from _bt_readnextpage, where invalidating the scan's current position really is necessary. Follow-up to the refactoring work in commit 1bd4bc85.
* Fix comment in LockReleaseAll() on when locallock->nLock can be zeroHeikki Linnakangas2024-11-04
| | | | | | | | We reach this case also e.g. when a deadlock is detected, not only when we run out of memory. Reviewed-by: Maxim Orlov Discussion: https://www.postgresql.org/message-id/7c2090cd-a72a-4e34-afaa-6dd2ef31440e@iki.fi
* Suppress new "may be used uninitialized" warning.Noah Misch2024-11-02
| | | | | | Buildfarm member mamba fails to deduce that the function never uses this variable without initializing it. Back-patch to v12, like commit b412f402d1e020c5dac94f3bf4a005db69519b99.
* Fix inplace update buffer self-deadlock.Noah Misch2024-11-02
| | | | | | | | | | | | | | | A CacheInvalidateHeapTuple* callee might call CatalogCacheInitializeCache(), which needs a relcache entry. Acquiring a valid relcache entry might scan pg_class. Hence, to prevent undetected LWLock self-deadlock, CacheInvalidateHeapTuple* callers must not hold BUFFER_LOCK_EXCLUSIVE on buffers of pg_class. Move the CacheInvalidateHeapTupleInplace() before the BUFFER_LOCK_EXCLUSIVE. No back-patch, since I've reverted commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 from non-master branches. Reported by Alexander Lakhin. Reviewed by Alexander Lakhin. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
* Move I/O before the index_update_stats() buffer lock region.Noah Misch2024-11-02
| | | | | | | | | | | Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 enlarged the work done here under the pg_class heap buffer lock. Two preexisting actions are best done before holding that lock. Both RelationGetNumberOfBlocks() and visibilitymap_count() do I/O, and the latter might exclusive-lock a visibility map buffer. Moving these reduces contention and risk of undetected LWLock deadlock. Back-patch to v12, like that commit. Discussion: https://postgr.es/m/20241031200139.b4@rfd.leadboat.com
* Clarify nbtree array preprocessing comment.Peter Geoghegan2024-11-01
| | | | Oversight in commit 5bf748b8.
* Rename two functions that wake up other processesHeikki Linnakangas2024-11-01
| | | | | | | | | | | Instead of talking about setting latches, which is a pretty low-level mechanism, emphasize that they wake up other processes. This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
* Use ProcNumbers instead of direct Latch pointers to address other procsHeikki Linnakangas2024-11-01
| | | | | | | | This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
* Remove use of pg_memory_is_all_zeros() in bufpage.cMichael Paquier2024-11-01
| | | | | | | | | | | | | After a closer lookup, this makes the all-zero check of the page more expensive, so let's remove the new function call in bufpage.c. The maths of the check were also incorrect, checking that the page was full of zeros only for the first 1kB. This brings back the code to the state it was at 49d6c7d8daba. Per discussion with David Rowley and Bertrand Drouvot. Discussion: https://postgr.es/m/CAApHDvrXzPAr3FxoBuB7b3D-okNoNA2jxLun1rW8Yw5wkbqusw@mail.gmail.com
* Add pg_memory_is_all_zeros() in memutils.hMichael Paquier2024-11-01
| | | | | | | | | | | | | | | | | This new function tests if a memory region starting at a given location for a defined length is made only of zeroes. This unifies in a single path the all-zero checks that were happening in a couple of places of the backend code: - For pgstats entries of relation, checkpointer and bgwriter, where some "all_zeroes" variables were previously used with memcpy(). - For all-zero buffer pages in PageIsVerifiedExtended(). This new function uses the same forward scan as the check for all-zero buffer pages, applying it to the three pgstats paths mentioned above. Author: Bertrand Drouvot Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith Discussion: https://postgr.es/m/ZupUDDyf1hHI4ibn@ip-10-97-1-34.eu-west-3.compute.internal
* Add SQL function array_reverse()Michael Paquier2024-11-01
| | | | | | | | | | | | | | | | | This function takes in input an array, and reverses the position of all its elements. This operation only affects the first dimension of the array, like array_shuffle(). The implementation structure is inspired by array_shuffle(), with a subroutine called array_reverse_n() that may come in handy in the future, should more functions able to reverse portions of arrays be introduced. Bump catalog version. Author: Aleksander Alekseev Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
* Make all ereport() calls within gram.y provide error locations.Tom Lane2024-10-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch responds to a comment that I (tgl) made in the discussion leading up to 774171c4f, that really all errors occurring during raw parsing should provide error cursors. Syntax errors reported by Bison will have one, and most of the handwritten ereport's in gram.y already provide one, but there were a few stragglers. (It is not claimed that this handles every failure reachable during raw parsing --- out-of-memory is an obvious exception. But this makes a good start on cases that are likely to occur.) While we're at it, clean up the reported positions for errors associated with LIMIT/OFFSET clauses. Previously we were relying on applying exprLocation() to the contained expressions, but that leads to slightly odd cursor placement, e.g. regression=# (select * from foo limit 10) limit 10; ERROR: multiple LIMIT clauses not allowed LINE 1: (select * from foo limit 10) limit 10; ^ We can afford to keep a little more state in the transient SelectLimit structs in order to make that better. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
* Add a parse location field to struct FunctionParameter.Tom Lane2024-10-31
| | | | | | | | | | | | | | | | This allows an error cursor to be supplied for a bunch of bad-function-definition errors that previously lacked one, or that cheated a bit by pointing at the contained type name when the error isn't really about that. Bump catversion from an abundance of caution --- I don't think this node type can actually appear in stored views/rules, but better safe than sorry. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
* Fix refreshing physical relfilenumber on shared indexHeikki Linnakangas2024-10-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Buildfarm member 'prion', which is configured with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, failed with errors like this: ERROR: could not read blocks 0..0 in file "global/2672": read only 0 of 8192 bytes while running a parallel test group that includes VACUUM FULL on some catalog tables among other things. I was not able to reproduce that just by running the tests with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, even though 'prion' hit it on first run after commit 2b9b8ebbf8, so there might be something else that makes it more susceptible to the race. However, I was able to reproduce it by adding another test to the same test group that runs "vacuum full pg_database" repeatedly. The problem is that RelationReloadIndexInfo() no longer calls RelationInitPhysicalAddr() on a nailed, shared index, when an invalidation happens early during backend startup, before the critical relcaches have been built. Before commit 2b9b8ebbf8, that was done by RelationReloadNailed(), but it went missing from that path. Add it back as an explicit step. Broken by commit 2b9b8ebbf8, which refactored these functions. Discussion: https://www.postgresql.org/message-id/db876575-8f5b-4193-a538-df7e1f92d47a%40iki.fi
* Remove duplicate words in commentsDaniel Gustafsson2024-10-31
| | | | | | | | A few comments contained duplicate "the" in sentences, fix by removing one occurrence. Author: Vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
* Split RelationClearRelation into three different functionsHeikki Linnakangas2024-10-31
| | | | | | | | | | | | | | | | | | | The old RelationClearRelation function did different things depending on the arguments and circumstances. It could: a) remove the relation completely from relcache (rebuild == false), b) mark the entry as invalid (rebuild == true, but not in xact), or c) rebuild the entry (rebuild == true). Different callers used it for different purposes, and often assumed a particular behavior, which was confusing. Split it into three different functions, one for each of the above actions (one of them, RelationInvalidateRelation, was already added in commit e6cd857726). Move the responsibility of choosing the action and calling the right function to the callers. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
* Simplify call to rebuild relcache entry for indexesHeikki Linnakangas2024-10-31
| | | | | | | | | | | | | | | | | | RelationClearRelation(rebuild == true) calls RelationReloadIndexInfo() for indexes. We can rely on that in RelationIdGetRelation(), instead of calling RelationReloadIndexInfo() directly. That simplifies the code a little. In the passing, add a comment in RelationBuildLocalRelation() explaining why it doesn't call RelationInitIndexAccessInfo(). It's because at index creation, it's called before the pg_index row has been created. That's also the reason that RelationClearRelation() still needs a special case to go through the full-blown rebuild if the index support information in the relcache entry hasn't been populated yet. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
* Remove unused field from SubPlanState structDavid Rowley2024-10-31
| | | | | | | | | | | | | | | | bf6c614a2 did some conversion work to use ExprState instead of manually calling equality functions to check if one set of values is not distinct from another set. That patch removed many of the fields that became redundant as a result of that change, but it forgot to remove SubPlanState.tab_eq_funcs. Fix that. In passing, fix the header comment for TupleHashEntryData to correctly spell the field name it's talking about. Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CA+FpmFeycdombFzrjZw7Rmc29CVm4OOzCWwu=dVBQ6q=PX8SvQ@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvrWR2jYVhec=COyF2g2BE_ns91NDsCHAMFiXbyhEujKdQ@mail.gmail.com
* nbtree: assert no scheduled primscan between pages.Peter Geoghegan2024-10-30
| | | | | | | | | | Follow-up to bugfix commit 763d65ae. Technically this new assertion is redundant with the assertion recently added to _bt_readpage by that same commit, but it seems like a good idea to have both. The new assertion makes it clear that we expect to call _bt_readnextpage when there's another primitive index scan scheduled, though only when needed as the final step of ending the current primitive scan.
* Clarify nbtree array exhaustion comments.Peter Geoghegan2024-10-30
| | | | | | | | | | | | | | | | Strictly speaking, we only need to make sure to leave the scan's array keys in their final positions (final for the current scan direction) to handle SAOP array exhaustion because btgettuple might only return a subset of the items for the final page (final for the current scan direction), before the scan changes direction. While it's typical for so->currPos to be invalidated shortly after the scan's arrays are first exhausted, and while so->currPos invalidation does obviate the need to leave the scan's arrays in any particular state, we can't rely on any of that actually happening when handling array exhaustion. Adjust comments to make all of that a lot clearer. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution.
* Fix bug in nbtree array primitive scan scheduling.Peter Geoghegan2024-10-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A bug in nbtree's handling of primitive index scan scheduling could lead to wrong answers when a scrollable cursor was used with an index scan that had a SAOP index qual. Wrong answers were only possible when the scan direction changed after a primitive scan was scheduled, but before _bt_next was asked to fetch the next tuple in line (i.e. for things to break, _bt_next had to be denied the opportunity to step off the page in the same direction as the one used when the primscan was scheduled). Furthermore, the issue only occurred when the page in question happened to be the first page to be visited by the entire top-level scan; the issue hinged upon the cursor backing up to the absolute beginning of the key space that it returns tuples from (fetching in the opposite scan direction across a "primitive scan boundary" always worked correctly). To fix, make _bt_next unset the "needs primitive index scan" flag when it detects that the current scan direction is not the one that was used by _bt_readpage back when the primitive scan in question was scheduled. This fixes the cases that are known to be faulty, and also seems like a good idea on general robustness grounds. Affected scrollable cursor cases now avoid a spurious primitive index scan when they fetch backwards to the absolute start of the key space to be visited by their cursor. Fetching backwards now only returns those tuples at the start of the scan, as expected. It'll also be okay to once again fetch forwards from the start at that point, since the scan will be left in a state that's exactly consistent with the state it was in before any tuples were ever fetched, as expected. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wznv49bFsE2jkt4GuZ0tU2C91dEST=50egzjY2FeOcHL4Q@mail.gmail.com Backpatch: 17-, where commit 5bf748b8 first appears.
* Fix some more bugs in foreign keys connecting partitioned tablesÁlvaro Herrera2024-10-30
| | | | | | | | | | | | | | | | | | | | | | | | * In DetachPartitionFinalize() we were applying a tuple conversion map to tuples that didn't need one, which can lead to erratic behavior if a partitioned table has a partition with a different column order, as reported by Alexander Lakhin. This was introduced by 53af9491a043. Don't do that. Also, modify a recently added test case to exercise this. * The same function as well as CloneFkReferenced() were acquiring AccessShareLock on a partition, only to have CreateTrigger() later acquire ShareRowExclusiveLock on it. This can lead to deadlock by lock escalation, unnecessarily. Avoid that by acquiring the stronger lock to begin with. This probably dates back to branch 12, but I have never seen a report of this being a problem in the field. * Innocuous but wasteful: also introduced by 53af9491a043, we were reading a pg_constraint tuple from syscache that we don't need, as reported by Tender Wang. Don't. Backpatch to 15. Discussion: https://postgr.es/m/461e9c26-2076-8224-e119-84998b6a784e@gmail.com
* Replicate generated columns when specified in the column list.Amit Kapila2024-10-30
| | | | | | | | | | | | | | | | | | | This commit allows logical replication to publish and replicate generated columns when explicitly listed in the column list. We also ensured that the generated columns were copied during the initial tablesync when they were published. We will allow to replicate generated columns even when they are not specified in the column list (via a new publication option) in a separate commit. The motivation of this work is to allow replication for cases where the client doesn't have generated columns. For example, the case where one is trying to replicate data from Postgres to the non-Postgres database. Author: Shubham Khanna, Vignesh C, Hou Zhijie Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
* Add missing CommandCounterIncrement() in stats import functions.Jeff Davis2024-10-29
| | | | | Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/98b2fcf0-f701-369e-d63d-6be9739ce17c@gmail.com
* Unpin buffer before inplace update waits for an XID to end.Noah Misch2024-10-29
| | | | | | | | | | | | Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. By keeping the pin during that wait, a sequence of autovacuum workers and an uncommitted GRANT starved one foreground LockBufferForCleanup() for six minutes, on buildfarm member sarus. Prevent, at the cost of a bit of complexity. Back-patch to v12, like the earlier commit. That commit and heap_inplace_lock() have not yet appeared in any release. Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
* Reduce variable scope and possibly useless pallocDavid Rowley2024-10-30
| | | | | | | | Move the CreateStmt down to the branch that it's used in, thus preventing the makeNode() call in cases where the CreateStmt isn't used. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAq=06YPWPhS+yyTbCwv5JLKRz8rm3dWx6JR5Uj_d_fQDA@mail.gmail.com
* Fix dependency of partitioned table and table AM with CREATE TABLE .. USINGMichael Paquier2024-10-29
| | | | | | | | | | | | | | | | | | | | A pg_depend entry between a partitioned table and its table access method was missing when using CREATE TABLE .. USING with an unpinned access method. DROP ACCESS METHOD could be used, while it should be blocked if CASCADE is not specified, even if there was a partitioned table that depends on the table access method. pg_class.relam would then hold an orphaned OID value still pointing to the AM dropped. The problem is fixed by adding a dependency between the partitioned table and its table access method if set when the relation is created. A test checking the contents of pg_depend in this case is added. Issue introduced in 374c7a229042, that has added support for CREATE TABLE .. USING for partitioned tables. Reviewed-by: Alexander Lakhin Discussion: https://postgr.es/m/18674-1ef01eceec278fab@postgresql.org Backpatch-through: 17
* Ensure we have a snapshot when updating pg_index in index_drop().Nathan Bossart2024-10-28
| | | | | | | | | | | | | | I assumed that all index_drop() callers set an active snapshot beforehand, but that is evidently not true. One counterexample is autovacuum, which doesn't set an active snapshot when cleaning up orphan temp indexes. To fix, unconditionally push an active snapshot before updating pg_index in index_drop(). Oversight in commit b52adbad46. Reported-by: Masahiko Sawada Reviewed-by: Stepan Neretin, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBgF9etQrXbN9or_YHsmBRJHHNUEkhHp9rGK9CyQv5aTQ%40mail.gmail.com
* Strip Windows newlines from extension script files manually.Tom Lane2024-10-28
| | | | | | | | | | | | | | | | | | | | Revert commit 924e03917 in favor of adding code to convert \r\n to \n explicitly, on Windows only. The idea of letting text mode do the work fails for a couple of reasons: * Per Microsoft documentation, text mode also causes control-Z to be interpreted as end-of-file. While it may be unlikely that extension scripts contain control-Z, we've historically allowed it, and breaking the case doesn't seem wise. * Apparently, on some Windows configurations, "r" mode is interpreted as binary not text mode. We could force it with "rt" but that would be inconsistent with our code elsewhere, and it would still require Windows-specific coding. Thanks to Alexander Lakhin for investigation. Discussion: https://postgr.es/m/79284195-4993-7b00-f6df-8db28ca60fa3@gmail.com
* Fix WAL_DEBUG buildPeter Eisentraut2024-10-28
| | | | | | broken by commit e18512c000e Reported-by: Peter Geoghegan <pg@bowt.ie>
* nbtree: Minor sibling link traversal tweaks.Peter Geoghegan2024-10-28
| | | | | | | Tweak some code comments for clarity, and relocate some local variable declarations to the scope where they're actually used. Follow-up to recent commit 1bd4bc85.
* Change the default value of the streaming option to 'parallel'.Amit Kapila2024-10-28
| | | | | | | | | | | | | | | | Previously the default value of streaming option for a subscription was 'off'. The parallel option indicates that the changes in large transactions (greater than logical_decoding_work_mem) are to be applied directly via one of the parallel apply workers, if available. The parallel mode was introduced in 16, but we refrain from enabling it by default to avoid seeing any unpleasant behavior in the existing applications. However we haven't found any such report yet, so this is a good time to enable it by default. Reported-by: Vignesh C Author: Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CALDaNm1=MedhW23NuoePJTmonwsMSp80ddsw+sEJs0GUMC_kqQ@mail.gmail.com
* Set query ID for inner queries of CREATE TABLE AS and DECLAREMichael Paquier2024-10-28
| | | | | | | | | | | | | | | | | | | | | Some utility statements contain queries that can be planned and executed: CREATE TABLE AS and DECLARE CURSOR. This commit adds query ID computation for the inner queries executed by these two utility commands, with and without EXPLAIN. This change leads to four new callers of JumbleQuery() and post_parse_analyze_hook() so as extensions can decide what to do with this new data. Previously, extensions relying on the query ID, like pg_stat_statements, were not able to track these nested queries as the query_id was 0. For pg_stat_statements, this commit leads to additions under !toplevel when pg_stat_statements.track is set to "all", as shown in its regression tests. The output of EXPLAIN for these two utilities gains a "Query Identifier" if compute_query_id is enabled. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
* Fix obsolete nbtree split buffer comment.Peter Geoghegan2024-10-27
| | | | Oversight in commit d088ba5a.
* Remove unused #include's from backend .c filesPeter Eisentraut2024-10-27
| | | | | | | | as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
* Refactor the code to create a pg_locale_t into new function.Jeff Davis2024-10-25
| | | | | Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/59da7ee4-5e1a-4727-b464-a603c6ed84cd@proxel.se