aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* jit: Use opaque pointers in all supported LLVM versions.Peter Eisentraut2024-10-01
| | | | | | | | | | | | | | | | | | LLVM's opaque pointer change began in LLVM 14, but remained optional until LLVM 16. When commit 37d5babb added opaque pointer support, we didn't turn it on for LLVM 14 and 15 yet because we didn't want to risk weird bitcode incompatibility problems in released branches of PostgreSQL. (That might have been overly cautious, I don't know.) Now that PostgreSQL 18 has dropped support for LLVM versions < 14, and since it hasn't been released yet and no extensions or bitcode have been built against it in the wild yet, we can be more aggressive. We can rip out the support code and build system clutter that made opaque pointer use optional. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussions: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
* jit: Require at least LLVM 14, if enabled.Peter Eisentraut2024-10-01
| | | | | | | | | | Remove support for LLVM versions 10-13. The default on all non-EOL'd OSes represented in our build farm will be at least LLVM 14 when PostgreSQL 18 ships. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CA%2BhUKGLhNs5geZaVNj2EJ79Dx9W8fyWUU3HxcpZy55sMGcY%3DiA%40mail.gmail.com
* Fix race condition in COMMIT PREPARED causing orphaned 2PC filesMichael Paquier2024-10-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | COMMIT PREPARED removes on-disk 2PC files near its end, but the state checked if a file is on-disk or not gets read from shared memory while not holding the two-phase state lock. Because of that, there was a small window where a second backend doing a PREPARE TRANSACTION could reuse the GlobalTransaction put back into the 2PC free list by the COMMIT PREPARED, overwriting the "ondisk" flag read afterwards by the COMMIT PREPARED to decide if its on-disk two-phase state file should be removed, preventing the file deletion. This commit fixes this issue so as the "ondisk" flag in the GlobalTransaction is read while holding the two-phase state lock, not from shared memory after its entry has been added to the free list. Orphaned two-phase state files flushed to disk after a checkpoint are discarded at the beginning of recovery. However, a truncation of pg_xact/ would make the startup process issue a FATAL when it cannot read the SLRU page holding the state of the transaction whose 2PC file was orphaned, which is a necessary step to decide if the 2PC file should be removed or not. Removing manually the file would be necessary in this case. Issue introduced by effe7d9552dd, so backpatch all the way down. Mea culpa. Author: wuchengwen Discussion: https://postgr.es/m/tencent_A7F059B5136A359625C7B2E4A386B3C3F007@qq.com Backpatch-through: 12
* Expand assertion check for query ID reporting in executorMichael Paquier2024-10-01
| | | | | | | | | | | | | | | | | | As formulated, the assertion added in the executor by 24f520594809 to check that a query ID is set had two problems: - track_activities may be disabled while compute_query_id is enabled, causing the query ID to not be reported to pg_stat_activity. - debug_query_string may not be set in some context. The only path where this would matter is visibly autovacuum, should parallel workers be enabled there at some point. This is not the case currently. There was no test showing the interactions between the query ID and track_activities, so let's add one based on a scan of pg_stat_activity. This assertion is still an experimentation at this stage, but let's see if this shows more paths where query IDs are not properly set while they should. Discussion: https://postgr.es/m/Zvn5616oYXmpXyHI@paquier.xyz
* Add missing command for pg_maintain in commentDaniel Gustafsson2024-10-01
| | | | | | | | | The comment in pg_class_aclmask_ext() which lists the allowed commands for the pg_maintain role lacked LOCK TABLE. Reported-by: Yusuke Sugie <btsugieyuusuke@oss.nttdata.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/034d3c60f5daba1919cd90f236b2e22d@oss.nttdata.com
* Do not treat \. as an EOF marker in CSV mode for COPY IN.Tom Lane2024-09-30
| | | | | | | | | | | | | | | | | | | | | | | | Since backslash is (typically) not special in CSV data, we should not be treating \. as special either. The server historically did this to keep CSV and TEXT modes more alike and to support V2 protocol; but V2 protocol is long dead, and the inconsistency with CSV standards is annoying. Remove that behavior in CopyReadLineText, and make some minor consequent code simplifications. On the client side, we need to fix psql so that it does not check for \. except when reading data from STDIN (that is, the script source). We must do that regardless of TEXT/CSV mode or there is no way to end the COPY short of script EOF. Also, be careful not to send the \. to the server in that case. This is a small compatibility break in that other applications beside psql may need similar adjustment. Also, using an older version of psql with a v18 server may result in misbehavior during CSV-mode COPY IN. Daniel Vérité, reviewed by vignesh C, Robert Haas, and myself Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
* Remove incorrect entries in pg_walsummary's getopt_long call.Tom Lane2024-09-30
| | | | | | | | | | | For some reason this listed "-f" and "-w" as valid switches, though the code doesn't implement any such thing nor do the docs mention them. The effect of this was that if you tried to use one of these switches, you'd get an unhelpful error message. Yusuke Sugie Discussion: https://postgr.es/m/68e72a2a70f4d84c1c7847b13bcdaef8@oss.nttdata.com
* Don't disallow DROP of constraints ONLY on partitioned tablesAlvaro Herrera2024-09-30
| | | | | | | | | | | | | | | | | This restriction seems to have come about due to some fuzzy thinking: in commit 9139aa19423b we were adding a restriction against ADD constraint ONLY on partitioned tables (which is sensible) and apparently we thought the DROP case had to be symmetrical. However, it isn't, and the comments about it are mistaken about the effect it would have. Remove this limitation. There have been no reports of users bothered by this limitation, so I'm not backpatching it just yet. We can revisit this decision later, as needed. Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/202409261752.nbvlawkxsttf@alvherre.pgsql Discussion: https://postgr.es/m/7682253a-6f79-6a92-00aa-267c4c412870@lab.ntt.co.jp (about commit 9139aa19423b, previously not registered)
* Bump catalog version for change in VariableSetStmtMichael Paquier2024-09-30
| | | | | | | | Oversight in dc68515968e8, as this breaks SQL functions with a SET command. Reported-by: Tom Lane Discussion: https://postgr.es/m/1364409.1727673407@sss.pgh.pa.us
* Show values of SET statements as constants in pg_stat_statementsMichael Paquier2024-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a continuation of work like 11c34b342bd7, done to reduce the bloat of pg_stat_statements by applying more normalization to query entries. This commit is able to detect and normalize values in VariableSetStmt, resulting in: SET conf_param = $1 Compared to other parse nodes, VariableSetStmt is embedded in much more places in the parser, impacting many query patterns in pg_stat_statements. A custom jumble function is used, with an extra field in the node to decide if arguments should be included in the jumbling or not, a location field being not enough for this purpose. This approach allows for a finer tuning. Clauses relying on one or more keywords are not normalized, for example: * DEFAULT * FROM CURRENT * List of keywords. SET SESSION CHARACTERISTICS AS TRANSACTION, where it is critical to differentiate different sets of options, is a good example of why normalization should not happen. Some queries use VariableSetStmt for some subclauses with SET, that also have their values normalized: - ALTER DATABASE - ALTER ROLE - ALTER SYSTEM - CREATE/ALTER FUNCTION ba90eac7a995 has added test coverage for most of the existing SET patterns. The expected output of these tests shows the difference this commit creates. Normalization could be perhaps applied to more portions of the grammar but what is done here is conservative, and good enough as a starting point. Author: Greg Sabino Mullane, Michael Paquier Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
* Add num_done counter to the pg_stat_checkpointer view.Fujii Masao2024-09-30
| | | | | | | | | | | | | | | Checkpoints can be skipped when the server is idle. The existing num_timed and num_requested counters in pg_stat_checkpointer track both completed and skipped checkpoints, but there was no way to count only the completed ones. This commit introduces the num_done counter, which tracks only completed checkpoints, making it easier to see how many were actually performed. Bump catalog version. Author: Anton A. Melnikov Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
* reindexdb: Skip reindexing temporary tables and indexes.Fujii Masao2024-09-30
| | | | | | | | | | | | | | | | | | | | Reindexing temp tables or indexes of other sessions is not allowed. However, reindexdb in parallel mode previously listed them as the objects to process, leading to failures. This commit ensures reindexdb in parallel mode skips temporary tables and indexes by adding a condition based on the relpersistence column in pg_class to the object listing queries, preventing these issues. Note that this commit does not affect reindexdb when temporary tables or indexes are explicitly specified using the -t or -j options; reindexdb in that case still does not skip them and can cause an error. Back-patch to v13 where parallel mode was introduced in reindexdb. Author: Fujii Masao Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/5f37ee56-14fb-44fe-9150-9eb97e10538b@oss.nttdata.com
* Set query ID in parallel workers for vacuum, BRIN and btreeMichael Paquier2024-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | All these code paths use their own entry point when starting parallel workers, but failed to set a query ID, even if they set a text query. Hence, this data would be missed in pg_stat_activity for the worker processes. The main entry point for parallel query processing, ParallelQueryMain(), is already doing that by saving its query ID in a dummy PlannedStmt, but not the others. The code is changed so as the query ID of these queries is set in their shared state, and reported back once the parallel workers start. Some tests are added to show how the failures can happen for btree and BRIN with a parallel build enforced, which are able to trigger a failure in an assertion added by 24f520594809 in the recovery TAP test 027_stream_regress.pl where pg_stat_statements is always loaded. In this case, the executor path was taken because the index expression needs to be flattened when building its IndexInfo. Alexander Lakhin has noticed the problem in btree, and I have noticed that the issue was more spread. This is arguably a bug, but nobody has complained about that until now, so no backpatch is done out of caution. If folks would like to see a backpatch, well, let me know. Reported-by: Alexander Lakhin Reviewed-by: Sami Imseih Discussion: https://postgr.es/m/cf3547c1-498a-6a61-7b01-819f902a251f@gmail.com
* Remove NULL dereference from RenameRelationInternal().Noah Misch2024-09-29
| | | | | | Defect in last week's commit aac2c9b4fde889d13f859c233c2523345e72d32b, per Coverity. Reaching this would need catalog corruption. Back-patch to v12, like that commit.
* In passwordFromFile, don't leak the open file after stat failures.Tom Lane2024-09-29
| | | | Oversight in e882bcae0. Per Coverity.
* Avoid 037_invalid_database.pl hang under debug_discard_caches.Noah Misch2024-09-27
| | | | Back-patch to v12 (all supported versions).
* Recalculate where-needed data accurately after a join removal.Tom Lane2024-09-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, remove_rel_from_query() has done a pretty shoddy job of updating our where-needed bitmaps (per-Var attr_needed and per-PlaceHolderVar ph_needed relid sets). It removed direct mentions of the to-be-removed baserel and outer join, which is the minimum amount of effort needed to keep the data structures self-consistent. But it didn't account for the fact that the removed join ON clause probably mentioned Vars of other relations, and those Vars might now not be needed as high up in the join tree as before. It's easy to show cases where this results in failing to remove a lower outer join that could also have been removed. To fix, recalculate the where-needed bitmaps from scratch after each successful join removal. This sounds expensive, but it seems to add only negligible planner runtime. (We cheat a little bit by preserving "relation 0" entries in the bitmaps, allowing us to skip re-scanning the targetlist and HAVING qual.) The submitted test case drew attention because we had successfully optimized away the lower join prior to v16. I suspect that that's somewhat accidental and there are related cases that were never optimized before and now can be. I've not tried to come up with one, though. Perhaps we should back-patch this into v16 and v17 to repair the performance regression. However, since it took a year for anyone to notice the problem, it can't be affecting too many people. Let's let the patch bake awhile in HEAD, and see if we get more complaints. Per bug #18627 from Mikaël Gourlaouen. No back-patch for now. Discussion: https://postgr.es/m/18627-44f950eb6a8416c2@postgresql.org
* Reindent pg_verifybackup.c.Robert Haas2024-09-27
|
* pg_verifybackup: Verify tar-format backups.Robert Haas2024-09-27
| | | | | | | | This also works for compressed tar-format backups. However, -n must be used, because we use pg_waldump to verify WAL, and it doesn't yet know how to verify WAL that is stored inside of a tarfile. Amul Sul, reviewed by Sravan Kumar and by me, and revised by me.
* Fix typo in pg_walsummary/nls.mk.Fujii Masao2024-09-27
| | | | | Author: Koki Nakamura Discussion: https://postgr.es/m/485c613d1db8de2e8169d5afd43e7f9e@oss.nttdata.com
* Fix incorrect memory access in VACUUM FULL with invalid toast indexesMichael Paquier2024-09-27
| | | | | | | | | | | | | | | | | | | | | | | | An invalid toast index is skipped in reindex_relation(). These would be remnants of a failed REINDEX CONCURRENTLY and they should never been rebuilt as there can only be one valid toast index at a time. REINDEX_REL_SUPPRESS_INDEX_USE, used by CLUSTER and VACUUM FULL, needs to maintain a list of the indexes being processed. The list of indexes is retrieved from the relation cache, and includes invalid indexes. The code has missed that invalid toast indexes are ignored in reindex_relation() as this leads to a hard failure in reindex_index(), and they were left in the reindex pending list, making the list inconsistent when rechecked. The incorrect memory access was happening when scanning pg_class for the refresh of pg_database.datfrozenxid, when doing a scan of pg_class. This issue exists since REINDEX CONCURRENTLY exists, where invalid toast indexes can exist, so backpatch all the way down. Reported-by: Alexander Lakhin Author: Tender Wang Discussion: https://postgr.es/m/18630-9aed99c38830657d@postgresql.org Backpatch-through: 12
* Fix catalog data of new LO privilege functionsMichael Paquier2024-09-27
| | | | | | | | | | | | | This commit improves the catalog data in pg_proc for the three functions for has_largeobject_privilege(), introduced in 4eada203a5a8: - Fix their descriptions (typos and consistency). - Reallocate OIDs to be within the 8000-9999 range as required by a6417078c414. Bump catalog version. Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/ZvUYR0V0dzWaLnsV@paquier.xyz
* Ensure we have a snapshot when updating pg_index entries.Nathan Bossart2024-09-26
| | | | | | | | | | | | | | | | | | | Creating, reindexing, and dropping an index concurrently could entail accessing pg_index's TOAST table, which was recently added in commit b52c4fc3c0. These code paths start and commit their own transactions, but they do not always set an active snapshot. This rightfully leads to assertion failures and ERRORs when trying to access pg_index's TOAST table, such as the following: ERROR: cannot fetch toast data without an active snapshot To fix, push an active snapshot just before each section of code that might require accessing pg_index's TOAST table, and pop it shortly afterwards. Reported-by: Alexander Lakhin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/a97d7401-e7c9-f771-6a00-037379f0a8bb%40gmail.com
* Improve style of pg_upgrade task callback functions.Nathan Bossart2024-09-26
| | | | | | | | | | | | | | | | I wanted to avoid adjusting this code too much when converting these tasks to use the new parallelization framework (see commit 40e2e5e92b), which is why this is being done as a follow-up commit. These stylistic adjustments result in fewer lines of code and fewer levels of indentation in some places. While at it, add names to the UpgradeTaskSlotState enum and the UpgradeTaskSlot struct. I'm not aware of any established project policy in this area, but let's at least be consistent within the same file. Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/ZunW7XHLd2uTts4f%40nathan
* Modernize to_char's Roman-numeral code, fixing overflow problems.Tom Lane2024-09-26
| | | | | | | | | | | | | | | int_to_roman() only accepts plain "int" input, which is fine since we're going to produce '###############' for any value above 3999 anyway. However, the numeric and int8 variants of to_char() would throw an error if the given input exceeded the integer range, while the float-input variants invoked undefined-per-C-standard behavior. Fix things so that you uniformly get '###############' for out of range input. Also add test cases covering this code, plus the equally-untested EEEE, V, and PL format codes. Discussion: https://postgr.es/m/2956175.1725831136@sss.pgh.pa.us
* Update oid for pg_wal_replay_wait() procedureAlexander Korotkov2024-09-26
| | | | | | | Use an oid from 8000-9999 range, as required by 98eab30b93d5. Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZvUY6bfTwB0GsyzP%40paquier.xyz
* Remove extra whitespace in pg_upgrade status message.Nathan Bossart2024-09-25
| | | | | | | | | | | There's no need to add another level of indentation to this status message. pg_log() will put it in the right place. Oversight in commit 347758b120. Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/ZunW7XHLd2uTts4f%40nathan Backpatch-through: 17
* Turn 'if' condition around to avoid Svace complaintAlvaro Herrera2024-09-25
| | | | | | | | | | | | | | | | | The unwritten assumption of this code is that both events->head and events->tail are NULL together (an empty list) or they aren't. So the code was testing events->head for nullness and using that as a cue to deference events->tail, which annoys the Svace static code analyzer. We can silence it by testing events->tail member instead, and add an assertion about events->head to ensure it's all consistent. This code is very old and as far as we know, there's never been a bug report related to this, so there's no need to backpatch. This was found by the ALT Linux Team using Svace. Author: Alexander Kuznetsov <kuznetsovam@altlinux.org> Discussion: https://postgr.es/m/6d0323c3-3f5d-4137-af73-98a5ab90e77c@altlinux.org
* vacuumdb: Skip temporary tables in query to build list of relationsMichael Paquier2024-09-25
| | | | | | | | | | | | | | | | | Running vacuumdb with a non-superuser while another user has created a temporary table would lead to a mid-flight permission failure, interrupting the operation. vacuum_rel() skips temporary relations of other backends, and it makes no sense for vacuumdb to know about these relations, so let's switch it to ignore temporary relations entirely. Adding a qual in the query based on relpersistence simplifies the generation of its WHERE clause in vacuum_one_database(), per se the removal of "has_where". Author: VaibhaveS, Michael Paquier Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/CAM_eQjwfAR=y3G1fGyS1U9FTmc+FyJm9amNfY2QCZBnDDbNPZg@mail.gmail.com Backpatch-through: 12
* For inplace update durability, make heap_update() callers wait.Noah Misch2024-09-24
| | | | | | | | | | | | | | | | | | | The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
* Fix data loss at inplace update after heap_update().Noah Misch2024-09-24
| | | | | | | | | | | | | | | | | | | | | | As previously-added tests demonstrated, heap_inplace_update() could instead update an unrelated tuple of the same catalog. It could lose the update. Losing relhasindex=t was a source of index corruption. Inplace-updating commands like VACUUM will now wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a long-running GRANT already hurts VACUUM progress more just by keeping an XID running. The VACUUM will behave like a DELETE or UPDATE waiting for the uncommitted change. For implementation details, start at the systable_inplace_update_begin() header comment and README.tuplock. Back-patch to v12 (all supported versions). In back branches, retain a deprecated heap_inplace_update(), for extensions. Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier versions) Heikki Linnakangas, and (in earlier versions) Alexander Lakhin. Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
* Warn if LOCKTAG_TUPLE is held at commit, under debug_assertions.Noah Misch2024-09-24
| | | | | | | | | | | | The current use always releases this locktag. A planned use will continue that intent. It will involve more areas of code, making unlock omissions easier. Warn under debug_assertions, like we do for various resource leaks. Back-patch to v12 (all supported versions), the plan for the commit of the new use. Reviewed by Heikki Linnakangas. Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
* Allow length=-1 for NUL-terminated input to pg_strncoll(), etc.Jeff Davis2024-09-24
| | | | | | | | | Like ICU, allow a length of -1 to be specified for NUL-terminated arguments to pg_strncoll(), pg_strnxfrm(), and pg_strnxfrm_prefix(). Simplifies the code and comments. Discussion: https://postgr.es/m/2d758e07dff26bcc7cbe2aec57431329bfe3679a.camel@j-davis.com
* Fix psql describe commands' handling of ACL columns for old servers.Tom Lane2024-09-24
| | | | | | | | | | | Commit d1379ebf4 carelessly broke printACLColumn for pre-9.4 servers, by using the cardinality() function which we introduced in 9.4. We expect psql's describe-related commands to work back to 9.2, so this is bad. Use the longstanding array_length() function instead. Per report from Christoph Berg. Back-patch to v17. Discussion: https://postgr.es/m/ZvLXYglRS6hMMhtr@msg.df7cb.de
* Tighten up make_libc_collator() and make_icu_collator().Jeff Davis2024-09-24
| | | | | | | | | | | | | | | Ensure that error paths within these functions do not leak a collator, and return the result rather than using an out parameter. (Error paths in the caller may still result in a leaked collator, which will be addressed separately.) In make_libc_collator(), if the first newlocale() succeeds and the second one fails, close the first locale_t object. The function make_icu_collator() doesn't have any external callers, so change it to be static. Discussion: https://postgr.es/m/54d20e812bd6c3e44c10eddcd757ec494ebf1803.camel@j-davis.com
* Add further excludes to headerscheckPeter Eisentraut2024-09-24
| | | | | | | | | Some header files under contrib/isn/ are not meant to be included independently, and they fail -Wmissing-variable-declarations when doing so. Reported-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BYVt5MBD-w0HyHpsGb4U8RNge3DvAbDmOFy_epGhZ2Mg%40mail.gmail.com#aba3226c6dd493923bd6ce95d25a2d77
* Neaten up our choices of SQLSTATEs for XML-related errors.Tom Lane2024-09-24
| | | | | | | | | | | | | | | | | | | | | When our XML-handling modules were first written, the SQL standard lacked any error codes that were particularly intended for XML error conditions. Unsurprisingly, this led to some rather random choices of errcodes in those modules. Now the standard has a whole SQLSTATE class, "Class 10 - XQuery Error", with a reasonably large selection of relevant-looking errcodes. In this patch I've chosen one fairly generic code defined by the standard, 10608 = invalid_argument_for_xquery, and used it where it seemed appropriate. I've also made an effort to replace ERRCODE_INTERNAL_ERROR everywhere it was not clearly reporting a coding problem; in particular, many of the existing uses look like they can fairly be reported as ERRCODE_OUT_OF_MEMORY. It might be interesting to try to map libxml2's error codes into the standard's new collection, but I've not undertaken that here. Discussion: https://postgr.es/m/417250.1726341268@sss.pgh.pa.us
* Update obsolete nbtree array preprocessing comments.Peter Geoghegan2024-09-24
| | | | | | | | | | | | The array->scan_key references fixed up at the end of preprocessing start out as offsets into the arrayKeyData[] array (the array returned by _bt_preprocess_array_keys at the start of preprocessing that involves array scan keys). Offsets into the arrayKeyData[] array are no longer guaranteed to be valid offsets into our original scan->keyData[] input scan key array, but comments describing the array->scan_key references still talked about scan->keyData[]. Update those comments. Oversight in commit b5249741.
* Add ONLY support for VACUUM and ANALYZEDavid Rowley2024-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since autovacuum does not trigger an ANALYZE for partitioned tables, users must perform these manually. However, performing a manual ANALYZE on a partitioned table would always result in recursively analyzing each partition and that could be undesirable as autovacuum takes care of that. For partitioned tables that contain a large number of partitions, having to analyze each partition could take an unreasonably long time, especially so for tables with a large number of columns. Here we allow the ONLY keyword to prefix the name of the table to allow users to have ANALYZE skip processing partitions. This option can also be used with VACUUM, but there is no work to do if VACUUM ONLY is used on a partitioned table. This commit also changes the behavior of VACUUM and ANALYZE for inheritance parents. Previously inheritance child tables would not be processed when operating on the parent. Now, by default we *do* operate on the child tables. ONLY can be used to obtain the old behavior. The release notes should note this as an incompatibility. The default behavior has not changed for partitioned tables as these always recursively processed the partitions. Author: Michael Harris <harmic@gmail.com> Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com Discussion: https://postgr.es/m/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com>
* Remove ATT_TABLE for ALTER TABLE ... ATTACH/DETACHMichael Paquier2024-09-24
| | | | | | | | | | | | | | | Attempting these commands for a non-partitioned table would result in a failure when creating the relation in transformPartitionCmd(). This gives the possibility to throw an error earlier with a much better error message, thanks to d69a3f4d70b7. The extra test cases are from me. Note that FINALIZE uses a different subcommand and it had no coverage for its failure path with non-partitioned tables. Author: Álvaro Herrera, Michael Paquier Reviewed-by: Nathan Bossart Discussion: https://postgr.es/m/202409190803.tnis52adt2n5@alvherre.pgsql
* jsonapi: fix memory leakage during OOM error recovery.Tom Lane2024-09-23
| | | | | | | | | | Coverity pointed out that inc_lex_level() would leak memory (not to mention corrupt the pstack data structure) if some but not all of its three REALLOC's failed. To fix, store successfully-updated pointers back into the pstack struct immediately. Oversight in 0785d1b8b, so no need for back-patch.
* Fix asserts in fast-path locking codeTomas Vondra2024-09-23
| | | | | | | | | | | | | | | | | | | Commit c4d5cb71d229 introduced a couple asserts in the fast-path locking code, upsetting Coverity. The assert in InitProcGlobal() is clearly wrong, as it assigns instead of checking the value. This is harmless, but doesn't check anything. The asserts in FAST_PATH_ macros are written as if for signed values, but the macros are only called for unsigned ones. That makes the check for (val >= 0) useless. Checks written as ((uint32) x < max) work for both signed and unsigned values. Negative values should wrap to values greater than INT32_MAX. Per Coverity, report by Tom Lane. Reported-by: Tom Lane Discussion: https://postgr.es/m/2891628.1727019959@sss.pgh.pa.us
* Add memory/disk usage for more executor nodes.Tatsuo Ishii2024-09-23
| | | | | | | | | | | | | | | This commit is similar to 95d6e9af07, expanding the idea to CTE scan, table function scan and recursive union scan nodes so that the maximum tuplestore memory or disk usage is shown with EXPLAIN ANALYZE command. Also adjust show_storage_info() so that it accepts storage type and storage size arguments instead of Tuplestorestate. This allows the node types to share the formatting code using show_storage_info(). Due to this show_material_info() and show_windowagg_info() are also modified. Reviewed-by: David Rowley Discussion: https://postgr.es/m/20240918.211246.1127161704188186085.ishii%40postgresql.org
* Remove pg_authid's TOAST table.Nathan Bossart2024-09-21
| | | | | | | | | | | | | | | | | pg_authid's only varlena column is rolpassword, which unfortunately cannot be de-TOASTed during authentication because we haven't selected a database yet and cannot read pg_class. By removing pg_authid's TOAST table, attempts to set password hashes that require out-of-line storage will fail with a "row is too big" error instead. We may want to provide a more user-friendly error in the future, but for now let's just remove the useless TOAST table. Bumps catversion. Reported-by: Alexander Lakhin Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com
* Increase the number of fast-path lock slotsTomas Vondra2024-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the fixed-size array of fast-path locks with arrays, sized on startup based on max_locks_per_transaction. This allows using fast-path locking for workloads that need more locks. The fast-path locking introduced in 9.2 allowed each backend to acquire a small number (16) of weak relation locks cheaply. If a backend needs to hold more locks, it has to insert them into the shared lock table. This is considerably more expensive, and may be subject to contention (especially on many-core systems). The limit of 16 fast-path locks was always rather low, because we have to lock all relations - not just tables, but also indexes, views, etc. For planning we need to lock all relations that might be used in the plan, not just those that actually get used in the final plan. So even with rather simple queries and schemas, we often need significantly more than 16 locks. As partitioning gets used more widely, and the number of partitions increases, this limit is trivial to hit. Complex queries may easily use hundreds or even thousands of locks. For workloads doing a lot of I/O this is not noticeable, but for workloads accessing only data in RAM, the access to the shared lock table may be a serious issue. This commit removes the hard-coded limit of the number of fast-path locks. Instead, the size of the fast-path arrays is calculated at startup, and can be set much higher than the original 16-lock limit. The overall fast-path locking protocol remains unchanged. The variable-sized fast-path arrays can no longer be part of PGPROC, but are allocated as a separate chunk of shared memory and then references from the PGPROC entries. The fast-path slots are organized as a 16-way set associative cache. You can imagine it as a hash table of 16-slot "groups". Each relation is mapped to exactly one group using hash(relid), and the group is then processed using linear search, just like the original fast-path cache. With only 16 entries this is cheap, with good locality. Treating this as a simple hash table with open addressing would not be efficient, especially once the hash table gets almost full. The usual remedy is to grow the table, but we can't do that here easily. The access would also be more random, with worse locality. The fast-path arrays are sized using the max_locks_per_transaction GUC. We try to have enough capacity for the number of locks specified in the GUC, using the traditional 2^n formula, with an upper limit of 1024 lock groups (i.e. 16k locks). The default value of max_locks_per_transaction is 64, which means those instances will have 64 fast-path slots. The main purpose of the max_locks_per_transaction GUC is to size the shared lock table. It is often set to the "average" number of locks needed by backends, with some backends using significantly more locks. This should not be a major issue, however. Some backens may have to insert locks into the shared lock table, but there can't be too many of them, limiting the contention. The only solution is to increase the GUC, even if the shared lock table already has sufficient capacity. That is not free, especially in terms of memory usage (the shared lock table entries are fairly large). It should only happen on machines with plenty of memory, though. In the future we may consider a separate GUC for the number of fast-path slots, but let's try without one first. Reviewed-by: Robert Haas, Jakub Wartak Discussion: https://postgr.es/m/510b887e-c0ce-4a0c-a17a-2c6abb8d9a5c@enterprisedb.com
* Refactor handling of nbtree array redundancies.Peter Geoghegan2024-09-21
| | | | | | | | | | | | | | | | | | | | | Teach _bt_preprocess_array_keys to eliminate redundant array equality scan keys directly, rather than just marking them as redundant. Its _bt_preprocess_keys caller is no longer required to ignore input scan keys that were marked redundant in this way. Oversights like the one fixed by commit f22e17f7 are no longer possible. The new scheme also makes it easier for _bt_preprocess_keys to output a so.keyData[] scan key array with _more_ scan keys than it was passed in its scan.keyData[] input scan key array. An upcoming patch that adds skip scan optimizations to nbtree will take advantage of this. In passing, remove and rename certain _bt_preprocess_keys variables to make the difference between our input scan key array and our output scan key array clearer. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com
* Improve Asserts checking relation matching in parallel scans.Tom Lane2024-09-20
| | | | | | | | | | | | | | | | table_beginscan_parallel and index_beginscan_parallel contain Asserts checking that the relation a worker will use in a parallel scan is the same one the leader intended. However, they were checking for relation OID match, which was not strong enough to detect the mismatch problem fixed in 126ec0bc7. What would be strong enough is to compare relfilenodes instead. Arguably, that's a saner definition anyway, since a scan surely operates on a physical relation not a logical one. Hence, store and compare RelFileLocators not relation OIDs. Also ensure that index_beginscan_parallel checks the index identity not just the table identity. Discussion: https://postgr.es/m/2127254.1726789524@sss.pgh.pa.us
* Alphabetize #include directives in pg_checksums.c.Nathan Bossart2024-09-20
| | | | | Author: Michael Banck Discussion: https://postgr.es/m/66edaed0.050a0220.32a9ba.42c8%40mx.google.com
* Fix nbtree pgstats accounting with parallel scans.Peter Geoghegan2024-09-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made parallel index scans work with the new design for arrays via explicit scheduling of primitive index scans. Under this scheme a parallel index scan with array keys will perform the same number of index descents as an equivalent serial index scan (barring corner cases where an individual parallel worker discovers that it can advance the scan's array keys without anybody needing to perform another descent of the index to get to the relevant page on the leaf level). Despite all this, the pgstats accounting wasn't updated; it continued to increment the total number of index scans for the rel once per _bt_first call, no matter the details. As a result, the number of (primitive) index scans could be over-counted during parallel scans. To fix, delay incrementing the count of index scans until after we've established that another descent of the index (using either _bt_search or _bt_endpoint) is required. That way pg_stat_user_tables.idx_scan always advances in the same way, regardless of whether or not the scan makes use of parallelism. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wz=E7XrkvscBN0U6V81NK3Q-dQOmivvbEsjG-zwEfDdFpg@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com Backpatch: 17-, where nbtree SAOP execution was enhanced.
* Add parameter "connstr" to PostgreSQL::Test::Cluster::background_psqlMichael Paquier2024-09-20
| | | | | | | | Like for Cluster::psql, this can be handy to force the use of a connection string with some values overriden, like a "host". Author: Aidar Imamov Discussion: https://postgr.es/m/ecacb079efc533aed3c234cbcb5b07b6@postgrespro.ru