path: root/src
Commit message (Author, Age)
* Update outdated comments in nodeAgg.c. (Jeff Davis, 2025-02-18)
    Author: Zhang Mingli
    Reviewed-by: Richard Guo
    Discussion: https://postgr.es/m/198a8d1e-0792-4e7f-828e-902aa342f36e@Spark
* Reduce scope of heap vacuum per_buffer_data (Melanie Plageman, 2025-02-18)
    Move lazy_scan_heap()'s per_buffer_data variable into a tighter scope.
    In lazy_scan_heap()'s phase I heap vacuuming, the read stream API returns a pointer to the next block number to vacuum. As long as read_stream_next_buffer() returns a valid buffer, per_buffer_data should always be valid. Move per_buffer_data into a tighter scope and make sure it is reset to NULL on each iteration so that we get a core dump instead of bogus data from a previous block if something goes wrong in the read stream API.
    Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
    Discussion: https://postgr.es/m/626104.1739729538%40sss.pgh.pa.us
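The per-iteration reset described in the commit above can be illustrated with a minimal sketch. `DemoStream`, `demo_stream_next()` and `demo_scan()` are hypothetical stand-ins invented for this example; they are not the PostgreSQL read stream API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a read stream; not the PostgreSQL API. */
typedef struct DemoStream
{
    int next;       /* index of the next block to hand out */
    int nblocks;    /* total number of blocks (at most 8 here) */
    int flags[8];   /* per-block data owned by the stream */
} DemoStream;

/* Returns 1 and points *per_buffer_data at the block's data, or 0 at end. */
static int
demo_stream_next(DemoStream *s, void **per_buffer_data)
{
    if (s->next >= s->nblocks)
        return 0;
    *per_buffer_data = &s->flags[s->next++];
    return 1;
}

/* Sums the per-block values; the pointer lives only inside the loop body. */
static int
demo_scan(DemoStream *s)
{
    int total = 0;

    for (;;)
    {
        void *per_buffer_data = NULL;   /* reset on every iteration */

        if (!demo_stream_next(s, &per_buffer_data))
            break;

        /* A stale pointer from the previous block can never be seen here. */
        assert(per_buffer_data != NULL);
        total += *(int *) per_buffer_data;
    }
    return total;
}
```

Because the pointer is declared inside the loop and starts as NULL, a stream that fails to fill it in trips the assertion (or crashes on dereference) instead of silently reusing the previous block's data.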
* Add PGErrorVerbosity to typedefs.list (Daniel Gustafsson, 2025-02-18)
    PGErrorVerbosity was missing which resulted in incorrect whitespace alignment going back all the way to e3860ffa4dd0. No backpatch for this though since we don't pgindent backbranches.
    Author: Jelte Fennema-Nio <postgres@jeltef.nl>
    Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
    Discussion: https://postgr.es/m/CAGECzQTVi8n-HW4Q27je-b9ckQk7zf6bS_it42gNvQu+DX0NCQ@mail.gmail.com
* Fix poorly written regression test (David Rowley, 2025-02-19)
    bd10ec529 added code to allow redundant functionally dependent GROUP BY columns to be removed using unique indexes and NOT NULL constraints as proofs of functional dependency. In that commit, I (David) added a test to ensure that when there are multiple indexes available to remove columns that we pick the index that allows us to remove the most columns. This test was faulty as it assumed the t3 table's primary key index was valid to use as functional dependency proof, but that's not the case since that's defined as deferrable.
    Here we adjust the tests added by that commit to use the t2 table instead. That's defined with a non-deferrable primary key.
    Author: songjinzhou <tsinghualucky912@foxmail.com>
    Author: David Rowley <dgrowleyml@gmail.com>
    Reviewed-by: Japin Li <japinli@hotmail.com>
    Discussion: https://postgr.es/m/tencent_CD414C79D39668455DF80D35143B87634C08@qq.com
* Raise a WARNING for max_slot_wal_keep_size in pg_createsubscriber. (Amit Kapila, 2025-02-18)
    During the pg_createsubscriber execution, it is possible that the required WAL is removed from the primary/publisher node due to 'max_slot_wal_keep_size'.
    This patch raises a WARNING during the '--dry-run' mode if the 'max_slot_wal_keep_size' is set to a non-default value on the primary/publisher node.
    Author: Shubham Khanna <khannashubham1197@gmail.com>
    Reviewed-by: Peter Smith <smithpb2250@gmail.com>
    Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
    Reviewed-by: Vignesh C <vignesh21@gmail.com>
    Discussion: https://postgr.es/m/CAHv8Rj+deqsQXOMa7Tck8CBQUbsua=+4AuMVQ2=MPM0f-ZHbjA@mail.gmail.com
* Fix typo in 2a8a0067. (Thomas Munro, 2025-02-18)
    Builds configured with Valgrind but without assertions would fail due to a typo in the recent change. This should be included when back-patching 2a8a0067 into v17.
* Fix translator notes in comments (Daniel Gustafsson, 2025-02-17)
    The translator comments detailing what a %s inclusion refers to were accidentally including too many address types. In practice this is not a problem since it's not a translated string, but to minimize any risk of confusion let's fix them anyway. Even though this exists in backbranches there is little use for backpatch as the translation work has already happened there, so let's avoid the churn.
    Author: Japin Li <japinli@hotmail.com>
    Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
    Discussion: https://postgr.es/m/ME0P300MB04458DE627480614ABE639D2B6FB2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
* Add tab completion for ALTER USER/ROLE RESET (Tomas Vondra, 2025-02-17)
    Currently tab completion for ALTER USER RESET shows a list of all configuration parameters that may be set on a role, irrespectively of which parameters are actually set. This patch improves tab completion to offer only parameters that are set.
    Author: Robins Tharakan
    Reviewed-By: Tomas Vondra
    Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
* Add tab completion for ALTER DATABASE RESET (Tomas Vondra, 2025-02-17)
    Currently tab completion for ALTER DATABASE RESET shows a list of all configuration parameters that may be set on a database, irrespectively of which parameters are actually set. This patch improves tab completion to offer only parameters that are set.
    Author: Robins Tharakan
    Reviewed-By: Tomas Vondra
    Discussion: https://postgr.es/m/CAEP4nAzqiT6VbVC5r3nq5byLTnPzjniVGzEMpYcnAHQyNzEuaw%40mail.gmail.com
* Implement Self-Join Elimination (Alexander Korotkov, 2025-02-17)
    The Self-Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if it is proven that the join can be replaced with a scan without impacting the query result. Self-join and inner relation get replaced with the outer in query, equivalence classes, and planner info structures. Also, the inner restrictlist moves to the outer one with the removal of duplicated clauses. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and, in turn, selectivity estimations, and potentially improves total planner prediction for the query.
    This feature is dedicated to avoiding redundancy, which can appear after pull-up transformations or the creation of an EquivalenceClass-derived clause like those below.
        SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3);
        SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x);
        SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x;
    In the future, we could also reduce redundancy caused by subquery pull-up after unnecessary outer join removal in cases like the one below.
        SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x);
    Also, it can drastically help to join partitioned tables, removing entries even before their expansion.
    The SJE proof is based on innerrel_is_unique() machinery. We can remove a self-join when for each outer row:
    1. At most, one inner row matches the join clause;
    2. Each matched inner row must be (physically) the same as the outer one;
    3. Inner and outer rows have the same row mark.
    In this patch, we use the following approach to identify a self-join:
    1. Collect all merge-joinable join quals which look like a.x = b.x;
    2. Add to the list above the baserestrictinfo of the inner table;
    3. Check innerrel_is_unique() for the qual list. If it returns false, skip this pair of joining tables;
    4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the possibility of self-join elimination, the inner and outer clauses must match exactly.
    The relation replacement procedure is not trivial and is partly combined with the one used to remove useless left joins. Tests covering this feature were added to join.sql. Some of the existing regression tests changed due to self-join removal logic.
    Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru
    Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
    Author: Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>
    Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
    Co-authored-by: Alena Rybakina <lena.ribackina@yandex.ru>
    Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
    Reviewed-by: Robert Haas <robertmhaas@gmail.com>
    Reviewed-by: Andres Freund <andres@anarazel.de>
    Reviewed-by: Simon Riggs <simon@2ndquadrant.com>
    Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
    Reviewed-by: David Rowley <david.rowley@2ndquadrant.com>
    Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
    Reviewed-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
    Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
    Reviewed-by: Hywel Carver <hywel@skillerwhale.com>
    Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
    Reviewed-by: Ronan Dunklau <ronan.dunklau@aiven.io>
    Reviewed-by: vignesh C <vignesh21@gmail.com>
    Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
    Reviewed-by: Greg Stark <stark@mit.edu>
    Reviewed-by: Jaime Casanova <jcasanov@systemguards.com.ec>
    Reviewed-by: Michał Kłeczek <michal@kleczek.org>
    Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
    Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
* Revert: Get rid of WALBufMappingLock (Alexander Korotkov, 2025-02-17)
    This commit reverts 6a2275b895. A buildfarm failure on batta points to a concurrency issue, which requires further investigation.
* Fix an oversight in cbc127917 to handle MERGE correctly (Amit Langote, 2025-02-17)
    ExecInitModifyTable() forgot to trim MERGE-related lists to exclude entries for result relations pruned during initial pruning, so fix that. While at it, make the function's use of the pruned resultRelations list, rather than ModifyTable.resultRelations, more consistent.
    Reported-by: Alexander Lakhin <exclusion@gmail.com> (via sqlsmith)
    Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
    Discussion: https://postgr.es/m/e72c94d9-e5f9-4753-9bc1-69d72bd54b8a@gmail.com
* Add information about WAL buffers full to VACUUM/ANALYZE (VERBOSE) (Michael Paquier, 2025-02-17)
    This commit adds the information about the number of times WAL buffers have been full to the logs generated by VACUUM/ANALYZE (VERBOSE) and in the logs generated by autovacuum, complementing the existing information stored by WalUsage.
    This is the last part of the backend code where the value of wal_buffers_full can be reported, similarly to all the other fields of WalUsage. 320545bfcfee and ce5bcc4a9f26 have done the same for EXPLAIN and pgss.
    Author: Bertrand Drouvot
    Reviewed-by: Ilia Evdokimov
    Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
* Add information about WAL buffers being full to EXPLAIN (WAL) (Michael Paquier, 2025-02-17)
    This is similar to ce5bcc4a9f26, relying on the addition of wal_buffers_full to WalUsage. This time, the information is added to the output generated by EXPLAIN (WAL).
    Author: Bertrand Drouvot
    Reviewed-by: Ilia Evdokimov
    Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
* Move wal_buffers_full from PgStat_PendingWalStats to WalUsage (Michael Paquier, 2025-02-17)
    wal_buffers_full has been introduced in pg_stat_wal in 8d9a935965f, as some information providing metrics for the tuning of the GUC wal_buffers. WalUsage has been introduced before that in df3b181499.
    Moving this field is proving to be beneficial for several reasons:
    - This information can now be made available in more layers, providing more granularity than just pg_stat_wal, on a per-query basis: EXPLAIN, pgss and VACUUM/ANALYZE logs.
    - A patch is under discussion to provide statistics for WAL at backend level, and this move simplifies a bit the handling of pending statistics. The remaining data in PgStat_PendingWalStats now relates to write/sync counters and times, with equivalents present in pg_stat_io, that backend statistics are able to already track. So this should cut all the dependencies between PgStat_PendingWalStats and WAL stats at backend level.
    As of this change, wal_buffers_full only shows in pg_stat_wal.
    Author: Bertrand Drouvot
    Reviewed-by: Ilia Evdokimov
    Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
* Get rid of WALBufMappingLock (Alexander Korotkov, 2025-02-17)
    Allow multiple backends to initialize WAL buffers concurrently. This way `MemSet((char *) NewPage, 0, XLOG_BLCKSZ);` can run in parallel without taking a single LWLock in exclusive mode.
    The new algorithm works as follows:
    * reserve a page for initialization using XLogCtl->InitializeReserved,
    * ensure the page is written out,
    * once the page is initialized, try to advance XLogCtl->InitializedUpTo and signal to waiters using XLogCtl->InitializedUpToCondVar condition variable,
    * repeat previous steps until we reserve initialization up to the target WAL position,
    * wait until concurrent initialization finishes using a XLogCtl->InitializedUpToCondVar.
    Now, multiple backends can, in parallel, concurrently reserve pages, initialize them, and advance XLogCtl->InitializedUpTo to point to the latest initialized page.
    Author: Yura Sokolov <y.sokolov@postgrespro.ru>
    Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
    Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
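The reserve, initialize, and advance steps listed in the commit above can be sketched with C11 atomics. This is only an illustration under simplifying assumptions: `DemoCtl` and `demo_init_one_page()` are hypothetical names, the "ensure the page is written out" step is omitted, and waiting on a condition variable is left out entirely. It is not the code from the commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NPAGES 4
#define DEMO_PAGESZ 64

/* Hypothetical shared state; loosely mirrors the counters named above. */
typedef struct DemoCtl
{
    atomic_uint reserved;            /* next page number to hand out */
    atomic_uint initialized_upto;    /* all pages below this are ready */
    atomic_bool ready[DEMO_NPAGES];  /* set once a page has been zeroed */
    char        pages[DEMO_NPAGES][DEMO_PAGESZ];
} DemoCtl;

/* Reserve one page, zero it, then try to advance the shared watermark. */
static bool
demo_init_one_page(DemoCtl *ctl)
{
    unsigned page = atomic_fetch_add(&ctl->reserved, 1);

    if (page >= DEMO_NPAGES)
        return false;               /* nothing left to reserve */

    memset(ctl->pages[page], 0, DEMO_PAGESZ);   /* no lock is held here */
    atomic_store(&ctl->ready[page], true);

    /* Advance the watermark over every contiguous ready page. */
    for (;;)
    {
        unsigned upto = atomic_load(&ctl->initialized_upto);

        if (upto >= DEMO_NPAGES || !atomic_load(&ctl->ready[upto]))
            break;
        /* If another thread advanced first, the loop simply re-reads. */
        atomic_compare_exchange_strong(&ctl->initialized_upto, &upto, upto + 1);
    }
    return true;
}
```

The watermark only moves across pages that are contiguous and ready, so a backend that finishes a later page early leaves the advance to whichever backend completes the gap.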
* Adjust tuples estimate for appendrels (Richard Guo, 2025-02-17)
    In set_append_rel_size(), we currently set rel->tuples to rel->rows for an appendrel. Generally, rel->tuples is the raw number of tuples in the relation and rel->rows is the estimated number of tuples after the relation's restriction clauses have been applied. Although an appendrel itself doesn't directly enforce any quals today, its child relations may. Therefore, setting rel->tuples equal to rel->rows for an appendrel isn't always appropriate.
    Doing so can lead to issues in cost estimates in some cases. For instance, when estimating the number of distinct values from an appendrel, we would not be able to adjust the estimate based on the restriction selectivity.
    This patch addresses this by setting an appendrel's tuples to the total number of tuples accumulated from each live child, which better aligns with reality.
    This is arguably a bug, but nobody has complained about that until now, so no back-patch.
    Author: Richard Guo <guofenglinux@gmail.com>
    Reviewed-by: Tender Wang <tndrwang@gmail.com>
    Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
    Discussion: https://postgr.es/m/CAMbWs4_TG_+kVn6fjG-5GYzzukrNK57=g9eUo4gsrUG26OFawg@mail.gmail.com
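The difference between the two estimates in the commit above can be shown with a small sketch. `DemoRel` and `demo_set_append_size()` are hypothetical; the real planner structures carry far more state, and the `live` flag is an assumed stand-in for however children are excluded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced relation summary. */
typedef struct DemoRel
{
    double tuples;  /* raw tuple count, before restriction clauses */
    double rows;    /* estimated rows after restriction clauses */
    int    live;    /* nonzero unless the child was excluded */
} DemoRel;

/* Accumulate both estimates from the live children, keeping them separate. */
static void
demo_set_append_size(DemoRel *parent, const DemoRel *children, size_t n)
{
    assert(parent != NULL);

    parent->tuples = 0;
    parent->rows = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (!children[i].live)
            continue;
        parent->tuples += children[i].tuples;   /* not children[i].rows */
        parent->rows += children[i].rows;
    }
}
```

With children of 1000 and 500 raw tuples that each keep 10% of their rows, the parent ends up with 1500 tuples and 150 rows, so a later selectivity adjustment still has the unrestricted count to work from.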
* In fmtIdEnc(), handle failure of enlargePQExpBuffer(). (Tom Lane, 2025-02-16)
    Coverity complained that we weren't doing that, and it's right.
    This fix just makes fmtIdEnc() honor the general convention that OOM causes a PQExpBuffer to become marked "broken", without any immediate error. In the pretty-unlikely case that we actually did hit OOM here, the end result would be to return an empty string to the caller, probably resulting in invalid SQL syntax in an issued command (if nothing else went wrong, which is even more unlikely). It's tempting to throw an "out of memory" error if the buffer becomes broken, but there's not a lot of point in doing that only here and not in hundreds of other PQExpBuffer-using places in pg_dump and similar callers. The whole issue could do with some non-time-crunched redesign, perhaps.
    This is a followup to the fixes for CVE-2025-1094, and should be included if cherry-picking those fixes.
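The "broken buffer" convention mentioned in the commit above can be sketched as follows. `DemoBuf` and its helpers are hypothetical and are not the PQExpBuffer implementation; the `limit` parameter is an assumption added only so that an allocation failure can be simulated deterministically.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer with a sticky "broken" state. */
typedef struct DemoBuf
{
    char  *data;
    size_t len;
    size_t cap;
    int    broken;  /* set once an allocation has failed */
} DemoBuf;

static void
demo_mark_broken(DemoBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
    b->broken = 1;
}

/* Append s; growth beyond 'limit' bytes is treated as out-of-memory. */
static void
demo_append(DemoBuf *b, const char *s, size_t limit)
{
    size_t add;
    size_t needed;

    assert(s != NULL);
    if (b->broken)
        return;                 /* later appends are silently ignored */

    add = strlen(s);
    needed = b->len + add + 1;
    if (needed > b->cap)
    {
        char *p = (needed <= limit) ? realloc(b->data, needed) : NULL;

        if (p == NULL)
        {
            demo_mark_broken(b);    /* no error is raised at this point */
            return;
        }
        b->data = p;
        b->cap = needed;
    }
    memcpy(b->data + b->len, s, add + 1);
    b->len += add;
}

/* A broken (or never-used) buffer reads as an empty string. */
static const char *
demo_str(const DemoBuf *b)
{
    return (b->broken || b->data == NULL) ? "" : b->data;
}
```

The caller sees an empty string rather than an immediate failure, which matches the trade-off the commit message describes: the result may be invalid downstream, but every append site stays free of error handling.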
* Make escaping functions retain trailing bytes of an invalid character. (Tom Lane, 2025-02-15)
    Instead of dropping the trailing byte(s) of an invalid or incomplete multibyte character, replace only the first byte with a known-invalid sequence, and process the rest normally. This seems less likely to confuse incautious callers than the behavior adopted in 5dc1e42b4.
    While we're at it, adjust PQescapeStringInternal to produce at most one bleat about invalid multibyte characters per string. This matches the behavior of PQescapeInternal, and avoids the risk of producing tons of repetitive junk if a long string is simply given in the wrong encoding.
    This is a followup to the fixes for CVE-2025-1094, and should be included if cherry-picking those fixes.
    Author: Andres Freund <andres@anarazel.de>
    Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
    Reported-by: Jeff Davis <pgsql@j-davis.com>
    Discussion: https://postgr.es/m/20250215012712.45@rfd.leadboat.com
    Backpatch-through: 13
* Fix explicit valgrind interaction in read_stream.c. (Thomas Munro, 2025-02-15)
    By calling wipe_mem() on per-buffer data memory that has been released, we are also telling Valgrind that the memory is "noaccess". We need to set it to "undefined" before giving it to the registered callback to fill in, when a slot is reused.
    As discovered by build farm animal skink when the VACUUM streamification patches landed (the first users of per-buffer data).
    Pushing to master only for now, to clear the error on skink. It's also possible that external code might discover the per-buffer data feature in v17, and reasonable to expect Valgrind not to produce spurious memcheck reports, but the back-patch is deferred until after the imminent minor release is out of the way.
    Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
    Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
    Tested-by: Melanie Plageman <melanieplageman@gmail.com>
    Discussion: https://postgr.es/m/CA%2BhUKG%2Bg6aXpi2FEHqeLOzE%2BxYw%3DOV%2B-N5jhOEnnV%2BF0USM9xA%40mail.gmail.com
* Fix PQescapeLiteral()/PQescapeIdentifier() length handling (Andres Freund, 2025-02-14)
    In 5dc1e42b4fa I fixed bugs in various escape functions, unfortunately as part of that I introduced a new bug in PQescapeLiteral()/PQescapeIdentifier(). The bug is that I made PQescapeInternal() just use strlen(), rather than taking the specified input length into account.
    That's bad, because it can lead to including input that wasn't intended to be included (in case len is shorter than null termination of the string) and because it can lead to reading invalid memory if the input string is not null terminated.
    Expand test_escape to cover this kind of bug:
    a) for escape functions with length support, append data that should not be escaped and check that it is not
    b) add valgrind requests to detect access of bytes that should not be touched
    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Author: Andres Freund <andres@anarazel.de>
    Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
    Reviewed-by: Noah Misch <noah@leadboat.com>
    Discussion: https://postgr.es/m/Z64jD3u46gObCo1p@pryzbyj2023
    Backpatch: 13
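The length handling discussed in the commit above can be illustrated with a deliberately simplified sketch. `demo_escape_literal()` is hypothetical: it only doubles single quotes and ignores encodings and backslashes, so it is not a substitute for the libpq functions. The point is that the scan is bounded by `len` and never relies on `strlen()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Quote at most len bytes of str as a literal, doubling single quotes.
 * Returns a malloc'd string (caller frees) or NULL on allocation failure.
 */
static char *
demo_escape_literal(const char *str, size_t len)
{
    const char *nul;
    size_t      n;
    size_t      quotes = 0;
    char       *out;
    char       *p;

    assert(str != NULL);

    /* Use at most len bytes, stopping early at a NUL terminator. */
    nul = memchr(str, '\0', len);
    n = (nul != NULL) ? (size_t) (nul - str) : len;

    for (size_t i = 0; i < n; i++)
        if (str[i] == '\'')
            quotes++;

    out = malloc(n + quotes + 3);   /* two enclosing quotes plus NUL */
    if (out == NULL)
        return NULL;

    p = out;
    *p++ = '\'';
    for (size_t i = 0; i < n; i++)
    {
        if (str[i] == '\'')
            *p++ = '\'';
        *p++ = str[i];
    }
    *p++ = '\'';
    *p = '\0';
    return out;
}
```

Called with `("it's-TRAILING", 4)`, the function reads only the first four bytes; called on an array without a terminator, it stops at `len` instead of running off the end, which are the two failure modes named in the commit message.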
* Add delay time to VACUUM/ANALYZE (VERBOSE) and autovacuum logs. (Nathan Bossart, 2025-02-14)
    Commit bb8dff9995 added this information to the pg_stat_progress_vacuum and pg_stat_progress_analyze system views. This commit adds the same information to the output of VACUUM and ANALYZE with the VERBOSE option and to the autovacuum logs.
    Suggested-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
    Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
    Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
* Use PqMsg_Progress macro in HandleParallelMessage(). (Nathan Bossart, 2025-02-14)
    Commit a99cc6c6b4 introduced the PqMsg_Progress macro but missed updating HandleParallelMessage() accordingly.
    Backpatch-through: 17
* Use streaming read I/O in VACUUM's third phase (Melanie Plageman, 2025-02-14)
    Make vacuum's third phase (its second pass over the heap), which reaps dead items collected in the first phase and marks them as reusable, use the read stream API. This commit adds a new read stream callback, vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and returns the next block number to read for vacuum.
    Author: Melanie Plageman <melanieplageman@gmail.com>
    Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
    Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
    Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
* Use streaming read I/O in VACUUM's first phase (Melanie Plageman, 2025-02-14)
    Make vacuum's first phase, which prunes and freezes tuples and records dead TIDs, use the read stream API by converting heap_vac_scan_next_block() to a read stream callback.
    Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
    Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
    Discussion: https://postgr.es/m/CAAKRu_aLwANZpxHc0tC-6OT0OQT4TftDGkKAO5yigMUOv_Tcsw%40mail.gmail.com
* Convert heap_vac_scan_next_block() boolean parameters to flags (Melanie Plageman, 2025-02-14)
    The read stream API only allows one piece of extra per block state to be passed back to the API user (per_buffer_data). lazy_scan_heap() needs two pieces of per-buffer data: whether or not the block was all-visible in the visibility map and whether or not it was eagerly scanned.
    Convert these two pieces of information to flags so that they can be populated by heap_vac_scan_next_block() and returned to lazy_scan_heap(). A future commit will turn heap_vac_scan_next_block() into the read stream callback for heap phase I vacuuming.
    Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
    Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
    Discussion: https://postgr.es/m/CAAKRu_bmx33jTqATP5GKNFYwAg02a9dDtk4U_ciEjgBHZSVkOQ%40mail.gmail.com
* Describe special values in GUC descriptions more consistently. (Nathan Bossart, 2025-02-14)
    Many GUCs accept special values like -1 or an empty string to disable the feature, use a system default, etc. While the documentation consistently lists these special values, the GUC descriptions do not. Many such descriptions fail to mention the special values, and those that do vary in phrasing and placement. This commit aims to bring some consistency to this area by applying the following rules:
    * Special values should be listed at the end of the long description.
    * Descriptions should use numerals (e.g., "0") instead of words (e.g., "zero").
    * Special value mentions should be concise and direct (e.g., "0 disables the timeout.", "An empty string means use the operating system setting.").
    * Multiple special values should be listed in ascending order.
    Of course, there are exceptions, such as max_pred_locks_per_relation and search_path, whose special values are too complex to include. And there are cases like listen_addresses, where the meaning of an empty string is arguably too obvious to include. In those cases, I've refrained from adding special value information to the GUC description.
    Reviewed-by: Peter Smith <smithpb2250@gmail.com>
    Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
    Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
    Discussion: https://postgr.es/m/Z6aIy4aywxUZHAo6%40nathan
* Fix assertion on dereferenced object (Daniel Gustafsson, 2025-02-14)
    Commit 27cc7cd2bc8a accidentally placed the assertion ensuring that the pointer isn't NULL after it had already been accessed. Fix by moving the pointer dereferencing to after the assertion. Backpatch to all supported branches.
    Author: Dmitry Koval <d.koval@postgrespro.ru>
    Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
    Reviewed-by: Michael Paquier <michael@paquier.xyz>
    Discussion: https://postgr.es/m/1618848d-cdc7-414b-9c03-08cf4bef4408@postgrespro.ru
    Backpatch-through: 13
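The ordering problem fixed by the commit above is easy to show in isolation. `DemoNode` and `demo_get_value()` are hypothetical names; the sketch only demonstrates that the NULL check has to come before the first dereference.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoNode
{
    int value;
} DemoNode;

/*
 * Broken ordering (shown as a comment only): initializing from node->value
 * in the declaration dereferences the pointer before the assertion runs,
 * so the assertion can no longer catch a NULL argument:
 *
 *     int value = node->value;
 *     assert(node != NULL);
 */
static int
demo_get_value(const DemoNode *node)
{
    int value;

    assert(node != NULL);   /* check first ... */
    value = node->value;    /* ... dereference afterwards */
    return value;
}
```

Splitting the declaration from the assignment is the usual way to keep the check ahead of the access when local variables must be declared at the top of a function.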
* Remove obsolete comment. (Thomas Munro, 2025-02-14)
    Commit 755a4c10d19d prevented StartReadBuffers() from crossing md.c segment boundaries in one operation, but a comment about that possibility remained.
* Remove unused parameter from execute_extension_script(). (Nathan Bossart, 2025-02-13)
    This function's schemaOid parameter appears to have never been used for anything.
    Author: Yugo Nagata <nagata@sraoss.co.jp>
    Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
    Discussion: https://postgr.es/m/20250214010218.550ebe4ec1a7c7811a7fa2bb%40sraoss.co.jp
* Remove unnecessary (char *) casts [xlog] (Peter Eisentraut, 2025-02-13)
    Remove (char *) casts no longer needed after XLogRegisterData() and XLogRegisterBufData() argument type change.
    Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
    Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
* XLogRegisterData, XLogRegisterBufData void * argument for binary data (Peter Eisentraut, 2025-02-13)
    Change XLogRegisterData() and XLogRegisterBufData() functions to take void * for binary data instead of char *. This will remove the need for numerous casts (done in a separate commit for clarity).
    Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
    Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
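Why a `void *` parameter removes the need for casts at call sites, as the commit above argues, can be seen in a small sketch. `DemoRecord` and `demo_register_data()` are hypothetical and only imitate the shape of a "register some bytes" function; the exact PostgreSQL signatures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record that simply accumulates registered bytes. */
typedef struct DemoRecord
{
    unsigned char buf[64];
    size_t        len;
} DemoRecord;

/* Takes binary data as const void *, so any object pointer converts implicitly. */
static void
demo_register_data(DemoRecord *rec, const void *data, size_t len)
{
    assert(rec->len + len <= sizeof(rec->buf));
    memcpy(rec->buf + rec->len, data, len);
    rec->len += len;
}
```

With this signature a caller can write `demo_register_data(&rec, &header, sizeof(header))` for any struct; with a `char *` parameter the same call would need an explicit `(char *)` cast, which is also a place where a `const` qualifier can be dropped unnoticed.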
* Fix MakeTransitionCaptureState() to return a consistent result (Michael Paquier, 2025-02-13)
    When an UPDATE trigger referencing a new table and a DELETE trigger referencing an old table are both present, MakeTransitionCaptureState() returns an inconsistent result for UPDATE commands in its set of flags and tuplestores holding the TransitionCaptureState for transition tables.
    As proved by the test added here, this issue causes a crash in v14 and earlier versions (down to 11, actually, older versions do not support triggers on partitioned tables) during cross-partition updates on a partitioned table. v15 and newer versions are safe thanks to 7103ebb7aae8.
    This commit fixes the function so that it returns a consistent state by using portions of the changes made in commit 7103ebb7aae8 for v13 and v14. v15 and newer versions are slightly tweaked to match with the older versions, mainly for consistency across branches.
    Author: Kyotaro Horiguchi
    Discussion: https://postgr.es/m/20250207.150238.968446820828052276.horikyota.ntt@gmail.com
    Backpatch-through: 13
* Rename RBTXN_PREPARE to RBTXN_IS_PREPARE for better clarification. (Masahiko Sawada, 2025-02-12)
    RBTXN_PREPARE flag and rbtxn_prepared macro could be misinterpreted as either indicating the transaction type (e.g. a prepared transaction or a normal transaction) or its current state (e.g. skipped or its prepare message is sent), especially after commit 072ee847ad4 introduced the RBTXN_SENT_PREPARE flag and the rbtxn_sent_prepare macro.
    The RBTXN_PREPARE flag (and its corresponding macro) have been renamed to RBTXN_IS_PREPARE to explicitly indicate the transaction type. Therefore, this commit also adds the RBTXN_IS_PREPARE flag to the transaction that is a prepared transaction and has been skipped, which previously had only the RBTXN_SKIPPED_PREPARE flag.
    Reviewed-by: Amit Kapila, Peter Smith
    Discussion: https://postgr.es/m/CAA4eK1KgNmBsG%3D155E7QQ6TX9RoWnM4z5Z20SvsbwxSe_QXYsg%40mail.gmail.com
* Skip logical decoding of already-aborted transactions. (Masahiko Sawada, 2025-02-12)
    Previously, transaction aborts were detected concurrently only during system catalog scans while replaying a transaction in streaming mode.
    This commit adds an additional CLOG lookup to check the transaction status, allowing the logical decoding to skip changes also when it doesn't touch system catalogs, if the transaction is already aborted. This optimization enhances logical decoding performance, especially for large transactions that have already been rolled back, as it avoids unnecessary disk or network I/O.
    To avoid potential slowdowns caused by frequent CLOG lookups for small transactions (most of which commit), the CLOG lookup is performed only for large transactions before eviction. The performance benchmark results showed no noticeable performance regression due to CLOG lookups.
    Reviewed-by: Amit Kapila, Peter Smith, Vignesh C, Ajin Cherian
    Reviewed-by: Dilip Kumar, Andres Freund
    Discussion: https://postgr.es/m/CAD21AoDht9Pz_DFv_R2LqBTBbO4eGrpa9Vojmt5z5sEx3XwD7A@mail.gmail.com
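The idea of paying for a status lookup only when a large transaction is about to be evicted, as described in the commit above, can be sketched as follows. Everything here is hypothetical: `DemoTxn`, the `large_threshold` parameter, and the counter standing in for a CLOG lookup are assumptions made for illustration, and the real selection of transactions to evict works differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory state for one decoded transaction. */
typedef struct DemoTxn
{
    size_t bytes;          /* memory used by its decoded changes */
    bool   aborted;        /* what a status lookup would report */
    bool   known_aborted;  /* cached result, so the lookup is not repeated */
    int    lookups;        /* how many status lookups were performed */
} DemoTxn;

/* Stand-in for the (comparatively expensive) transaction status lookup. */
static bool
demo_status_lookup(DemoTxn *txn)
{
    txn->lookups++;
    return txn->aborted;
}

/* Returns true when the changes can be discarded instead of spilled or sent. */
static bool
demo_should_skip_on_evict(DemoTxn *txn, size_t large_threshold)
{
    assert(txn != NULL);

    if (txn->known_aborted)
        return true;            /* already established earlier */
    if (txn->bytes < large_threshold)
        return false;           /* small transaction: no lookup at all */
    if (demo_status_lookup(txn))
    {
        txn->known_aborted = true;
        return true;
    }
    return false;
}
```

Small transactions, most of which commit, never trigger the lookup, while a large aborted transaction pays for it once and is then skipped cheaply on every later check.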
* Remove unneeded volatile qualifier in fmgr.c. (Nathan Bossart, 2025-02-12)
    Currently, the save_nestlevel variable in fmgr_security_definer() is marked volatile. While this may have been necessary when it was used in a PG_CATCH section (as explained in the comment for PG_TRY in elog.h), it appears to have been unnecessary since commit 82a47982f3, which removed its use in a PG_CATCH section.
    Author: Julien Rouhaud <rjuju123@gmail.com>
    Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
    Discussion: https://postgr.es/m/Z6xbAgXKY2L-3d5Q%40jrouhaud
* Clean up impenetrable logic in pg_basebackup/receivelog.c. (Tom Lane, 2025-02-12)
    Coverity complained about possible double free of HandleCopyStream's "copybuf". AFAICS it's mistaken, but it is easy to see why it's confused, because management of that buffer is impossibly confusing. It's unreasonable that HandleEndOfCopyStream frees the buffer in some cases but not others, updates the caller's state for that in no case, and has not a single comment about how complicated that makes things.
    Let's put all the responsibility for freeing copybuf in the actual owner of that variable, HandleCopyStream. This results in one more PQfreemem call than before, but the logic is far easier to follow, both for humans and machines.
    Since this isn't (quite) actually broken, no back-patch.
* Fix minor memory leaks in pg_dump. (Tom Lane, 2025-02-12)
    Coverity reported the two oversights in getPublicationTables. Valgrind found the one in determineNotNullFlags.
    The mistakes in getPublicationTables seem too minor to be worth back-patching. determineNotNullFlags could be run enough times to matter, but that code is new in v18. So, no back-patch.
* ci: Collect core files on NetBSD and OpenBSD (Andres Freund, 2025-02-12)
    Support for the NetBSD and OpenBSD operating systems was added to CI in the prior commit. Now add support for collecting core files and generating backtraces for all core files.
    Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
    Discussion: https://postgr.es/m/CAN55FZ32ySyYa06k9MFd+VY5vHhUyBpvgmJUZae5PihjzaurVg@mail.gmail.com
* ci: Test NetBSD and OpenBSD (Andres Freund, 2025-02-12)
    NetBSD and OpenBSD Postgres CI images are now generated [1], but aren't yet utilized for Postgres' CI. This commit adds CI support for them. For now the tasks will be manually triggered, to save on CI credits.
    [1] https://github.com/anarazel/pg-vm-images
    Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
    Discussion: https://postgr.es/m/CAN55FZ32ySyYa06k9MFd+VY5vHhUyBpvgmJUZae5PihjzaurVg@mail.gmail.com
* Fix issue in recovery test 041_checkpoint_at_promote (Michael Paquier, 2025-02-12)
    The phase of the test waiting for a restartpoint to complete was not working as intended, due to an incorrectly written log_contains() call. The problem reported by the author could be simply reproduced by removing the injection_points_wakeup() call: the test succeeds rather than waiting for the restartpoint completion.
    In most cases, the restartpoint completion is fast enough that the test offered the wanted coverage. On slow machines, it could have become unreliable.
    Oversight in 6782709df81f.
    Author: Nitin Jadhav
    Discussion: https://postgr.es/m/CAMm1aWa_6u+o52r7h7G6pX-oWD0Qraf0ee17Ma50qxGS0B_Rzg@mail.gmail.com
    Backpatch-through: 17
* Fix some inconsistencies with memory freeing in pg_createsubscriber (Michael Paquier, 2025-02-12)
    The correct function documented to free the memory allocated for the result returned by PQescapeIdentifier() and PQescapeLiteral() is PQfreemem(). pg_createsubscriber.c relied on pg_free() instead, which is not incorrect as both do a free() internally, but inconsistent with the documentation.
    While on it, this commit fixes a small memory leak introduced by 4867f8a555ce, as the code of pg_createsubscriber makes this effort.
    Author: Ranier Vilela
    Reviewed-by: Euler Taveira
    Discussion: https://postgr.es/m/CAEudQAp=AW5dJXrGLbC_aZg_9nOo=42W7uLDRONFQE-gcgnkgQ@mail.gmail.com
    Backpatch-through: 17
* Remove unnecessary (char *) casts [checksum]Peter Eisentraut2025-02-12
| | | | | | | | | | | Remove some (char *) casts related to uses of the pg_checksum_page() function. These casts are useless, because everything involved already has the right type. Moreover, these casts actually silently discarded a const qualifier. The declaration of a higher-level function needs to be adjusted to fix that. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
* Remove unnecessary (char *) casts [mem]Peter Eisentraut2025-02-12
| | | | | | | | | | Remove (char *) casts around memory functions such as memcmp(), memcpy(), or memset() where the cast is useless. Since these functions don't take char * arguments anyway, these casts are at best complicated casts to (void *), about which see commit 7f798aca1d5. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
* Remove unnecessary (char *) casts [string]Peter Eisentraut2025-02-12
| | | | | | | | | | Remove (char *) casts around string functions where the arguments or result already have the right type and the cast is useless (or worse, potentially casts away a qualifier, but this doesn't appear to be the case here). Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
* Add cost-based vacuum delay time to progress views.Nathan Bossart2025-02-11
| | | | | | | | | | | | | | | | | | | | | | | This commit adds the amount of time spent sleeping due to cost-based delay to the pg_stat_progress_vacuum and pg_stat_progress_analyze system views. A new configuration parameter named track_cost_delay_timing, which is off by default, controls whether this information is gathered. For vacuum, the reported value includes the sleep time of any associated parallel workers. However, parallel workers only report their sleep time once per second to avoid overloading the leader process. Bumps catversion. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Sergei Kornilov <sk@zsrv.org> Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
* Add is_analyze parameter to vacuum_delay_point().Nathan Bossart2025-02-11
| | | | | | | | | | | | This function is used in both vacuum and analyze code paths, and a follow-up commit will require distinguishing between the two. This commit forces callers to specify whether they are in a vacuum or analyze path, but it does not use that information for anything yet. Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
* Limit pgbench COPY FREEZE to ordinary relationsMelanie Plageman2025-02-11
| | | | | | | | | | | | | | | | | | pgbench client-side data generation uses COPY FREEZE to load data for most tables. COPY FREEZE isn't supported for partitioned tables and since pgbench only supports partitioning pgbench_accounts, pgbench used a hard-coded check to skip COPY FREEZE and use plain COPY for a partitioned pgbench_accounts. If the user has manually partitioned one of the other pgbench tables, this causes client-side data generation to error out with: ERROR: cannot perform COPY FREEZE on a partitioned table Fix this by limiting COPY FREEZE to ordinary tables (RELKIND_RELATION). Author: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/flat/97f55fca-8a7b-4da8-b413-7d1c57010676%40postgrespro.ru
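The rule described in the commit above (use COPY FREEZE only for ordinary tables) can be sketched roughly as follows. This is not the pgbench code: the helper name build_copy_command() and its buffer interface are hypothetical, and the relkind is assumed to have been fetched already (for example from pg_class.relkind). Only the two relkind values are taken from PostgreSQL's catalog definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Values as defined for pg_class.relkind in PostgreSQL. */
#define RELKIND_RELATION           'r'	/* ordinary table */
#define RELKIND_PARTITIONED_TABLE  'p'	/* partitioned table */

/*
 * Hypothetical helper: build the COPY command used to load "table".
 * FREEZE is requested only for ordinary tables; every other relkind
 * (including partitioned tables) falls back to plain COPY.
 * Returns the value of snprintf().
 */
static int
build_copy_command(char *buf, size_t buflen, const char *table, char relkind)
{
	assert(buf != NULL && table != NULL);

	if (relkind == RELKIND_RELATION)
		return snprintf(buf, buflen,
						"copy %s from stdin with (freeze on)", table);

	return snprintf(buf, buflen, "copy %s from stdin", table);
}
```

Keying the decision on the relkind rather than on a table name is what lets a manually partitioned pgbench_branches or pgbench_tellers load without the "cannot perform COPY FREEZE on a partitioned table" error.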
* Injection points for hash aggregation.Jeff Davis2025-02-11
| | | | | | | | Requires adding a guard against shift-by-32. Previously, that was impossible because the number of partitions was always greater than 1, but a new injection point can force the number of partitions to 1. Discussion: https://postgr.es/m/ff4e59305e5d689e03cd256a736348d3e7958f8f.camel@j-davis.com
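To illustrate why a single partition needs a guard: if a partition is selected from the top bits of a 32-bit hash, zero partition bits gives a shift count of 32, and shifting a 32-bit value by 32 is undefined behavior in C. The sketch below is an assumption about the general shape of such a guard, not the nodeAgg.c code; partition_mask() and partition_of() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: mask covering "partition_bits" bits of a 32-bit
 * hash, located just below the "used_bits" top bits that were already
 * consumed. With partition_bits == 0 and used_bits == 0 the shift would
 * be 32, so that case is handled explicitly.
 */
static uint32_t
partition_mask(int used_bits, int partition_bits)
{
	int			shift = 32 - used_bits - partition_bits;
	uint32_t	npartitions;

	assert(partition_bits >= 0 && partition_bits < 32);
	assert(shift >= 0 && shift <= 32);

	if (shift >= 32)
		return 0;				/* one partition: no bits to select */

	npartitions = (uint32_t) 1 << partition_bits;
	return (npartitions - 1) << shift;
}

/* Hypothetical helper: partition number for "hash". */
static uint32_t
partition_of(uint32_t hash, int used_bits, int partition_bits)
{
	int			shift = 32 - used_bits - partition_bits;

	if (shift >= 32)
		return 0;				/* guard against shift-by-32 */

	return (hash & partition_mask(used_bits, partition_bits)) >> shift;
}
```

With two partition bits and no used bits, the mask is 0xC0000000 and the shift is 30; with zero partition bits every hash maps to partition 0 without evaluating the shift.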
* Eagerly scan all-visible pages to amortize aggressive vacuumMelanie Plageman2025-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aggressive vacuums must scan every unfrozen tuple in order to advance the relfrozenxid/relminmxid. Because data is often vacuumed before it is old enough to require freezing, relations may build up a large backlog of pages that are set all-visible but not all-frozen in the visibility map. When an aggressive vacuum is triggered, all of these pages must be scanned. These pages have often been evicted from shared buffers and even from the kernel buffer cache. Thus, aggressive vacuums often incur large amounts of extra I/O at the expense of foreground workloads. To amortize the cost of aggressive vacuums, eagerly scan some all-visible but not all-frozen pages during normal vacuums. All-visible pages that are eagerly scanned and set all-frozen in the visibility map are counted as successful eager freezes and those not frozen are counted as failed eager freezes. If too many eager scans fail in a row, eager scanning is temporarily suspended until a later portion of the relation. The number of failures tolerated is configurable globally and per table. To effectively amortize aggressive vacuums, we cap the number of successes as well. Capping eager freeze successes also limits the amount of potentially wasted work if these pages are modified again before the next aggressive vacuum. Once we reach the maximum number of blocks successfully eager frozen, eager scanning is disabled for the remainder of the vacuum of the relation. Original design idea from Robert Haas, with enhancements from Andres Freund, Tomas Vondra, and me Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
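The two caps described above can be modeled with a small state machine. The following is a simplified sketch under stated assumptions, not the vacuum implementation: the struct, the function names, and the fixed-size "region" used to decide when a suspension ends are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for eager scanning during a normal vacuum. */
typedef struct EagerScanSketch
{
	uint32_t	region_size;		/* blocks per region (assumed fixed) */
	uint32_t	max_fails;			/* consecutive failures tolerated */
	uint32_t	consecutive_fails;	/* failures since the last success */
	uint32_t	successes_left;		/* remaining eager-freeze budget */
	uint32_t	resume_at;			/* first block where scanning may resume */
} EagerScanSketch;

static void
eager_init(EagerScanSketch *s, uint32_t region_size,
		   uint32_t max_fails, uint32_t max_successes)
{
	s->region_size = region_size;
	s->max_fails = max_fails;
	s->consecutive_fails = 0;
	s->successes_left = max_successes;
	s->resume_at = 0;
}

/* Should an all-visible, not all-frozen block be scanned eagerly? */
static bool
eager_should_scan(const EagerScanSketch *s, uint32_t blkno)
{
	if (s->successes_left == 0)
		return false;			/* success cap reached: disabled for good */
	return blkno >= s->resume_at;	/* false while temporarily suspended */
}

/* Record whether the eagerly scanned block could be set all-frozen. */
static void
eager_record(EagerScanSketch *s, uint32_t blkno, bool frozen)
{
	if (frozen)
	{
		s->consecutive_fails = 0;
		if (s->successes_left > 0)
			s->successes_left--;
	}
	else if (++s->consecutive_fails >= s->max_fails)
	{
		/* Too many failures in a row: suspend until the next region. */
		s->resume_at = (blkno / s->region_size + 1) * s->region_size;
		s->consecutive_fails = 0;
	}
}
```

In this model a run of failures only pauses eager scanning until a later part of the relation, while exhausting the success budget turns it off for the rest of the vacuum, which mirrors the asymmetry between the two limits in the commit message.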