aboutsummaryrefslogtreecommitdiff
path: root/contrib
Commit message (Collapse)AuthorAge
* Make query jumbling also squash PARAM_EXTERN paramsÁlvaro Herrera2025-06-24
| | | | | | | | | | | | | | | | | | | | | | Commit 62d712ecfd94 made query jumbling squash lists of Consts as a single element, but there's no reason not to treat PARAM_EXTERN parameters the same. For these purposes, these values are indeed constants for any particular execution of a query. In particular, this should make list squashing more useful for applications using extended query protocol, which would use parameters extensively. A complication arises: if a query has both external parameters and squashable lists, then the parameter number used as placeholder for the squashed list might be inconsistent with regards to the parameter numbers used by the query literal. To reduce the surprise factor, all parameters are renumbered starting from 1 in that case. Author: Sami Imseih <samimseih@gmail.com> Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
* Improve jumble squashing through CoerceViaIO and RelabelTypeÁlvaro Herrera2025-06-24
| | | | | | There's no principled reason for query jumbling to only remove the first layer of RelabelType and CoerceViaIO. Change it to see through as many layers as there are.
* amcheck: Fix posting tree checks in gin_index_check()Tomas Vondra2025-06-17
| | | | | | | | | | | | | | | | | | | Fix two issues in parent_key validation in posting trees: * It's not enough to check stack->parentblk is valid to determine if the parentkey is valid. It's possible parentblk is set to a valid block number, but parentkey is invalid. So check parentkey directly. * We don't need to invalidate parentkey for all child pages of the rightmost page. It's enough to invalidate it for the rightmost child only, which means we can check more cases (less false negatives). Issues reported by Arseniy Mukhin, along with a proposed patch. Review by Andrey M. Borodin, cleanup and improvements by me. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
* amcheck: Fix parent key check in gin_index_check()Tomas Vondra2025-06-17
| | | | | | | | | | | | | | | | | | | | | | | The checks introduced by commit 14ffaece0fb5 did not get the parent key checks quite right, missing some data corruption cases. In particular: * The "rightlink" check was not working as intended, because rightlink is a BlockNumber, and InvalidBlockNumber is 0xFFFFFFFF, so !GinPageGetOpaque(page)->rightlink almost always evaluates to false (except for rightlink=0). So in most cases parenttup was left NULL, preventing any checks against parent. * Use GinGetDownlink() to retrieve child blkno to avoid triggering Assert, same as the core GIN code. Issues reported by Arseniy Mukhin, along with a proposed patch. Review by Andrey M. Borodin, cleanup and improvements by me. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
* amcheck: Fix checks of entry order for GIN indexesTomas Vondra2025-06-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tightens a couple checks in checking GIN indexes, which might have resulted in incorrect results (false positives/negatives). * The code skipped ordering checks if the entries were for different attributes (for multi-column GIN indexes), possibly missing some cases of data corruption. But the attribute number is part of the ordering, so we can check that. * The root page was skipped when checking entry order, but that is unnecessary. The root page is subject to the same ordering rules, we can process it just like any other page. * The high key on the right-most page was not checked, but that is needed only for inner pages (we don't store the high key for those). For leaf pages we can check the high key just fine. * Correct the detection of split pages. If the page gets split, the cached parent key is greater than the current child key (not less, as the code incorrectly expected). Issues reported by Arseniy Mukhin, along with a proposed patch. Review by Andrey M. Borodin, cleanup and improvements by me. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
* amcheck: Remove unused GinScanItem->parentlsn fieldTomas Vondra2025-06-17
| | | | | | | | | | | | The field was introduced by commit 14ffaece0fb5, but is unused and unnecessary. So remove it. Issues reported by Arseniy Mukhin, along with a proposed patch. Review by Andrey M. Borodin, cleanup and minor improvements by me. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
* amcheck: Test gin_index_check on a multicolumn indexTomas Vondra2025-06-17
| | | | | | | | | | Adds a regression test with gin_index_check() on a multicolumn index, to verify it's handled correctly and improve test coverage for code introduced by 14ffaece0fb5. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
* Fix re-distributing previously distributed invalidation messages during ↵Masahiko Sawada2025-06-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | logical decoding. Commit 4909b38af0 introduced logic to distribute invalidation messages from catalog-modifying transactions to all concurrent in-progress transactions. However, since each transaction distributes not only its original invalidation messages but also previously distributed messages to other transactions, this leads to an exponential increase in allocation request size for invalidation messages, ultimately causing memory allocation failure. This commit fixes this issue by tracking distributed invalidation messages separately per decoded transaction and not redistributing these messages to other in-progress transactions. The maximum size of distributed invalidation messages that one transaction can store is limited to MAX_DISTR_INVAL_MSG_PER_TXN (8MB). Once the size of the distributed invalidation messages exceeds this threshold, we invalidate all caches in locations where distributed invalidation messages need to be executed. Back-patch to all supported versions where we introduced the fix by commit 4909b38af0. Note that this commit adds two new fields to ReorderBufferTXN to store the distributed transactions. This change breaks ABI compatibility in back branches, affecting third-party extensions that depend on the size of the ReorderBufferTXN struct, though this scenario seems unlikely. Additionally, it adds a new flag to the txn_flags field of ReorderBufferTXN to indicate distributed invalidation message overflow. This should not affect existing implementations, as it is unlikely that third-party extensions use unused bits in the txn_flags field. Bug: #18938 #18942 Author: vignesh C <vignesh21@gmail.com> Reported-by: Duncan Sands <duncan.sands@deepbluecap.com> Reported-by: John Hutchins <john.hutchins@wicourts.gov> Reported-by: Laurence Parry <greenreaper@hotmail.com> Reported-by: Max Madden <maxmmadden@gmail.com> Reported-by: Braulio Fdo Gonzalez <brauliofg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/680bdaf6-f7d1-4536-b580-05c2760c67c6@deepbluecap.com Discussion: https://postgr.es/m/18942-0ab1e5ae156613ad@postgresql.org Discussion: https://postgr.es/m/18938-57c9a1c463b68ce0@postgresql.org Discussion: https://postgr.es/m/CAD1FGCT2sYrP_70RTuo56QTizyc+J3wJdtn2gtO3VttQFpdMZg@mail.gmail.com Discussion: https://postgr.es/m/CANO2=B=2BT1hSYCE=nuuTnVTnjidMg0+-FfnRnqM6kd23qoygg@mail.gmail.com Backpatch-through: 13
* Fix squashing algorithm for query textsÁlvaro Herrera2025-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | The algorithm to squash lists of constants added by commit 62d712ecfd94 was a bit too simplistic; we wanted to avoid adding unnecessary complexity, but cases like direct function calls of typecasting functions (and others) were missed, and bogus SQL syntax was being shown in pg_stat_statements normalized query text field. To fix normalization for those cases, we need the parser to transmit information about were each list of constant values starts and ends, so add that to a couple of nodes. Also add a few more test cases to make sure we're doing the right thing. The patch initially submitted by Sami added a new private struct in gram.y to carry the start/end information for A_Expr, but I (Álvaro) decided that a better fix was to remove the parser indirection via the in_expr production, and instead create separate components in the a_expr rule. I'm surprised that this works and doesn't require more changes, but I assume (without checking) that the grammar used to be more complex and got simplified at some point. Bump catversion. Author: Sami Imseih <samimseih@gmail.com> Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
* Revert support for improved tracking of nested queriesMichael Paquier2025-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit reverts the two following commits: - 499edb09741b, track more precisely query locations for nested statements. - 06450c7b8c70, a follow-up fix of 499edb09741b with query locations. The test introduced in this commit is not reverted. This is proving useful to track a problem that only pgaudit was able to detect. These prove to have issues with the tracking of SELECT statements, when these use multiple parenthesis which is something supported by the grammar. Incorrect location and lengths are causing pg_stat_statements to become confused, failing its job in query normalization with potential out-of-bound writes because the location and the length may not match with what can be handled. A lot of the query patterns discussed when this issue was reported have no test coverage in the main regression test suite, or the recovery test 027_stream_regress.pl would have caught the problems as pg_stat_statements is loaded by the node running the regression tests. A first step would be to improve the test coverage to stress more the query normalization logic. A different portion of this work was done in 45e0ba30fc40, with the addition of tests for nested queries. These can be left in the tree. They are useful to track the way inner queries are currently tracked by PGSS with non-top-level entries, and will be useful when reconsidering in the future the work reverted here. Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru> Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
* Revert a few small patches that were intended for version 19.Jeff Davis2025-06-11
| | | | | | | | | | | - 4c787a24e7e220a60022e47c1776f22f72902899 - 78bd364ee39ca70a8f9cb8719282389866a08e14 - 7a6880fadc177873d5663961ec3a02d67e34dcbe - 8898082a5d3e94eef073f0e08124137e096e78ef Suggested-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+TgmoZ=J=PVNZUNKaxULu+KUVSt3Y-aJ1DZ9Y3Co6mu0z62jA@mail.gmail.com Discussion: https://postgr.es/m/60e8c6d0a6c08e67f15dbbe9e53df0119c710065.camel@j-davis.com
* isn.c: use pg_ascii_toupper() instead of toupper().Jeff Davis2025-06-10
| | | | | | | Avoid dependence on setlocale(). No behavior change. Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
* contrib/spi/refint.c: use pg_ascii_tolower() instead.Jeff Davis2025-06-10
| | | | | | | Avoid dependence on setlocale(). No behavior change. Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
* Revert "postgres_fdw: Inherit the local transaction's access/deferrable modes."Etsuro Fujita2025-06-08
| | | | | | | | | | We concluded that commit e5a3c9d9b is a feature rather than a fix; since it was added after feature freeze, revert it. Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/ed2296f1-1a6b-4932-b870-5bb18c2591ae%40oss.nttdata.com
* pg_prewarm: Allow autoprewarm to use more than 1GB to dump blocks.Robert Haas2025-06-06
| | | | | | | Reported-by: Daria Shanina <vilensipkdm@gmail.com> Author: Daria Shanina <vilensipkdm@gmail.com> Author: Robert Haas <robertmhaas@gmail.com> Backpatch-through: 13
* Disallow "=" in names of reloptions and foreign-data options.Tom Lane2025-06-02
| | | | | | | | | | | | | | | | | | | | We store values for these options as array elements with the syntax "name=value", hence a name containing "=" confuses matters when it's time to read the array back in. Since validation of the options is often done (long) after this conversion to array format, that leads to confusing and off-point error messages. We can improve matters by rejecting names containing "=" up-front. (Probably a better design would have involved pairs of array elements, but it's too late now --- and anyway, there's no evident use-case for option names like this. We already reject such names in some other contexts such as GUCs.) Reported-by: Chapman Flack <jcflack@acm.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chapman Flack <jcflack@acm.org> Discussion: https://postgr.es/m/6830EB30.8090904@acm.org Backpatch-through: 13
* Rename gist stratnum support functionPeter Eisentraut2025-06-02
| | | | | | | | | | | | | | | | | | | | | | | | Commit 7406ab623fe added a gist support function that we internally refer to by the symbol GIST_STRATNUM_PROC. This translated from "well-known" strategy numbers to opfamily-specific strategy numbers. However, we later (commit 630f9a43cec) changed this to fit into index-AM-level compare type mapping, so this function actually now maps from compare type to opfamily-specific strategy numbers. So this name is no longer fitting. Moreover, the index AM level also supports the opposite, a function to map from strategy number to compare type. This is currently not supported in gist, but one might wonder what this function is supposed to be called when it is added. This patch changes the naming of the gist-level functionality to be more in line with the index-AM-level functionality. This makes sense because these are essentially the same thing on different levels. This also changes the names of the externally visible functions that are provided for use as such a support function. Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/37ebb1d9-9036-485f-a215-e55435689917%40eisentraut.org
* postgres_fdw: Inherit the local transaction's access/deferrable modes.Etsuro Fujita2025-06-01
| | | | | | | | | | | | | | | | | | | | | | | | | Previously, postgres_fdw always 1) opened a remote transaction in READ WRITE mode even when the local transaction was READ ONLY, causing a READ ONLY transaction using it that references a foreign table mapped to a remote view executing a volatile function to write in the remote side, and 2) opened the remote transaction in NOT DEFERRABLE mode even when the local transaction was DEFERRABLE, causing a SERIALIZABLE READ ONLY DEFERRABLE transaction using it to abort due to a serialization failure in the remote side. To avoid these, modify postgres_fdw to open a remote transaction in the same access/deferrable modes as the local transaction. This commit also modifies it to open a remote subtransaction in the same access mode as the local subtransaction. Although these issues exist since the introduction of postgres_fdw, there have been no reports from the field. So it seems fine to just fix them in master only. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com
* Fix memory leakage in postgres_fdw's DirectModify code path.Tom Lane2025-05-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw tries to use PG_TRY blocks to ensure that it will eventually free the PGresult created by the remote modify command. However, it's fundamentally impossible for this scheme to work reliably when there's RETURNING data, because the query could fail in between invocations of postgres_fdw's DirectModify methods. There is at least one instance of exactly this situation in the regression tests, and the ensuing session-lifespan leak is visible under Valgrind. We can improve matters by using a memory context reset callback attached to the ExecutorState context. That ensures that the PGresult will be freed when the ExecutorState context is torn down, even if control never reaches postgresEndDirectModify. I have little faith that there aren't other potential PGresult leakages in the backend modules that use libpq. So I think it'd be a good idea to apply this concept universally by creating infrastructure that attaches a reset callback to every PGresult generated in the backend. However, that seems too invasive for v18 at this point, let alone the back branches. So for the moment, apply this narrow fix that just makes DirectModify safe. I have a patch in the queue for the more general idea, but it will have to wait for v19. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us Backpatch-through: 13
* Change internal queryid type from uint64 to int64David Rowley2025-05-30
| | | | | | | | | | | | | | | | | | | | | | uint64 was perhaps chosen in cff440d36 as the type was uint32 prior to that widening work. Having this as uint64 doesn't make much sense and just adds the overhead of having to remember that we always output this in its signed form. Let's remove that overhead. The signed form output is seemingly required since we have no way to represent the full range of uint64 in an SQL type. We use BIGINT in places like pg_stat_statements, which maps directly to int64. The release notes "Source Code" section may want to mention this adjustment as some extensions may wish to adjust their code. Author: David Rowley <dgrowleyml@gmail.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/50cb0c8b-994b-48f9-a1c4-13039eb3536b@eisentraut.org
* Avoid resource leaks when a dblink connection fails.Tom Lane2025-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If we hit out-of-memory between creating the PGconn and inserting it into dblink's hashtable, we'd lose track of the PGconn, which is quite bad since it represents a live connection to a remote DB. Fix by rearranging things so that we create the hashtable entry first. Also reduce the number of states we have to deal with by getting rid of the separately-allocated remoteConn object, instead allocating it in-line in the hashtable entries. (That incidentally removes a session-lifespan memory leak observed in the regression tests.) There is an apparently-irreducible remaining OOM hazard, which is that if the connection fails at the libpq level (ie it's CONNECTION_BAD) then we have to pstrdup the PGconn's error message before we can release it, and theoretically that could fail. However, in such cases we're only leaking memory not a live remote connection, so I'm not convinced that it's worth sweating over. This is a pretty low-probability failure mode of course, but losing a live connection seems bad enough to justify back-patching. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/1346940.1748381911@sss.pgh.pa.us Backpatch-through: 13
* Fix assertion failure in pg_prewarm() on objects without storage.Fujii Masao2025-05-29
| | | | | | | | | | | | | | | | | | | | | | | | An assertion test added in commit 049ef33 could fail when pg_prewarm() was called on objects without storage, such as partitioned tables. This resulted in the following failure in assert-enabled builds: Failed Assert("RelFileNumberIsValid(rlocator.relNumber)") Note that, in non-assert builds, pg_prewarm() just failed with an error in that case, so there was no ill effect in practice. This commit fixes the issue by having pg_prewarm() raise an error early if the specified object has no storage. This approach is similar to the fix in commit 4623d7144 for pg_freespacemap. Back-patched to v17, where the issue was introduced. Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/e082e6027610fd0a4091ae6d033aa117@oss.nttdata.com Backpatch-through: 17
* pg_stat_statements: Fix parameter number gaps in normalized queriesMichael Paquier2025-05-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pg_stat_statements anticipates that certain constant locations may be recorded multiple times and attempts to avoid calculating a length for these locations in fill_in_constant_lengths(). However, during generate_normalized_query() where normalized query strings are generated, these locations are not excluded from consideration. This could increment the parameter number counter for every recorded occurrence at such a location, leading to an incorrect normalization in certain cases with gaps in the numbers reported. For example, take this query: SELECT WHERE '1' IN ('2'::int, '3'::int::text) Before this commit, it would be normalized like that, with gaps in the parameter numbers: SELECT WHERE $1 IN ($3::int, $4::int::text) However the correct, less confusing one should be like that: SELECT WHERE $1 IN ($2::int, $3::int::text) This commit fixes the computation of the parameter numbers to track the number of constants replaced with an $n by a separate counter instead of the iterator used to loop through the list of locations. The underlying query IDs are not changed, neither are the normalized strings for existing PGSS hash entries. New entries with fresh normalized queries would automatically get reshaped based on the new parameter numbering. Issue discovered while discussing a separate problem for HEAD, but this affects all the stable branches. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com Backpatch-through: 13
* Revert "Don't lock partitions pruned by initial pruning"Amit Langote2025-05-22
| | | | | | | | | | | | | | | | As pointed out by Tom Lane, the patch introduced fragile and invasive design around plan invalidation handling when locking of prunable partitions was deferred from plancache.c to the executor. In particular, it violated assumptions about CachedPlan immutability and altered executor APIs in ways that are difficult to justify given the added complexity and overhead. This also removes the firstResultRels field added to PlannedStmt in commit 28317de72, which was intended to support deferred locking of certain ModifyTable result relations. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
* Fix regression with location calculation of nested statementsMichael Paquier2025-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The statement location calculated for some nested query cases was wrong when multiple queries are sent as a single string, these being separated by semicolons. As pointed by Sami Imseih, the location calculation was incorrect when the last query of nested statement with multiple queries does **NOT** finish with a semicolon for the last statement. In this case, the statement length tracked by RawStmt is 0, which is equivalent to say that the string should be used until its end. The code previously discarded this case entirely, causing the location to remain at 0, the same as pointing at the beginning of the string. This caused pg_stat_statements to store incorrect query strings. This issue has been introduced in 499edb09741b. I have looked at the diffs generated by pgaudit back then, and noticed the difference generated for this nested query case, but I have missed the point that it was an actual regression with an existing case. A test case is added in pg_stat_statements to provide some coverage, restoring the pre-17 behavior for the calculation of the query locations. Special thanks to David Steele, who, through an analysis of the test diffs generated by pgaudit with the new v18 logic, has poked me about the fact that my original analysis of the matter was wrong. The test output of pg_overexplain is updated to reflect the new logic, as the new locations refer to the beginning of the argument passed to the function explain_filter(). When the module was introduced in 8d5ceb113e3f, which was after 499edb09741b (for the new calculation method), the locations of the test were not actually right: the plan generated for the query string given in input of the function pointed to the top-level query, not the nested one. Reported-by: David Steele <david@pgbackrest.org> Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: David Steele <david@pgbackrest.org> Discussion: https://postgr.es/m/844a3b38-bbf1-4fb2-9fd6-f58c35c09917@pgbackrest.org
* Fix incorrect year in some copyright noticesMichael Paquier2025-05-19
| | | | | | | | | A couple of new files have been added in the tree with a copyright year of 2024 while we were already in 2025. These should be marked with 2025, so let's fix them. Reported-by: Shaik Mohammad Mujeeb <mujeeb.sk.dev@gmail.com> Discussion: https://postgr.es/m/CALa6HA4_Wu7-2PV0xv-Q84cT8eG7rTx6bdjUV0Pc=McAwkNMfQ@mail.gmail.com
* Use 'void *' for arbitrary buffers, 'uint8 *' for byte arraysHeikki Linnakangas2025-05-08
| | | | | | | | | | | | | A 'void *' argument suggests that the caller might pass an arbitrary struct, which is appropriate for functions like libc's read/write, or pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that have no structure, like the cancellation keys or SCRAM tokens. Some places used 'char *', but 'uint8 *' is better because 'char *' is commonly used for null-terminated strings. Change code around SCRAM, MD5 authentication, and cancellation key handling to follow these conventions. Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
* Suppress unnecessary explicit sorting for EPQ mergejoin pathRichard Guo2025-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building a ForeignPath for a joinrel, if there's a possibility that EvalPlanQual will be executed, we must identify a suitable path for EPQ checks. If the outer or inner path of the chosen path is a ForeignPath representing a pushed-down join, we replace it with its fdw_outerpath to ensure that the EPQ check path consists entirely of local joins. If the chosen path is a MergePath, and its outer or inner path is a ForeignPath that is not already well enough ordered, the MergePath will have non-NIL outersortkeys or innersortkeys indicating the desired ordering to be created by an explicit Sort node. If we then replace the outer or inner path with its corresponding fdw_outerpath, and that path is already sufficiently ordered, we end up in an inconsistent state: the MergePath has non-NIL outersortkeys or innersortkeys, and its input path is already properly ordered. This inconsistency can result in an Assert failure or the addition of a redundant Sort node. To fix, check if the new outer or inner path of a MergePath is already properly sorted, and set its outersortkeys or innersortkeys to NIL if so. Bug: #18902 Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18902-71c1bed2b9f7c46f@postgresql.org
* Fix whitespacePeter Eisentraut2025-05-07
|
* oauth: Disallow OAuth connections via postgres_fdw/dblinkJacob Champion2025-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | A subsequent commit will reclassify oauth_client_secret from dispchar="" to dispchar="*", so that UIs will treat it like a secret. For our FDWs, this change will move that option from SERVER to USER MAPPING, which we need to avoid. But upon further discussion, we don't really want our FDWs to use our builtin Device Authorization flow at all, for several reasons: - the URL and code would be printed to the server logs, not sent over the client connection - tokens are not cached/refreshed, so every single connection has to be manually authorized by a user with a browser - oauth_client_secret needs to belong to the foreign server, but options on SERVER are publicly accessible - all non-superusers would need password_required=false, which is dangerous Future OAuth work can use FDWs as a motivating use case. But for now, disallow all oauth_* connection options for these two extensions. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20250415191435.55.nmisch%40google.com
* Add maintenance_io_concurrency flag to some read stream usersMelanie Plageman2025-04-28
| | | | | | | | | | | Index vacuuming and [auto]prewarm AIO concurrency should be governed by maintenance_io_concurrency. As such, pass those read stream users the READ_STREAM_MAINTENANCE flag which will calculate their read stream distance with maintenance_io_concurrency instead of effective_io_concurrency. This was an oversight in the original commits making those operations use the read stream API. Discussion: https://postgr.es/m/flat/CAAKRu_aopDxTo4b41Mt_7Zc-z0_ngocrY8SFCCY6Aph1HgwuNw%40mail.gmail.com
* Fix xmin advancement during fast_forward decoding.Amit Kapila2025-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During logical decoding, we advance catalog_xmin of logical too early in fast_forward mode, resulting in required catalog data being removed by vacuum. This mode is normally used to advance the slot without processing the changes, but we still can't let the slot's xmin to advance to an incorrect value. Commit f49a80c481 fixed a similar issue where the logical slot's catalog_xmin was getting advanced prematurely during non-fast-forward mode. During xl_running_xacts processing, instead of directly advancing the slot's xmin to the oldest running xid in the record, it allowed the xmin to be held back for snapshots that can be used for not-yet-replayed transactions, as those might consider older txns as running too. However, it missed the fact that the same problem can happen during fast_forward mode decoding, as we won't build a base snapshot in that mode, and the future call to get_changes from the same slot can miss seeing the required catalog changes leading to incorrect reslts. This commit allows building the base snapshot even in fast_forward mode to prevent the early advancement of xmin. Reported-by: Amit Kapila <amit.kapila16@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Remove circular #include's between plpython.h and plpy_util.h.Tom Lane2025-04-27
| | | | | | | | | | | | | | | | | | | | | plpython.h included plpy_util.h, simply on the grounds that "it's easier to just include it everywhere". However, plpy_util.h must include plpython.h, or it won't pass headerscheck. While the resulting circularity doesn't have any immediate bad effect, it's poor design. We have seen serious messes arise in the past from overly-broad inclusion footprints created by such circularities, so let's establish a project policy against it. To fix, just replace *.c files' inclusions of plpython.h with plpy_util.h. They'll pull in plpython.h indirectly; indeed, almost all have already done so via inclusions of other plpy_xxx.h headers. (Any extensions using plpython.h can do likewise without breaking the compatibility of their code with prior Postgres versions.) Reported-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aAxQ6fcY5QQV1lo3@ip-10-97-1-34.eu-west-3.compute.internal
* Fix typo in test file name added in commit 4909b38af0.Amit Kapila2025-04-25
| | | | | | Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CANhcyEXsObdjkjxEnq10aJumDpa5J6aiPzgTh_w4KCWRYHLw6Q@mail.gmail.com
* Fix a few duplicate words in commentsDavid Rowley2025-04-21
| | | | | | | These are all new to v18 Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
* Be more wary of corrupt data in pageinspect's heap_page_items().Tom Lane2025-04-19
| | | | | | | | | | | | | | | | | The original intent in heap_page_items() was to return nulls, not throw an error or crash, if an item was sufficiently corrupt that we couldn't safely extract data from it. However, commit d6061f83a utterly missed that memo, and not only put in an un-length-checked copy of the tuple's data section, but also managed to break the check on sane nulls-bitmap length. Either mistake could possibly lead to a SIGSEGV crash if the tuple is corrupt. Bug: #18896 Reported-by: Dmitry Kovalenko <d.kovalenko@postgrespro.ru> Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18896-add267b8e06663e3@postgresql.org Backpatch-through: 13
* Fix typos and grammar in the codeMichael Paquier2025-04-19
| | | | | | | | The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
* Fix incorrect format placeholdersPeter Eisentraut2025-04-14
| | | | BlockNumber is unsigned int. Fix for commit 14ffaece0fb.
* Fix GIN's shimTriConsistentFn to not corrupt its input.Tom Lane2025-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0f21db36d made an assumption that GIN triConsistentFns would not modify their input entryRes[] arrays. But in fact, the "shim" triConsistentFn that we use for opclasses that don't supply their own did exactly that, potentially leading to wrong answers from a GIN index search. Through bad luck, none of the test cases that we have for such opclasses exposed the bug. One response to this could be that the assumption of consistency check functions not modifying entryRes[] arrays is a bad one, but it still seems reasonable to me. Notably, shimTriConsistentFn is itself assuming that with respect to the underlying boolean consistentFn, so it's sure being self-centered in supposing that it gets to do so. Fortunately, it's quite simple to fix shimTriConsistentFn to restore the entry-time state of entryRes[], so let's do that instead. This issue doesn't affect any core GIN opclasses, since they all supply their own triConsistentFns. It does affect contrib modules btree_gin, hstore, and intarray. Along the way, I (tgl) noticed that shimTriConsistentFn failed to pick up on a "recheck" flag returned by its first call to the boolean consistentFn. This may be only a latent problem, since it would be unlikely for a consistentFn to set recheck for the all-false case and not any other cases. (Indeed, none of our contrib modules do that.) Nonetheless, it's formally wrong. Reported-by: Vinod Sridharan <vsridh90@gmail.com> Author: Vinod Sridharan <vsridh90@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFMdLD7XzsXfi1+DpTqTgrD8XU0i2C99KuF=5VHLWjx4C1pkcg@mail.gmail.com Backpatch-through: 13
* Harmonize function parameter names for Postgres 18.Peter Geoghegan2025-04-12
| | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 18 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit 035ce1fe).
* Improve various new-to-v18 appendStringInfo callsDavid Rowley2025-04-11
| | | | | | | | | Similar to 8461424fd, here we adjust a few new locations which were not using the most suitable appendStringInfo* function for the intended purpose. Author: David Rowley <drowleyml@gmail.com Discussion: https://postgr.es/m/CAApHDvqJnNjueb=Eoj8K+8n0g7nj_AcPWSiCj5RNV4fDejAfqA@mail.gmail.com
* Fix data loss in logical replication.Amit Kapila2025-04-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Data loss can happen when the DDLs like ALTER PUBLICATION ... ADD TABLE ... or ALTER TYPE ... that don't take a strong lock on table happens concurrently to DMLs on the tables involved in the DDL. This happens because logical decoding doesn't distribute invalidations to concurrent transactions and those transactions use stale cache data to decode the changes. The problem becomes bigger because we keep using the stale cache even after those in-progress transactions are finished and skip the changes required to be sent to the client. This commit fixes the issue by distributing invalidation messages from catalog-modifying transactions to all concurrent in-progress transactions. This allows the necessary rebuild of the catalog cache when decoding new changes after concurrent DDL. We observed performance regression primarily during frequent execution of *publication DDL* statements that modify the published tables. The regression is minor or nearly nonexistent for DDLs that do not affect the published tables or occur infrequently, making this a worthwhile cost to resolve a longstanding data loss issue. An alternative approach considered was to take a strong lock on each affected table during publication modification. However, this would only address issues related to publication DDLs (but not the ALTER TYPE ...) and require locking every relation in the database for publications created as FOR ALL TABLES, which is impractical. The bug exists in all supported branches, but we are backpatching till 14. The fix for 13 requires somewhat bigger changes than this fix, so the fix for that branch is still under discussion. Reported-by: hubert depesz lubaczewski <depesz@depesz.com> Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Tested-by: Benoit Lobréau <benoit.lobreau@dalibo.com> Backpatch-through: 14 Discussion: https://postgr.es/m/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
* Cleanup of pg_numa.cTomas Vondra2025-04-09
| | | | | | | | | | | | | | | | | | | | | | | | | This moves/renames some of the functions defined in pg_numa.c: * pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and moved to src/backend/storage/ipc/shmem.c. The new name better reflects that the page size is not related to NUMA, and it's specifically about the page size used for the main shared memory segment. * move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into the backend (which more appropriate for functions callable from SQL). While at it, improve the comment to explain what page size it returns. * remove unnecessary includes from src/port/pg_numa.c, adding unnecessary dependencies (src/port should be suitable for frontent). These were either leftovers or unnecessary thanks to the other changes in this commit. This eliminates unnecessary dependencies on backend symbols, which we don't want in src/port. Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com
* Move contrib/spi testing from core regression tests to contrib/spi.Tom Lane2025-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's weird to have the core regression tests depending on contrib code, and coverage testing shows that those test queries add nothing to the core-code coverage of the core tests. So pull those test bits out and put them into ordinary test scripts inside contrib/spi/, making that more like other contrib modules. Aside from being structurally nicer, anything we can take out of the core tests (which are executed multiple times per check-world run) and put into tests executed only once should be a win. It doesn't look like this change will buy a whole lot of milliseconds, but a cycle saved is a cycle earned. Also, there is some discussion around possibly removing refint and/or autoinc altogether. I don't know if that will happen, but we'd certainly need to decouple them from the core tests to do so. The tests for autoinc were quite intertwined with the undocumented "ttdummy" trigger in regress.c. That made the tests very hard to understand and contributed nothing to autoinc's testing either. So I just deleted ttdummy and rewrote the autoinc tests without it. I realized while doing this that the description of autoinc in the SGML docs is not a great description of what the function actually does, so the patch includes some updates to those docs. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/3872677.1744077559@sss.pgh.pa.us
* Fix incorrect format placeholderPeter Eisentraut2025-04-08
| | | | for commit 749a9e20c97
* pg_buffercache: Change page_num type to bigintTomas Vondra2025-04-08
| | | | | | | | | | | The page_num was defined as integer, which should be sufficient for the near future (with 4K pages it's 8TB). But it's virtually free to return bigint, and get a wider range. This was agreed on the thread, but I forgot to tweak this in ba2a3c2302f1. While at it, make the data types in CREATE VIEW a bit more consistent. Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.co
* Add pg_buffercache_evict_{relation,all} functionsAndres Freund2025-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In addition to the added functions, the pg_buffercache_evict() function now shows whether the buffer was flushed. pg_buffercache_evict_relation(): Evicts all shared buffers in a relation at once. pg_buffercache_evict_all(): Evicts all shared buffers at once. Both functions provide mechanism to evict multiple shared buffers at once. They are designed to address the inefficiency of repeatedly calling pg_buffercache_evict() for each individual buffer, which can be time-consuming when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for 16GB of fully populated shared buffers). These functions are intended for developer testing and debugging purposes and are available to superusers only. Minimal tests for the new functions are included. Also, there was no test for pg_buffercache_evict(), test for this added too. No new extension version is needed, as it was already increased this release by ba2a3c2302f. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Joseph Koshakow <koshy44@gmail.com> Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com
* Speedup child EquivalenceMember lookup in plannerDavid Rowley2025-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When planning queries to partitioned tables, we clone all EquivalenceMembers belonging to the partitioned table into em_is_child EquivalenceMembers for each non-pruned partition. For partitioned tables with large numbers of partitions, this meant the ec_members list could become large and code searching that list would become slow. Effectively, the more partitions which were present, the more searches needed to be performed for operations such as find_ec_member_matching_expr() during create_plan() and the more partitions present, the longer these searches would take, i.e., a quadratic slowdown. To fix this, here we adjust how we store EquivalenceMembers for em_is_child members. Instead of storing these directly in ec_members, these are now stored in a new array of Lists in the EquivalenceClass, which is indexed by the relid. When we want to find EquivalenceMembers belonging to a certain child relation, we can narrow the search to the array element for that relation. To make EquivalenceMember lookup easier and to reduce the amount of code change, this commit provides a pair of functions to allow iteration over the EquivalenceMembers of an EC which also handles finding the child members, if required. Callers that never need to look at child members can remain using the foreach loop over ec_members, which will now often be faster due to only parent-level members being stored there. The actual performance increases here are highly dependent on the number of partitions and the query being planned. Performance increases can be visible with as few as 8 partitions, but the speedup is marginal for such low numbers of partitions. The speedups become much more visible with a few dozen to hundreds of partitions. With some tested queries using 56 partitions, the planner was around 3x faster than before. For use cases with thousands of partitions, these are likely to become significantly faster. Some testing has shown planner speedups of 60x or more with 8192 partitions. Author: Yuya Watari <watari.yuya@gmail.com> Co-authored-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Tested-by: Thom Brown <thom@linux.com> Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com> Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
* Add pg_buffercache_numa view with NUMA node infoTomas Vondra2025-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduces a new view pg_buffercache_numa, showing NUMA memory nodes for individual buffers. For each buffer the view returns an entry for each memory page, with the associated NUMA node. The database blocks and OS memory pages may have different size - the default block size is 8KB, while the memory page is 4K (on x86). But other combinations are possible, depending on configure parameters, platform, etc. This means buffers may overlap with multiple memory pages, each associated with a different NUMA node. To determine the NUMA node for a buffer, we first need to touch the memory pages using pg_numa_touch_mem_if_required, otherwise we might get status -2 (ENOENT = The page is not present), indicating the page is either unmapped or unallocated. The view may be relatively expensive, especially when accessed for the first time in a backend, as it touches all memory pages to get reliable information about the NUMA node. This may also force allocation of the shared memory. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
* Fix some issues in contrib/spi/refint.c.Tom Lane2025-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | check_foreign_key incorrectly used a single cache entry for its saved plans for a 'c' (cascade) trigger, although there are two different queries to execute depending on whether it fires for an update or a delete. This caused the wrong things to be done if both types of event occur in one session. (This was indeed visible in the triggers regression test, but apparently nobody ever questioned it.) To fix, add the operation type to the cache key. Its debug log output failed to distinguish update from delete events, too. Also, change the intended trigger usage from BEFORE ROW to AFTER ROW, and add checks insisting on that usage. BEFORE is really rather unsafe, since if there are other BEFORE triggers they might change or cancel the operation we are trying to check. AFTER triggers are the standard way to propagate changes to other rows, so we should follow that way here. In passing, remove a useless duplicate lookup of the cache entry. This code is mostly intended as a documentation example, so we won't consider a back-patch. Author: Dmitrii Bondar <d.bondar@postgrespro.ru> Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Lilian Ontowhee <ontowhee@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/79755a2b18ed4fe5e29da6a87a1e00d1@postgrespro.ru