aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* Print out error position for CREATE DOMAINMichael Paquier2024-12-16
| | | | | | | | | | | | | This is simply done by pushing down the ParseState available in ProcessUtility() to DefineDomain(), giving more information about the position of an error when running a CREATE DOMAIN query. Most of the queries impacted by this change have been added previously in 0172b4c9449e. Author: Kirill Reshke, Jian He Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
* Add some tests for encoding conversion in COPY TO/FROMMichael Paquier2024-12-16
| | | | | | | | | | | | | | | This adds a couple of tests to trigger encoding conversion when input and server encodings do not match in COPY FROM/TO, or need_transcoding set to true in the COPY state data. These tests rely on UTF8 <-> LATIN1 for the valid cases as LATIN1 accepts any bytes, and UTF8 <-> EUC_JP for some of the invalid cases where a character cannot be understood, causing a conversion failure. Both ENCODING and client_encoding are covered. Test suggested by Andres Freund. Author: Sutou Kouhei Discussion: https://postgr.es/m/20240206222445.hzq22pb2nye7rm67@awork3.anarazel.de
* Declare a couple of variables inside not outside a PG_TRY block.Tom Lane2024-12-15
| | | | | | | | | | | | | I went through the buildfarm's reports of "warning: variable 'foo' might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]". As usual, none of them are live problems according to my understanding of the effects of setjmp/longjmp, to wit that local variables might revert to their values as of PG_TRY entry, due to being kept in registers. But I did happen to notice that XmlTableGetValue's "cstr" variable doesn't need to be declared outside the PG_TRY block at all (thus giving further proof that the -Wclobbered warning has little connection to real problems). We might as well move it inside, and "cur" too, in hopes of eliminating one of the bogus warnings.
* pgbench: fix misprocessing of some nested \if constructs.Tom Lane2024-12-15
| | | | | | | | | | | | | | An \if command appearing within a false (not-to-be-executed) \if branch was incorrectly treated the same as \elif. This could allow statements within the inner \if to be executed when they should not be. Also the missing inner \if stack entry would result in an assertion failure (in assert-enabled builds) when the final \endif is reached. Report and patch by Michail Nikolaev. Back-patch to all supported branches. Discussion: https://postgr.es/m/CANtu0oiA1ke=SP6tauhNqkUdv5QFsJtS1p=aOOf_iU+EhyKkjQ@mail.gmail.com
* Refactor some SQL/JSON error messagesÁlvaro Herrera2024-12-14
| | | | | Turn type names into "%s" specifiers to 1) avoid getting them translated and 2) reduce the total number of messages.
* Fix warnings about declaration of environ on MinGW.Thomas Munro2024-12-15
| | | | | | | | | | | | | | | | | | POSIX says that the global variable environ shouldn't be declared in a header, and that you have to declare it yourself. MinGW declares it in <stdlib.h> with some macrology that messes up our declarations. Visual Studio doesn't warn (there are clues that it may also declare it, but if so, apparently compatibly). Suppress our declarations, on MinGW only. This clears the last warnings on CI's optional MinGW task, and hopefully on build farm animal fairywren too. Like 1319997d, no back-patch for now as it's not known to be breaking anything, and my humble goal is just to keep the MinGW build clean going forward. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJLMh%2B6W5E4M_jSFb43gnrA_-Q6-%2BBf3HkBXyGfRFcBsQ%40mail.gmail.com
* Remove EXTENSION_DONT_CHECK_SIZE from md.c.Thomas Munro2024-12-14
| | | | | | | | | | | | Commits 7bb3102c and 3eb77eba removed the only user of the EXTENSION_DONT_CHECK_SIZE flag, which had previously been required to checkpoint truncated relations. Since 7bb3102c, segments have been opened directly for synchronization without calling _mdfd_getseg(), so it doesn't need a mode that tolerates non-final short segments. Remove the redundant flag and associated comments. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/nyj4k7yur5t27rtygvx2i2lrlp6rqfvvhoiiwx4fznynksf2et%404hj2sp42alpe
* Fix typoJohn Naylor2024-12-14
| | | | | | Ryo Kanbayashi Discussion: https://postgr.es/m/CANOn0ExEQiPVrzkjULkENVac_n4Lknm6dxsU69MSncQap0kJVA%40mail.gmail.com
* Fix possible crash in pg_dump with identity sequences.Tom Lane2024-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an owned sequence is considered interesting, force its owning table to be marked interesting too. This ensures, in particular, that we'll fetch the owning table's column names so we have the data needed for ALTER TABLE ... ADD GENERATED. Previously there were edge cases where pg_dump could get SIGSEGV due to not having filled in the column names. (The known case is where the owning table has been made part of an extension while its identity sequence is not a member; but there may be others.) Also, if it's an identity sequence, force its dumped-components mask to exactly match the owning table: dump definition only if we're dumping the table's definition, dump data only if we're dumping the table's data, etc. This generalizes the code introduced in commit b965f2617 that set the sequence's dump mask to NONE if the owning table's mask is NONE. That's insufficient to prevent failures, because for example the table's mask might only request dumping ACLs, which would lead us to still emit ALTER TABLE ADD GENERATED even though we didn't create the table. It seems better to treat an identity sequence as though it were an inseparable aspect of the table, matching the treatment used in the backend's dependency logic. Perhaps this policy needs additional refinement, but let's wait to see some field use-cases before changing it further. While here, add a comment in pg_dump.h warning against writing tests like "if (dobj->dump == DUMP_COMPONENT_NONE)", which was a bug in this case. There is one other example in getPublicationNamespaces, which if it's not a bug is at least remarkably unclear and under-documented. Changing that requires a separate discussion, however. Per report from Artur Zakirov. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAKNkYnwXFBf136=u9UqUxFUVagevLQJ=zGd5BsLhCsatDvQsKQ@mail.gmail.com
* Rewrite maybe_reread_subscription() commentÁlvaro Herrera2024-12-13
| | | | One sentence was gramatically wrong, but also too terse. Expand on it.
* Dump not-null constraints on inherited columns correctlyÁlvaro Herrera2024-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | With not-null constraints defined in child tables for columns that are coming from their parent tables, we were printing ALTER TABLE SET NOT NULL commands that were missing the constraint name, so the original constraint name was being lost, which is bogus. Fix by instead adding a table-constraint constraint declaration with the correct constraint name in the CREATE TABLE instead. Oversight in commit 14e87ffa5c54. We could have fixed it by changing the ALTER TABLE SET NOT NULL to ALTER TABLE ADD CONSTRAINT, but I'm not sure that's any better. A potential problem here might be that if sent to a non-Postgres server, the new pg_dump output would fail because the "CONSTRAINT foo NOT NULL colname" syntax isn't SQL-conforming. However, Postgres' implementation of inheritance is already non-SQL-conforming, so that'd likely fail anyway. This problem was only noticed by Ashutosh's proposed test framework for pg_dump, https://postgr.es/m/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2Wf61PT+hfquzjBqouRzQJQ@mail.gmail.com Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CAExHW5tbdgAKDfqjDJ-7Fk6PJtHg8D4zUF6FQ4H2Pq8zK38Nyw@mail.gmail.com
* Revert "Don't truncate database and user names in startup packets."Nathan Bossart2024-12-12
| | | | | | | | | | | | | | | This reverts commit 562bee0fc13dc95710b8db6a48edad2f3d052f2e. We received a report from the field about this change in behavior, so it seems best to revert this commit and to add proper multibyte-aware truncation as a follow-up exercise. Fixes bug #18711. Reported-by: Adam Rauch Reviewed-by: Tom Lane, Bertrand Drouvot, Bruce Momjian, Thomas Munro Discussion: https://postgr.es/m/18711-7503ee3e449d2c47%40postgresql.org Backpatch-through: 17
* Adjust some comments about structure properties in pg_stat.hMichael Paquier2024-12-12
| | | | | | | | | | | | | | | | | | One comment of PgStat_TableCounts mentioned that its pending stats use memcmp() to check for the all-zero case if there is any activity. This is not true since 07e9e28b56, as pg_memory_is_all_zeros() is used. PgStat_FunctionCounts incorrectly documented that it relied on memcpy(). This has never been correct, and not relevant because function statistics do not have an all-zero check for pending stats. Checkpoint and bgwriter statistics have been always relying on memcmp() or pg_memory_is_all_zeros() (since 07e9e28b56 for the latter), and never mentioned the dependency on event counters for their all-zero checks. Let's document these properties, like the table statistics. Author: Bertrand Drouvot Discussion: https://postgr.es/m/Z1hNLvcPgVLPxCoc@ip-10-97-1-34.eu-west-3.compute.internal
* Detect redundant GROUP BY columns using UNIQUE indexesDavid Rowley2024-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | d4c3a156c added support that when the GROUP BY contained all of the columns belonging to a relation's PRIMARY KEY, all other columns belonging to that relation would be removed from the GROUP BY clause. That's possible because all other columns are functionally dependent on the PRIMARY KEY and those columns alone ensure the groups are distinct. Here we expand on that optimization and allow it to work for any unique indexes on the table rather than just the PRIMARY KEY index. This normally requires that all columns in the index are defined with NOT NULL, however, we can relax that requirement when the index is defined with NULLS NOT DISTINCT. When there are multiple suitable indexes to allow columns to be removed, we prefer the index with the least number of columns as this allows us to remove the highest number of GROUP BY columns. One day, we may want to revisit that decision as it may make more sense to use the narrower set of columns in terms of the width of the data types and stored/queried data. This also adjusts the code to make use of RelOptInfo.indexlist rather than looking up the catalog tables. In passing, add another short-circuit path to allow bailing out earlier in cases where it's certainly not possible to remove redundant GROUP BY columns. This early exit is now cheaper to do than when this code was originally written as 00b41463c made it cheaper to check for empty Bitmapsets. Patch originally by Zhang Mingli and later worked on by jian he, but after I (David) worked on it, there was very little of the original left. Author: Zhang Mingli, jian he, David Rowley Reviewed-by: jian he, Andrei Lepikhov Discussion: https://postgr.es/m/327990c8-b9b2-4b0c-bffb-462249f82de0%40Spark
* Improve the test case from 5668a857dRichard Guo2024-12-12
| | | | | | | | | | | | | | In commit 5668a857d, we fixed an issue with incorrect results in right semi joins and introduced a test case to verify the fix. The test case involves SubPlans and InitPlans, which may not be immediately apparent in relation to the issue we addressed. This patch simplifies the test case with a more straightforward query. Per discussion with Melanie Plageman. Author: Richard Guo Discussion: https://postgr.es/m/CAAKRu_a-Cip2XCXp13fmxq+T9BhLAVApHTyjr94awL2mbXHC-Q@mail.gmail.com
* Add some regression tests for missing DDL patternsMichael Paquier2024-12-12
| | | | | | | | | | | | | | | | | The following commands gain increased coverage for some of the errors they can trigger: - ALTER TABLE .. ALTER COLUMN - CREATE DOMAIN - CREATE TYPE (LIKE) This has come up while discussing the possibility to add more information about the location of the error in such queries, and it is useful on its own as there was no coverage until now for the patterns added in this commit. Author: Jian He, Kirill Reshke Reviewed-By: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
* Defer remove_useless_groupby_columns() work until query_planner()David Rowley2024-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally, remove_useless_groupby_columns() was called during grouping_planner() directly after the call to preprocess_groupclause(). While in many ways, it made sense to populate the field and remove the functionally dependent columns from processed_groupClause at the same time, it's just that doing so had the disadvantage that remove_useless_groupby_columns() was being called before the RelOptInfos were populated for the relations mentioned in the query. Not having RelOptInfos available meant we needed to manually query the catalog tables to get the required details about the primary key constraint for the table. Here we move the remove_useless_groupby_columns() call to query_planner() and put it directly after the RelOptInfos are populated. This is fine to do as processed_groupClause still isn't final at this point as it can still be modified inside standard_qp_callback() by make_pathkeys_for_sortclauses_extended(). This commit is just a refactor and simply moves remove_useless_groupby_columns() into initsplan.c. A planned follow-up commit will adjust that function so it uses RelOptInfo instead of doing catalog lookups and also teach it how to use unique indexes as proofs to expand the cases where we can remove functionally dependent columns from the GROUP BY. Reviewed-by: Andrei Lepikhov, jian he Discussion: https://postgr.es/m/CAApHDvqLezKwoEBBQd0dp4Y9MDkFBDbny0f3SzEeqOFoU7Z5+A@mail.gmail.com
* Add UUID version 7 generation function.Masahiko Sawada2024-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces the uuidv7() SQL function, which generates UUID version 7 as specified in RFC 9652. UUIDv7 combines a Unix timestamp in milliseconds and random bits, offering both uniqueness and sortability. In our implementation, the 12-bit sub-millisecond timestamp fraction is stored immediately after the timestamp, in the space referred to as "rand_a" in the RFC. This ensures additional monotonicity within a millisecond. The rand_a bits also function as a counter. We select a sub-millisecond timestamp so that it monotonically increases for generated UUIDs within the same backend, even when the system clock goes backward or when generating UUIDs at very high frequency. Therefore, the monotonicity of generated UUIDs is ensured within the same backend. This commit also expands the uuid_extract_timestamp() function to support UUID version 7. Additionally, an alias uuidv4() is added for the existing gen_random_uuid() SQL function to maintain consistency. Bump catalog version. Author: Andrey Borodin Reviewed-by: Sergey Prokhorenko, Przemysław Sztoch, Nikolay Samokhvalov Reviewed-by: Peter Eisentraut, Jelte Fennema-Nio, Aleksander Alekseev Reviewed-by: Masahiko Sawada, Lukas Fittl, Michael Paquier, Japin Li Reviewed-by: Marcos Pegoraro, Junwang Zhao, Stepan Neretin Reviewed-by: Daniel Vérité Discussion: https://postgr.es/m/CAAhFRxitJv%3DyoGnXUgeLB_O%2BM7J2BJAmb5jqAT9gZ3bij3uLDA%40mail.gmail.com
* Use pg_memory_is_all_zeros() in pgstatfuncs.c.Nathan Bossart2024-12-11
| | | | | | | | | | | | | There are a few places in this file that use memset() and memcmp() to determine whether a section of memory is all zeros. This commit modifies them to use pg_memory_is_all_zeros() instead. These aren't expected to be hot code paths, but this may optimize them a bit. Plus, this allows us to remove some variables that were only needed for the memset() and memcmp(). Author: Bertrand Drouvot Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/Z1hNubHfvMxlW6eu%40ip-10-97-1-34.eu-west-3.compute.internal
* Unmark gen_random_uuid() function leakproof.Masahiko Sawada2024-12-11
| | | | | | | | | | | | | The functions without arguments don't need to be marked leakproof. This commit unmarks gen_random_uuid() leakproof for consistency with upcoming UUID generation functions. Also, this commit adds a regression test to prevent reintroducing such cases. Bump catalog version. Reported-by: Peter Eisentraut Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAD21AoBE1ePPWY1NQEgk3DkqjYzLPZwYTzCySHm0e%2B9a69PfZw%40mail.gmail.com
* Fix a memory leak in dumping functions with TRANSFORMsDaniel Gustafsson2024-12-11
| | | | | | | | | | | | | The gneration of the dump clause for functions with TRANSFORM calls would leak the memory for holding the result of the Oid array parsing. Fix by freeing. While in there, switch to using pg_malloc instead of palloc in order to be consistent with the rest of the file. Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/baf1ae4511288e5b421f41e79a3df1a0@postgrespro.ru
* Add missing BUFFERS OFF in regression tests, take 2David Rowley2024-12-11
| | | | | | | | Similar to 9fa1aaa65, but running with -D RELCACHE_FORCE_RELEASE and -D CATCACHE_FORCE_RELEASE yielded some additional missing places that needed BUFFERS OFF. Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
* Add missing BUFFERS OFF in select_into regression testsDavid Rowley2024-12-11
| | | | | | | | | | | c2a4078eb adjusted EXPLAIN ANALYZE to include BUFFERS by default, but a few tests in select_into.sql neglected to add BUFFERS OFF. The failing tests seem unlikely to ever access buffers during execution, but they certainly could during planning. Per buildfarm member kestrel, tayra and calliphoridae. Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
* Enable BUFFERS with EXPLAIN ANALYZE by defaultDavid Rowley2024-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option has come up a few times over the past few years. In many ways, doing this seems like a good idea as it may be more obvious to users why a given query is running more slowly than they might expect. Also, from my own (David's) personal experience, I've seen users posting to the mailing lists with two identical plans, one slow and one fast asking why their query is sometimes slow. In many cases, this is due to additional reads. Having BUFFERS on by default may help reduce some of these questions, and if not, make it more obvious to the user before they post, or save a round-trip to the mailing list when additional I/O effort is the cause of the slowness. The general consensus is that we want BUFFERS on by default with ANALYZE. However, there were more than zero concerns raised with doing so. The primary reason against is the additional verbosity, making it harder to read large plans. Another concern was that buffer information isn't always useful so may not make sense to have it on by default. It's currently December, so let's commit this to see if anyone comes forward with a strong objection against making this change. We have over half a year remaining in the v18 cycle where we could still easily consider reverting this if someone were to come forward with a convincing enough reason as to why doing this is a bad idea. There were two patches independently submitted to achieve this goal, one by me and the other by Guillaume. This commit is a mix of both of these patches with some additional work done by me to adjust various additional places in the documentation which include EXPLAIN ANALYZE output. Author: Guillaume Lelarge, David Rowley Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
* Use ExprStates for hashing in GROUP BY and SubPlansDavid Rowley2024-12-11
| | | | | | | | | | | | | | | | | | | | | | This speeds up obtaining hash values for GROUP BY and hashed SubPlans by using the ExprState support for hashing, thus allowing JIT compilation for obtaining hash values for these operations. This, even without JIT compilation, has been shown to improve Hash Aggregate performance in some cases by around 15% and hashed NOT IN queries in one case by over 30%, however, real-world cases are likely to see smaller gains as the test cases used were purposefully designed to have high hashing overheads by keeping the hash table small to prevent additional memory overheads that would be a factor when working with large hash tables. In passing, fix a hypothetical bug in ExecBuildHash32Expr() so that the initial value is stored directly in the ExprState's result field if there are no expressions to hash. None of the current users of this function use an initial value, so the bug is only hypothetical. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CAApHDvpYSO3kc9UryMevWqthTBrxgfd9djiAjKHMPUSQeX9vdQ@mail.gmail.com
* Use in-place updates for pg_restore_relation_stats().Jeff Davis2024-12-10
| | | | | | | | This matches the behavior of vac_update_relstats(), which is important to avoid bloating pg_class. Author: Corey Huinker Discussion: https://postgr.es/m/CADkLM=fc3je+ufv3gsHqjjSSf+t8674RXpuXW62EL55MUEQd-g@mail.gmail.com
* Improve reporting of pg_upgrade log files on test failureMichael Paquier2024-12-11
| | | | | | | | | | | | | | | | | | | | | | | On failure, the pg_upgrade log files are automatically appended to the test log file, but the information reported was inconsistent. A header, with the log file name, was reported with note(), while the log contents and a footer used print(), making it harder to diagnose failures when these are split into console output and test log file because the pg_upgrade log file path in the header may not be included in the test log file. The output is now consolidated so as the header uses print() rather than note(). An extra note() is added to inform that the contents of a pg_upgrade log file are appended to the test log file. The diffs from the regression test suite and dump files all use print() to show their contents on failure. Author: Joel Jacobson Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/49f7e64a-b9be-4a90-a9fe-210a7740405e@app.fastmail.com Backpatch-through: 15
* Speedup Hash Joins with dedicated functions for ExprState hashingDavid Rowley2024-12-11
| | | | | | | | | | | | | | | | | | | Hashing of a single Var is a very common operation for ExprState to perform. Here we add dedicated ExecJust* functions which helps speed up Hash Joins by removing the interpretation overhead in ExecInterpExpr(). This change currently only affects Hash Joins on a single column. Hash Joins with multiple join keys or an expression still run through ExecInterpExpr(). Some testing has shown up to 10% query performance increases on recent AMD hardware and nearly 7% increase on an Apple M2 for a query performing a hash join with a large number of lookups on a small hash table. This change was extracted from a larger patch which adjusts GROUP BY / hashed subplans / hashed set operations to use ExprState hashing. Discussion: https://postgr.es/m/CAApHDvr8Zc0ZgzVoCZLdHGOFNhiJeQ6vrUcS9V7N23zMWQb-eA@mail.gmail.com
* Doc: add some commentary about ExecutorRun's NoMovement special case.Tom Lane2024-12-10
| | | | | | | | | | | Robert Haas expressed concern about whether commit 3eea7a0c9 exposed the parallel-execution machinery to a case it isn't tested for, namely a second non-parallel execution of a plan after a parallel execution. Investigation shows that that can't happen because of pquery.c's manipulation of the scan direction, but it sure wasn't obvious to start with. Add some commentary about that. Discussion: https://postgr.es/m/CA+TgmoagyKQy=HFw+wLo0AKTYWHui+iKswZ8Jnqqd-cFby-WVg@mail.gmail.com
* Fix elog(FATAL) before PostmasterMain() or just after fork().Noah Misch2024-12-10
| | | | | | | | | | Since commit 97550c0711972a9856b5db751539bbaf2f88884c, these failed with "PANIC: proc_exit() called in child process" due to uninitialized or stale MyProcPid. That was reachable if close() failed in ClosePostmasterPorts() or setlocale(category, "C") failed, both unlikely. Back-patch to v13 (all supported versions). Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
* Tests for logical replication with temporal keysPeter Eisentraut2024-12-10
| | | | | | | | | This covers some cases that were previously failing before the "Support for GiST in get_equal_strategy_number()" patch. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Support for GiST in get_equal_strategy_number()Peter Eisentraut2024-12-10
| | | | | | | | | | | | | A WITHOUT OVERLAPS primary key or unique constraint is accepted as a REPLICA IDENTITY, since it guarantees uniqueness. But subscribers applying logical decoding messages would fail because there was not support for looking up the equals operator for a gist index. This fixes that: For GiST indexes we can use the stratnum GiST support function. Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Make the conditions in IsIndexUsableForReplicaIdentityFull() more explicitPeter Eisentraut2024-12-10
| | | | | | | | | | | | | | | | IsIndexUsableForReplicaIdentityFull() described a number of conditions that a suitable index has to fulfill. But not all of these were actually checked in the code. Instead, it appeared to rely on get_equal_strategy_number() to filter out any indexes that are not btree or hash. As we look to generalize index AM capabilities, this would possibly break if we added additional support in get_equal_strategy_number(). Instead, write out code to check for the required capabilities explicitly. This shouldn't change any behaviors at the moment. Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Replace get_equal_strategy_number_for_am() by get_equal_strategy_number()Peter Eisentraut2024-12-10
| | | | | | | | | | | | | | | | | get_equal_strategy_number_for_am() gets the equal strategy number for an AM. This currently only supports btree and hash. In the more general case, this also depends on the operator class (see for example GistTranslateStratnum()). To support that, replace this function with get_equal_strategy_number() that takes an opclass and derives it from there. (This function already existed before as a static function, so the signature is kept for simplicity.) This patch is only a refactoring, it doesn't add support for other index AMs such as gist. This will be done separately. Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Improve internal logical replication error for missing equality strategyPeter Eisentraut2024-12-10
| | | | | | | | | | | This "shouldn't happen", except right now it can with a temporal gist index (to be fixed soon), because of missing gist support in get_equal_strategy_number(). But right now, the error is not caught right away, but instead you get the subsequent error about a "missing operator 0". This makes the error more accurate. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Fix comments of GUC hooks for timezone_abbreviationsMichael Paquier2024-12-10
| | | | | | | | | | | | The GUC assign and check hooks used "assign_timezone_abbreviations", which was incorrect. Issue noticed while browsing this area of the code, introduced in 0a20ff54f5e6. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/Z1eV6Y8yk77GZhZI@paquier.xyz Backpatch-through: 16
* Fix outdated comment of scram_build_secret()Michael Paquier2024-12-10
| | | | | | | | | | | | | | This routine documented that "iterations" would use a default value if set to 0 by the caller. However, the iteration should always be set by the caller to a value strictly more than 0, as documented by an assertion. Oversight in b577743000cd, that has made the iteration count of SCRAM configurable. Author: Matheus Alcantara Discussion: https://postgr.es/m/ac858943-4743-44cd-b4ad-08a0c10cbbc8@gmail.com Backpatch-through: 16
* Include necessary header files in radixtree.h.Masahiko Sawada2024-12-09
| | | | | | | | | | | | | | | When #include'ing radixtree.h with RT_SHMEM, it could happen to raise compiler errors due to missing some declarations of types and functions. This commit also removes the inclusion of postgres.h since it's against our usual convention. Backpatch to v17, where radixtree.h was introduced. Reviewed-by: Heikki Linnakangas, Álvaro Herrera Discussion: https://postgr.es/m/CAD21AoCU9YH%2Bb9Rr8YRw7UjmB%3D1zh8GKQkWNiuN9mVhMvkyrRg%40mail.gmail.com Backpatch-through: 17
* Fix small memory leaks in GUC checksDaniel Gustafsson2024-12-09
| | | | | | | | | | | | Follow-up commit to a9d58bfe8a3a. Backpatch down to v16 where this was added in order to keep the code consistent for future backpatches. Author: Tofig Aliev <t.aliev@postgrespro.ru> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/bba4313fdde9db46db279f96f3b748b1@postgrespro.ru Backpatch-through: 16
* Fix various overflow hazards in date and timestamp functions.Nathan Bossart2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | This commit makes use of the overflow-aware routines in int.h to fix a variety of reported overflow bugs in the date and timestamp code. It seems unlikely that this fixes all such bugs in this area, but since the problems seem limited to cases that are far beyond any realistic usage, I'm not going to worry too much. Note that for one bug, I've chosen to simply add a comment about the overflow hazard because fixing it would require quite a bit of code restructuring that doesn't seem worth the risk. Since this is a bug fix, it could be back-patched, but given the risk of conflicts with the new routines in int.h and the overall risk/reward ratio of this patch, I've opted not to do so for now. Fixes bug #18585 (except for the one case that's just commented). Reported-by: Alexander Lakhin Author: Matthew Kim, Nathan Bossart Reviewed-by: Joseph Koshakow, Jian He Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com Discussion: https://postgr.es/m/18585-db646741dd649abd%40postgresql.org
* Simplify executor's determination of whether to use parallelism.Tom Lane2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our parallel-mode code only works when we are executing a query in full, so ExecutePlan must disable parallel mode when it is asked to do partial execution. The previous logic for this involved passing down a flag (variously named execute_once or run_once) from callers of ExecutorRun or PortalRun. This is overcomplicated, and unsurprisingly some of the callers didn't get it right, since it requires keeping state that not all of them have handy; not to mention that the requirements for it were undocumented. That led to assertion failures in some corner cases. The only state we really need for this is the existing QueryDesc.already_executed flag, so let's just put all the responsibility in ExecutePlan. (It could have been done in ExecutorRun too, leading to a slightly shorter patch -- but if there's ever more than one caller of ExecutePlan, it seems better to have this logic in the subroutine than the callers.) This makes those ExecutorRun/PortalRun parameters unnecessary. In master it seems okay to just remove them, returning the API for those functions to what it was before parallelism. Such an API break is clearly not okay in stable branches, but for them we can just leave the parameters in place after documenting that they do nothing. Per report from Yugo Nagata, who also reviewed and tested this patch. Back-patch to all supported branches. Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
* Remove remants of "snapshot too old"Heikki Linnakangas2024-12-09
| | | | | | | | | | | | | | | | | Remove the 'whenTaken' and 'lsn' fields from SnapshotData. After the removal of the "snapshot too old" feature, they were never set to a non-zero value. This largely reverts commit 3e2f3c2e423, which added the OldestActiveSnapshot tracking, and the init_toast_snapshot() function. That was only required for setting the 'whenTaken' and 'lsn' fields. SnapshotToast is now a constant again, like SnapshotSelf and SnapshotAny. I kept a thin get_toast_snapshot() wrapper around SnapshotToast though, to check that you have a registered or active snapshot. That's still a useful sanity check. Reviewed-by: Nathan Bossart, Andres Freund, Tom Lane Discussion: https://www.postgresql.org/message-id/cd4b4f8c-e63a-41c0-95f6-6e6cd9b83f6d@iki.fi
* Avoid unnecessary wrapping for Vars and PHVsRichard Guo2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | When pulling up a lateral subquery that is under an outer join, the current code always wraps a Var or PHV in the subquery's targetlist into a new PlaceHolderVar if it is a lateral reference to something outside the subquery. This is necessary when the Var/PHV references the non-nullable side of the outer join from the nullable side: we need to ensure that it is evaluated at the right place and hence is forced to null when the outer join should do so. However, if the referenced rel is under the same lowest nulling outer join, we can actually omit the wrapping. That's safe because if the subquery variable is forced to NULL by the outer join, the lateral reference variable will come out as NULL too. It could be beneficial to get rid of such PHVs because they imply lateral dependencies, which force us to resort to nestloop joins. This patch leverages the newly introduced nullingrel_info to check if the nullingrels of the subquery RTE are a subset of those of the laterally referenced rel, in order to determine if the referenced rel is under the same lowest nulling outer join. No backpatch as this could result in plan changes. Author: Richard Guo Reviewed-by: James Coleman, Dmitry Dolgov, Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48uk6C7Z9m_FNT8_21CMCk68hrgAsz=z6zpP1PNZMkeoQ@mail.gmail.com
* Fix right-semi-joins in HashJoin rescansRichard Guo2024-12-09
| | | | | | | | | | | | | | | | | When resetting a HashJoin node for rescans, if it is a single-batch join and there are no parameter changes for the inner subnode, we can just reuse the existing hash table without rebuilding it. However, for join types that depend on the inner-tuple match flags in the hash table, we need to reset these match flags to avoid incorrect results. This applies to right, right-anti, right-semi, and full joins. When I introduced "Right Semi Join" plan shapes in aa86129e1, I failed to reset the match flags in the hash table for right-semi joins in rescans. This oversight has been shown to produce incorrect results. This patch fixes it. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4-nQF9io2WL2SkD0eXvfPdyBc9Q=hRwfQHCGV2usa0jyA@mail.gmail.com
* Fix memory leak in pgoutput with publication list cacheMichael Paquier2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pgoutput module caches publication names in a list and frees it upon invalidation. However, the code forgot to free the actual publication names within the list elements, as publication names are pstrdup()'d in GetPublication(). This would cause memory to leak in CacheMemoryContext, bloating it over time as this context is not cleaned. This is a problem for WAL senders running for a long time, as an accumulation of invalidation requests would bloat its cache memory usage. A second case, where this leak is easier to see, involves a backend calling SQL functions like pg_logical_slot_{get,peek}_changes() which create a new decoding context with each execution. More publications create more bloat. To address this, this commit adds a new memory context within the logical decoding context and resets it each time the publication names cache is invalidated, based on a suggestion from Amit Kapila. This ensures that the lifespan of the publication names aligns with that of the logical decoding context. This solution changes PGOutputData, which is fine for HEAD but it could cause an ABI breakage in stable branches as the structure size would change, so these are left out for now. Analyzed-by: Michael Paquier, Jeff Davis Author: Zhijie Hou Reviewed-by: Michael Paquier, Masahiko Sawada, Euler Taveira Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
* Improve comment about dropped entries in pgstat.cMichael Paquier2024-12-09
| | | | | | | | | | | | | | | | | | | | | pgstat_write_statsfile() discards any entries marked as dropped from being written to the stats file at shutdown, and also included an assertion based on the same condition. The intention of the assertion is to track that no pgstats entries should be left around as terminating backends should drop any entries they still hold references on before the stats file is written by the checkpointer, and it not worth taking down the server in this case if there is a bug making that possible. Let's improve the comment of this area to document clearly what's intended. Based on a discussion with Bertrand Drouvot and Anton A. Melnikov. Author: Bertrand Drouvot Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru Backpatch-through: 15
* Improve the error message introduced in commit 87ce27de696.Amit Kapila2024-12-09
| | | | | | | | | | | The error detail message "Replica identity consists of an unpublished generated column." implies that the entire replica identity is made up of an unpublished generated column which may not be the case. Reported-by: Peter Smith Author: Shlok Kyal Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CAHut+PuwMhKx0PhOA4APhJTLoBGNykbeCQpr_CuwGT-SkswG5w@mail.gmail.com
* Fix invalidation of local pgstats references for entry reinitializationMichael Paquier2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 818119afccd3 has introduced the "generation" concept in pgstats entries, incremented a counter when a pgstats entry is reinitialized, but it did not count on the fact that backends still holding local references to such entries need to be refreshed if the cache age is outdated. The previous logic only updated local references when an entry was dropped, but it needs also to consider entries that are reinitialized. This matters for replication slot stats (as well as custom pgstats kinds in 18~), where concurrent drops and creates of a slot could cause incorrect stats to be locally referenced. This would lead to an assertion failure at shutdown when writing out the stats file, as the backend holding an outdated local reference would not be able to drop during its shutdown sequence the stats entry that should be dropped, as the last process holding a reference to the stats entry. The checkpointer was then complaining about such an entry late in the shutdown sequence, after the shutdown checkpoint is finished with the control file updated, causing the stats file to not be generated. In non-assert builds, the entry would just be skipped with the stats file written. Note that only logical replication slots use statistics. A test case based on TAP is added to test_decoding, where a persistent connection peeking at a slot's data is kept with concurrent drops and creates of the same slot. This is based on the isolation test case that Anton has sent. As it requires a node shutdown with a check to make sure that the stats file is written with this specific sequence of events, TAP is used instead. Reported-by: Anton A. Melnikov Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru Backpatch-through: 15
* Fix possible crash during WindowAgg evaluationDavid Rowley2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When short-circuiting WindowAgg node evaluation on the top-level WindowAgg node using quals on monotonic window functions, because the WindowAgg run condition can mean there's no need to evaluate subsequent window function results in the same partition once the run condition becomes false, it was possible that the executor would use stale results from the previous invocation of the window function in some cases. A fix for this was partially done by a5832722, but that commit only fixed the issue for non-top-level WindowAgg nodes. I mistakenly thought that the top-level WindowAgg didn't have this issue, but Jayesh's example case clearly shows that's incorrect. At the time, I also thought that this only affected 32-bit systems as all window functions which then supported run conditions returned BIGINT, however, that's wrong as ExecProject is still called and that could cause evaluation of any other window function belonging to the same WindowAgg node, one of which may return a byref type. The only queries affected by this are WindowAggs with a "Run Condition" which contains at least one window function with a byref result type, such as lead() or lag() on a byref column. The window clause must also contain a PARTITION BY clause (without a PARTITION BY, execution of the WindowAgg stops immediately when the run condition becomes false and there's no risk of using the stale results). Reported-by: Jayesh Dehankar Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com Backpatch-through: 15, where WindowAgg run conditions were added
* Ensure that pg_amop/amproc entries depend on their lefttype/righttype.Tom Lane2024-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Usually an entry in pg_amop or pg_amproc does not need a dependency on its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types, because there is an indirect dependency via the argument types of its referenced operator or procedure, or via the opclass it belongs to. However, for some support procedures in some index AMs, the argument types of the support procedure might not mention the column data type at all. Also, the amop/amproc entry might be treated as "loose" in the opfamily, in which case it lacks a dependency on any particular opclass; or it might be a cross-type entry having a reference to a datatype that is not its opclass' opcintype. The upshot of all this is that there are cases where a datatype can be dropped while leaving behind amop/amproc entries that mention it, because there is no path in pg_depend showing that those entries depend on that type. Such entries are harmless in normal activity, because they won't get used, but they cause problems for maintenance actions such as dropping the operator family. They also cause pg_dump to produce bogus output. The previous commit put a band-aid on the DROP OPERATOR FAMILY failure, but a real fix is needed. To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry depends on its lefttype/righttype. To avoid bloating pg_depend too much, skip this if the referenced operator or function has that type as an input type. (I did not bother with considering the possible indirect dependency via the opclass' opcintype; at least in the reported case, that wouldn't help anyway.) Probably, the reason this has escaped notice for so long is that add-on datatypes and relevant opclasses/opfamilies are usually packaged as extensions nowadays, so that there's no way to drop a type without dropping the referencing opclasses/opfamilies too. Still, in the absence of pg_depend entries there's nothing that constrains DROP EXTENSION to drop the opfamily entries before the datatype, so it seems possible for a DROP failure to occur anyway. The specific case that was reported doesn't fail in v13, because v13 prefers to attach the support procedure to the opclass not the opfamily. But it's surely possible to construct other edge cases that do fail in v13, so patch that too. Per report from Yoran Heling. Back-patch to all supported branches. Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021