postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Improve comment about dropped entries in pgstat.c	Michael Paquier	2024-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pgstat_write_statsfile() discards any entries marked as dropped from being written to the stats file at shutdown, and also included an assertion based on the same condition. The intention of the assertion is to track that no pgstats entries should be left around as terminating backends should drop any entries they still hold references on before the stats file is written by the checkpointer, and it not worth taking down the server in this case if there is a bug making that possible. Let's improve the comment of this area to document clearly what's intended. Based on a discussion with Bertrand Drouvot and Anton A. Melnikov. Author: Bertrand Drouvot Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru Backpatch-through: 15
*	Improve the error message introduced in commit 87ce27de696.	Amit Kapila	2024-12-09
\| \| \| \| \| \| \| \| \| \| \|	The error detail message "Replica identity consists of an unpublished generated column." implies that the entire replica identity is made up of an unpublished generated column which may not be the case. Reported-by: Peter Smith Author: Shlok Kyal Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CAHut+PuwMhKx0PhOA4APhJTLoBGNykbeCQpr_CuwGT-SkswG5w@mail.gmail.com
*	Fix invalidation of local pgstats references for entry reinitialization	Michael Paquier	2024-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	818119afccd3 has introduced the "generation" concept in pgstats entries, incremented a counter when a pgstats entry is reinitialized, but it did not count on the fact that backends still holding local references to such entries need to be refreshed if the cache age is outdated. The previous logic only updated local references when an entry was dropped, but it needs also to consider entries that are reinitialized. This matters for replication slot stats (as well as custom pgstats kinds in 18~), where concurrent drops and creates of a slot could cause incorrect stats to be locally referenced. This would lead to an assertion failure at shutdown when writing out the stats file, as the backend holding an outdated local reference would not be able to drop during its shutdown sequence the stats entry that should be dropped, as the last process holding a reference to the stats entry. The checkpointer was then complaining about such an entry late in the shutdown sequence, after the shutdown checkpoint is finished with the control file updated, causing the stats file to not be generated. In non-assert builds, the entry would just be skipped with the stats file written. Note that only logical replication slots use statistics. A test case based on TAP is added to test_decoding, where a persistent connection peeking at a slot's data is kept with concurrent drops and creates of the same slot. This is based on the isolation test case that Anton has sent. As it requires a node shutdown with a check to make sure that the stats file is written with this specific sequence of events, TAP is used instead. Reported-by: Anton A. Melnikov Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru Backpatch-through: 15
*	Fix possible crash during WindowAgg evaluation	David Rowley	2024-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When short-circuiting WindowAgg node evaluation on the top-level WindowAgg node using quals on monotonic window functions, because the WindowAgg run condition can mean there's no need to evaluate subsequent window function results in the same partition once the run condition becomes false, it was possible that the executor would use stale results from the previous invocation of the window function in some cases. A fix for this was partially done by a5832722, but that commit only fixed the issue for non-top-level WindowAgg nodes. I mistakenly thought that the top-level WindowAgg didn't have this issue, but Jayesh's example case clearly shows that's incorrect. At the time, I also thought that this only affected 32-bit systems as all window functions which then supported run conditions returned BIGINT, however, that's wrong as ExecProject is still called and that could cause evaluation of any other window function belonging to the same WindowAgg node, one of which may return a byref type. The only queries affected by this are WindowAggs with a "Run Condition" which contains at least one window function with a byref result type, such as lead() or lag() on a byref column. The window clause must also contain a PARTITION BY clause (without a PARTITION BY, execution of the WindowAgg stops immediately when the run condition becomes false and there's no risk of using the stale results). Reported-by: Jayesh Dehankar Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com Backpatch-through: 15, where WindowAgg run conditions were added
*	Ensure that pg_amop/amproc entries depend on their lefttype/righttype.	Tom Lane	2024-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Usually an entry in pg_amop or pg_amproc does not need a dependency on its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types, because there is an indirect dependency via the argument types of its referenced operator or procedure, or via the opclass it belongs to. However, for some support procedures in some index AMs, the argument types of the support procedure might not mention the column data type at all. Also, the amop/amproc entry might be treated as "loose" in the opfamily, in which case it lacks a dependency on any particular opclass; or it might be a cross-type entry having a reference to a datatype that is not its opclass' opcintype. The upshot of all this is that there are cases where a datatype can be dropped while leaving behind amop/amproc entries that mention it, because there is no path in pg_depend showing that those entries depend on that type. Such entries are harmless in normal activity, because they won't get used, but they cause problems for maintenance actions such as dropping the operator family. They also cause pg_dump to produce bogus output. The previous commit put a band-aid on the DROP OPERATOR FAMILY failure, but a real fix is needed. To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry depends on its lefttype/righttype. To avoid bloating pg_depend too much, skip this if the referenced operator or function has that type as an input type. (I did not bother with considering the possible indirect dependency via the opclass' opcintype; at least in the reported case, that wouldn't help anyway.) Probably, the reason this has escaped notice for so long is that add-on datatypes and relevant opclasses/opfamilies are usually packaged as extensions nowadays, so that there's no way to drop a type without dropping the referencing opclasses/opfamilies too. Still, in the absence of pg_depend entries there's nothing that constrains DROP EXTENSION to drop the opfamily entries before the datatype, so it seems possible for a DROP failure to occur anyway. The specific case that was reported doesn't fail in v13, because v13 prefers to attach the support procedure to the opclass not the opfamily. But it's surely possible to construct other edge cases that do fail in v13, so patch that too. Per report from Yoran Heling. Back-patch to all supported branches. Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
*	Make getObjectDescription robust against dangling amproc type links.	Tom Lane	2024-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yoran Heling reported a case where a data type could be dropped while references to its OID remain behind in pg_amproc. This causes getObjectDescription to fail, which blocks dropping the operator family (since our DROP code likes to construct descriptions of everything it's dropping). The proper fix for this requires adding more pg_depend entries. But to allow DROP to go through with already-corrupt catalogs, tweak getObjectDescription to print "???" for the type instead of failing when it processes such an entry. I changed the logic for pg_amop similarly, for consistency, although it is not known that the problem can manifest in pg_amop. Per report from Yoran Heling. Back-patch to all supported branches (although the problem may be unreachable in v13). Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
*	Fix is_digit labeling of to_timestamp's FFn format codes.	Tom Lane	2024-12-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These format codes produce or consume strings of digits, so they should be labeled with is_digit = true, but they were not. This has effect in only one place, where is_next_separator() is checked to see if the preceding format code should slurp up all the available digits. Thus, with a format such as '...SSFF3' with remaining input '12345', the 'SS' code would consume all five digits (and then complain about seconds being out of range) when it should eat only two digits. Per report from Nick Davies. This bug goes back to d589f9446 where the FFn codes were introduced, so back-patch to v13. Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com
*	Comment fix: "buffer context lock" to "buffer content lock".	Jeff Davis	2024-12-06
\| \| \| \|	The term "buffer context lock" is outdated as of commit 5d5087363d.
*	Remove useless casts to (const void *)	Peter Eisentraut	2024-12-06
\| \| \| \| \|	Similar to commit 7f798aca1d5, but I didn't think to look for "const" as well.
*	Remove pg_regex_collation	Peter Eisentraut	2024-12-05
\| \| \| \| \| \| \| \|	We can also use the existing pg_regex_locale as the cache key, which is the only use of this variable. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/flat/b1b92ae1-2e06-4619-a87a-4b4858e547ec%40eisentraut.org
*	Provide a better error message for misplaced dispatch options.	Nathan Bossart	2024-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this patch, misplacing a special must-be-first option for dispatching to a subprogram (e.g., postgres -D . --single) would fail with an error like FATAL: --single requires a value This patch adjusts this error to more accurately complain that the special option wasn't listed first. The aforementioned error message now looks like FATAL: --single must be first argument The dispatch option parsing code has been refactored for use wherever ParseLongOption() is called. Beyond the obvious advantage of avoiding code duplication, this should prevent similar problems when new dispatch options are added. Note that we assume that none of the dispatch option names match another valid command-line argument, such as the name of a configuration parameter. Ideally, we'd remove this must-be-first requirement for these options, but after some investigation, we decided that wasn't worth the added complexity and behavior changes. Author: Nathan Bossart, Greg Sabino Mullane Reviewed-by: Greg Sabino Mullane, Peter Eisentraut, Álvaro Herrera, Tom Lane Discussion: https://postgr.es/m/CAKAnmmJkZtZAiSryho%3DgYpbvC7H-HNjEDAh16F3SoC9LPu8rqQ%40mail.gmail.com
*	Fix dead code	Peter Eisentraut	2024-12-04
\| \| \| \| \| \|	from commit 85b7efa1cdd per Coverity report
*	Fix use-after-free in parallel_vacuum_reset_dead_items	John Naylor	2024-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	parallel_vacuum_reset_dead_items used a local variable to hold a pointer from the passed vacrel, purely as a shorthand. This pointer was later freed and a new allocation was made and stored to the struct. Then the local pointer was mistakenly referenced again. This apparently happened not to break anything since the freed chunk would have been put on the context's freelist, so it was accidentally the same pointer anyway, in which case the DSA handle was correctly updated. The minimal fix is to change two places so they access dead_items through the vacrel. This coding style is a maintenance hazard, so while at it get rid of most other similar usages, which were inconsistently used anyway. Analysis and patch by Vallimaharajan G, with further defensive coding by me Backpath to v17, when TidStore came in Discussion: https://postgr.es/m/1936493cc38.68cb2ef27266.7456585136086197135@zohocorp.com
*	Simplify IsIndexUsableForReplicaIdentityFull()	Peter Eisentraut	2024-12-04
\| \| \| \| \| \| \| \| \| \|	Take Relation as argument instead of IndexInfo. Building the IndexInfo is an unnecessary intermediate step here. A future patch wants to get some information that is in the relcache but not in IndexInfo, so this will also help there. Discussion: https://www.postgresql.org/message-id/333d3886-b737-45c3-93f4-594c96bb405d@eisentraut.org
*	Ensure stored generated columns must be published when required.	Amit Kapila	2024-12-04
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ensure stored generated columns that are part of REPLICA IDENTITY must be published explicitly for UPDATE and DELETE operations to be published. We can publish generated columns by listing them in the column list or by enabling the publish_generated_columns option. This commit changes the behavior of the test added in commit adedf54e65 by giving an ERROR for the UPDATE operation in such cases. There is no way to trigger the bug reported in commit adedf54e65 but we didn't remove the corresponding code change because it is still relevant when replicating changes from a publisher with version less than 18. We decided not to backpatch this behavior change to avoid the risk of breaking existing output plugins that may be sending generated columns by default although we are not aware of any such plugin. Also, we didn't see any reports related to this on STABLE branches which is another reason not to backpatch this change. Author: Shlok Kyal, Hou Zhijie Reviewed-by: Vignesh C, Amit Kapila Discussion: https://postgr.es/m/CANhcyEVw4V2Awe2AB6i0E5AJLNdASShGfdBLbUd1XtWDboymCA@mail.gmail.com
*	Move check for ucol_strcollUTF8 to pg_locale_icu.c	Jeff Davis	2024-12-03
\| \| \| \| \| \| \|	The result of the check is only used by pg_locale_icu.c. Author: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se
*	Fix synchronized_standby_slots GUC check hook	Álvaro Herrera	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The validate_sync_standby_slots subroutine requires an LWLock, so it cannot run in processes without PGPROC; skip it there to avoid a crash. This replaces the current test for ReplicationSlotCtl being not null, which appears to be a solution for the same problem but less general. I also rewrote a related comment that mentioned ReplicationSlotCtl in StandbySlotsHaveCaughtup. This code came in with commit bf279ddd1c28; backpatch to 17. Reported-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/202411281216.sutbxtr6idnn@alvherre.pgsql
*	Drop "Lock" suffix from LWLock wait event names	Álvaro Herrera	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit da952b415f44 unintentially reverted the SQL-visible part of commit 14a910109126, which breaks queries joining pg_wait_events with pg_stat_acivity. Remove the suffix again. Backpatch to 17. Reported-by: Christophe Courtois <christophe.courtois@dalibo.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/18728-450924477056a339%40postgresql.org Discussion: https://postgr.es/m/Z01w1+LihtRiS0Te@ip-10-97-1-34.eu-west-3.compute.internal
*	Fix handling of CREATE DOMAIN with GENERATED constraint syntax	Peter Eisentraut	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Stuff like CREATE DOMAIN foo AS int CONSTRAINT cc GENERATED ALWAYS AS (2) STORED is not supported for domains, but the parser allows it, because it's the same syntax as for table constraints. But CreateDomain() did not explicitly handle all ConstrType values, so the above would get an internal error like ERROR: unrecognized constraint subtype: 4 Fix that by providing a user-facing error message for all ConstrType values. Also, remove the switch default case, so future additions to ConstrType are caught. Reported-by: Jian He <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/CACJufxF8fmM=Dbm4pDFuV_nKGz2-No0k4YifhrF3-rjXTWJM3w@mail.gmail.com
*	Fix temporary memory leak in system table index scans	Peter Eisentraut	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 811af9786b introduced palloc() calls into systable_beginscan() and systable_beginscan_ordered(). But there was no pfree(), as is the usual style. It turns out that an ANALYZE of a partitioned table can invoke many thousand system table index scans, and this memory is not cleaned up until the end of the command, so this can temporarily leak quite a bit of memory. Maybe there are improvements to be made at a higher level about this, but for now, insert a couple of corresponding pfree() calls to fix this particular issue. Reported-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://www.postgresql.org/message-id/Z0XTfIq5xUtbkiIh@pryzbyj2023
*	Perform provider-specific initialization in new functions.	Jeff Davis	2024-12-02
\| \| \| \| \|	Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se
*	Fix unintentional behavior change in commit e9931bfb75.	Jeff Davis	2024-12-02
\| \| \| \| \| \| \| \| \|	Prior to that commit, there was special case to use ASCII case mapping behavior for the libc provider with a single-byte encoding when that's the default collation. Commit e9931bfb75 mistakenly eliminated that special case; this commit restores it. Discussion: https://postgr.es/m/01a104f0d2179d756261e90d96fd65c36ad6fcf0.camel@j-davis.com
*	Revert "Introduce CompactAttribute array in TupleDesc"	David Rowley	2024-12-03
\| \| \| \| \| \| \| \| \| \|	This reverts commit d28dff3f6cd6a7562fb2c211ac0fb74a33ffd032. Quite a large number of buildfarm members didn't like this commit and it's not yet clear why. Reverting this before too many animals turn red. Discussion: https://postgr.es/m/CAApHDvr9i6T5=iAwQCxFDgMsthr_obVxgwBaEJkC8KUH6yM3Hw@mail.gmail.com
*	Introduce CompactAttribute array in TupleDesc	David Rowley	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. With this change, NAMEDATALEN could be increased with a much smaller negative impact on performance. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com Reviewed-by: Andres Freund, Victor Yegorov
*	RelationTruncate() must set DELAY_CHKPT_START.	Thomas Munro	2024-12-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, it set only DELAY_CHKPT_COMPLETE. That was important, because it meant that if the XLOG_SMGR_TRUNCATE record preceded a XLOG_CHECKPOINT_ONLINE record in the WAL, then the truncation would also happen on disk before the XLOG_CHECKPOINT_ONLINE record was written. However, it didn't guarantee that the sync request for the truncation was processed before the XLOG_CHECKPOINT_ONLINE record was written. By setting DELAY_CHKPT_START, we guarantee that if an XLOG_SMGR_TRUNCATE record is written to WAL before the redo pointer of a concurrent checkpoint, the sync request queued by that operation must be processed by that checkpoint, rather than being left for the following one. This is a refinement of commit 412ad7a5563. Back-patch to all supported releases, like that commit. Author: Robert Haas <robertmhaas@gmail.com> Reported-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2B-2rjGZC2kwqr2NMLBcEBp4uf59QT1advbWYF_uc%2B0Aw%40mail.gmail.com
*	Deprecate MD5 passwords.	Nathan Bossart	2024-12-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: Greg Sabino Mullane, Jim Nasby Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan
*	Add a planner support function for numeric generate_series().	Dean Rasheed	2024-12-02
\| \| \| \| \| \| \| \| \| \| \|	This allows the planner to estimate the number of rows returned by generate_series(numeric, numeric[, numeric]), when the input values can be estimated at plan time. Song Jinzhou, reviewed by Dean Rasheed and David Rowley. Discussion: https://postgr.es/m/tencent_F43E7F4DD50EF5986D1051DE8DE547910206%40qq.com Discussion: https://postgr.es/m/tencent_1F6D5B9A1545E02FD7D0EE508DFD056DE50A%40qq.com
*	Fix #include order in timestamp.c.	Dean Rasheed	2024-12-02
\| \| \| \|	Oversight in 036bdcec9f.
*	Fix error code for referential action RESTRICT	Peter Eisentraut	2024-12-02
\| \| \| \| \| \| \| \| \| \| \| \|	According to the SQL standard, if the referential action RESTRICT is triggered, it has its own error code. We previously didn't use that, we just used the error code for foreign key violation. But RESTRICT is not necessarily an actual foreign key violation. The foreign key might still be satisfied in theory afterwards, but the RESTRICT setting prevents the action even then. So it's a separate kind of error condition. Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
*	Avoid mislabeling of lateral references, redux.	Tom Lane	2024-11-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As I'd feared, commit 5c9d8636d was still a few bricks shy of a load. We can't just leave pulled-up lateral-reference Vars with no new nullingrels: we have to carefully compute what subset of the to-be-replaced Var's nullingrels apply to them, else we still get "wrong varnullingrels" errors. This is a bit tedious, but it looks like we can use the nullingrel data this patch computes for other purposes, enabling better optimization. We don't want to inject unnecessary plan changes into stable branches though, so leave that idea for a later HEAD-only patch. Patch by me, but thanks to Richard Guo for devising a test case that broke 5c9d8636d, and for preliminary investigation about how to fix it. As before, back-patch to v16. Discussion: https://postgr.es/m/E1tGn4j-0003zi-MP@gemulon.postgresql.org
*	Small indenting fixes in jsonpath_scan.l	Peter Eisentraut	2024-11-29
\| \| \| \| \| \|	Some lines were indented by an inconsistent number of spaces. While we're here, also fix some code that used the newline after left parenthesis style, which is obsolete.
*	Skip not SOAP-supported indexes while transforming an OR clause into SAOP	Alexander Korotkov	2024-11-29
\| \| \| \| \| \| \| \| \| \| \| \|	There is no point in transforming OR-clauses into SAOP's if the target index doesn't support SAOP scans anyway. This commit adds corresponding checks to match_orclause_to_indexcol() and group_similar_or_args(). The first check fixes the actual bug, while the second just saves some cycles. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/8174de69-9e1a-0827-0e81-ef97f56a5939%40gmail.com Author: Alena Rybakina Reviewed-by: Ranier Vilela, Alexander Korotkov, Andrei Lepikhov
*	Fix typo in header comment for set_operation_ordered_results_useful	David Rowley	2024-11-29
\| \| \| \| \|	Reported-by: Richard Guo Discussion: https://postgr.es/m/CAMbWs492vMy3XNjDZRtqtHfFTK6HVeDwhrEQH7eXGgF_h5Jnzw@mail.gmail.com
*	Avoid mislabeling of lateral references when pulling up a subquery.	Tom Lane	2024-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are pulling up a subquery that's under an outer join, and the subquery's target list contains a strict expression that uses both a subquery variable and a lateral-reference variable, it's okay to pull up the expression without wrapping it in a PlaceHolderVar. That's safe because if the subquery variable is forced to NULL by the outer join, the expression result will come out as NULL too, so we don't have to force that outcome by evaluating the expression below the outer join. It'd be correct to wrap in a PHV, but that can lead to very significantly worse plans, since we'd then have to use a nestloop plan to pass down the lateral reference to where the expression will be evaluated. However, when we do that, we should not mark the lateral reference variable as being nulled by the outer join, because it isn't after we pull up the expression in this way. So the marking logic added by cb8e50a4a was incorrect in this detail, leading to "wrong varnullingrels" errors from the consistency-checking logic in setrefs.c. It seems to be sufficient to just not mark lateral references at all in this case. (I have a nagging feeling that more complexity may be needed in cases where there are several levels of outer join, but some attempts to break it with that didn't succeed.) Per report from Bertrand Mamasam. Back-patch to v16, as the previous patch was. Discussion: https://postgr.es/m/CACZ67_UA_EVrqiFXJu9XK50baEpH=ofEPJswa2kFxg6xuSw-ww@mail.gmail.com
*	Remove useless casts to (void *)	Peter Eisentraut	2024-11-28
\| \| \| \| \| \| \| \|	Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
*	Require sizeof(bool) == 1.	Thomas Munro	2024-11-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The C standard says that sizeof(bool) is implementation-defined, but we know of no current systems where it is not 1. The last known systems seem to have been Apple macOS/PowerPC 10.5 and Microsoft Visual C++ 4, both long defunct. PostgreSQL has always required sizeof(bool) == 1 for the definition of bool that it used, but previously it would define its own type if the system-provided bool had a different size. That was liable to cause memory layout problems when interacting with system and third-party libraries on (by now hypothetical) computers with wider _Bool, and now C23 has introduced a new problem by making bool a built-in datatype (like C++), so the fallback code doesn't even compile. We could probably work around that, but then we'd be writing new untested code for a computer that doesn't exist. Instead, delete the unreachable and C23-uncompilable fallback code, and let existing static assertions fail if the system-provided bool is too wide. If we ever get a problem report from a real system, then it will be time to figure out what to do about it in a way that also works on modern compilers. Note on C++: Previously we avoided including <stdbool.h> or trying to define a new bool type in headers that might be included by C++ code. These days we might as well just include <stdbool.h> unconditionally: it should be visible to C++11 but do nothing, just as in C23. We already include <stdint.h> without C++ guards in c.h, and that falls under the same C99-compatibility section of the C++11 standard as <stdbool.h>, so let's remove the guards here too. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3198438.1731895163%40sss.pgh.pa.us
*	Make GUC_check_errdetail messages full sentences	Álvaro Herrera	2024-11-27
\| \| \| \| \| \| \|	They were all missing punctuation, one was missing initial capital. Per our message style guidelines. No backpatch, to avoid breaking existing translations.
*	Remove redundant relam initialization	Álvaro Herrera	2024-11-27
\| \| \| \| \| \| \| \|	This struct member is initialized again a few lines below in the same function. This is cosmetic, so no backpatch. Reported-by: Jingtang Zhang <mrdrivingduck@gmail.com> Discussion: https://postgr.es/m/AFF74506-B925-46BB-B875-CF5A946170EB@gmail.com
*	Look up backend type in pg_signal_backend() more cheaply.	Nathan Bossart	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit ccd38024bc, which introduced the pg_signal_autovacuum_worker role, added a call to pgstat_get_beentry_by_proc_number() for the purpose of determining whether the process is an autovacuum worker. This function calls pgstat_read_current_status(), which can be fairly expensive and may return cached, out-of-date information. Since we just need to look up the target backend's BackendType, and we already know its ProcNumber, we can instead inspect the BackendStatusArray directly, which is much less expensive and possibly more up-to-date. There are some caveats with this approach (which are documented in the code), but it's still substantially better than before. Reported-by: Andres Freund Reviewed-by: Andres Freund Discussion: https://postgr.es/m/ujenaa2uabzfkwxwmfifawzdozh3ljr7geozlhftsuosgm7n7q%40g3utqqyyosb6
*	postmaster: Reduce verbosity of environment dump debug message	Andres Freund	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Emitting each variable separately is unnecessarily verbose / hard to skim over. Emit the whole thing in one ereport() to address that. Also remove program name and function reference from the message. The former doesn't seem particularly helpful and the latter is provided by the elog.c infrastructure these days. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/leouteo5ozcrux3fepuhtbp6c56tbfd4naxeokidbx7m75cabz@hhw6g4urlowt
*	Require ucrt if using MinGW.	Thomas Munro	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Historically we tolerated the absence of various C runtime library features for the benefit of the MinGW tool chain, because it used ancient msvcrt.dll for a long period of time. It now uses ucrt by default (like Windows 10+, Visual Studio 2015+), and that's the only configuration we're testing. In practice, we effectively required ucrt already in PostgreSQL 17, when commit 8d9a9f03 required _create_locale etc, first available in msvcr120.dll (Visual Studio 2013, the last of the pre-ucrt series of runtimes), and for MinGW users that practically meant ucrt because it was difficult or impossible to use msvcr120.dll. That may even not have been the first such case, but old MinGW configurations had already dropped off our testing radar so we weren't paying much attention. This commit formalizes the requirement. It also removes a couple of obsolete comments that discussed msvcrt.dll limitations, and some tests of !defined(_MSC_VER) to imply msvcrt.dll. There are many more anachronisms, but it'll take some time to figure out how to remove them all. APIs affected relate to locales, UTF-8, threads, large files and more. Thanks to Peter Eisentraut for the documentation change. It's not really necessary to talk about ucrt explicitly in such a short section, since it's the default for MinGW-w64 and MSYS2. It's enough to prune references and broken links to much older tools. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/d9e7731c-ca1b-477c-9298-fa51e135574a%40eisentraut.org
*	Improve slightly misleading internal error message	Peter Eisentraut	2024-11-27
\| \| \| \| \| \| \| \|	The error message was talking about RowCompareType but was actually checking strategy numbers. While those are closely related, it is better to be accurate. Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
*	Support LIKE with nondeterministic collations	Peter Eisentraut	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows for example using LIKE with case-insensitive collations. There was previously no internal implementation of this, so it was met with a not-supported error. This adds the internal implementation and removes the error. The implementation follows the specification of the SQL standard for this. Unlike with deterministic collations, the LIKE matching cannot go character by character but has to go substring by substring. For example, if we are matching against LIKE 'foo%bar', we can't start by looking for an 'f', then an 'o', but instead with have to find something that matches 'foo'. This is because the collation could consider substrings of different lengths to be equal. This is all internal to MatchText() in like_match.c. The changes in GenericMatchText() in like.c just pass through the locale information to MatchText(), which was previously not needed. This matches exactly Generic_Text_IC_like() below. ILIKE is not affected. (It's unclear whether ILIKE makes sense under nondeterministic collations.) This also updates match_pattern_prefix() in like_support.c to support optimizing the case of an exact pattern with nondeterministic collations. This was already alluded to in the previous code. (includes documentation examples from Daniel Vérité and test cases from Paul A Jungwirth) Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/700d2e86-bf75-4607-9cf2-f5b7802f6e88@eisentraut.org
*	Improve error message for replication of generated columns.	Amit Kapila	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, logical replication produces a generic error message when targeting a subscriber-side table column that is either missing or generated. The error message can be misleading for generated columns. This patch introduces a specific error message to clarify the issue when generated columns are involved. Author: Shubham Khanna Reviewed-by: Peter Smith, Vignesh C, Amit Kapila Discussion: https://postgr.es/m/CAHv8RjJBvYtqU7OAofBizOmQOK2Q8h+w9v2_cQWxT_gO7er3Aw@mail.gmail.com
*	Handle better implicit transaction state of pipeline mode	Michael Paquier	2024-11-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using a pipeline, a transaction starts from the first command and is committed with a Sync message or when the pipeline ends. Functions like IsInTransactionBlock() or PreventInTransactionBlock() were already able to understand a pipeline as being in a transaction block, but it was not the case of CheckTransactionBlock(). This function is called for example to generate a WARNING for SET LOCAL, complaining that it is used outside of a transaction block. The current state of the code caused multiple problems, like: - SET LOCAL executed at any stage of a pipeline issued a WARNING, even if the command was at least second in line where the pipeline is in a transaction state. - LOCK TABLE failed when invoked at any step of a pipeline, even if it should be able to work within a transaction block. The pipeline protocol assumes that the first command of a pipeline is not part of a transaction block, and that any follow-up commands is considered as within a transaction block. This commit changes the backend so as an implicit transaction block is started each time the first Execute message of a pipeline has finished processing, with this implicit transaction block ended once a sync is processed. The checks based on XACT_FLAGS_PIPELINING in the routines checking if we are in a transaction block are not necessary: it is enough to rely on the existing ones. Some tests are added to pgbench, that can be backpatched down to v17 when \syncpipeline is involved and down to v14 where \startpipeline and \endpipeline are available. This is unfortunately limited regarding the error patterns that can be checked, but it provides coverage for various pipeline combinations to check if these succeed or fail. These tests are able to capture the case of SET LOCAL's WARNING. The author has proposed a different feature to improve the coverage by adding similar meta-commands to psql where error messages could be checked, something more useful for the cases where commands cannot be used in transaction blocks, like REINDEX CONCURRENTLY or VACUUM. This is considered as future work for v18~. Author: Anthonin Bonnefoy Reviewed-by: Jelte Fennema-Nio, Michael Paquier Discussion: https://postgr.es/m/CAO6_XqrWO8uNBQrSu5r6jh+vTGi5Oiyk4y8yXDORdE2jbzw8xw@mail.gmail.com Backpatch-through: 13
*	Distinguish between AcquireExternalFD and epoll_create1 / kqueue failing	Andres Freund	2024-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The error messages in CreateWaitEventSet() made it hard to know whether the syscall or AcquireExternalFD() failed. This is particularly relevant because AcquireExternalFD() imposes a lower limit than what would cause syscalls fail with EMFILE. I did not change the message in libpqsrv_connect_prepare(), which is the one other use of AcquireExternalFD() in our codebase, as the error message already is less ambiguous. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/xjjx7r4xa7beixuu4qtkdhnwdbchrrpo3gaeb3jsbinvvdiat5@cwjw55mna5of
*	Clean up newlines following left parentheses	Álvaro Herrera	2024-11-26
\| \| \| \| \| \| \| \|	Most came in during the 17 cycle, so backpatch there. Some (particularly reorderbuffer.h) are very old, but backpatching doesn't seem useful. Like commits c9d297751959, c4f113e8fef9.
*	Improve InitShmemAccess() prototype	Peter Eisentraut	2024-11-26
\| \| \| \| \| \| \| \| \|	The code comment said, 'the argument should be declared "PGShmemHeader *seghdr", but we use void to avoid having to include ipc.h in shmem.h.' We can achieve the original goal with a struct forward declaration. (ipc.h was also not the correct header file.) Discussion: https://www.postgresql.org/message-id/flat/cnthxg2eekacrejyeonuhiaezc7vd7o2uowlsbenxqfkjwgvwj@qgzu6eoqrglb
*	Remove dead code in get_param_path_clause_serials()	Richard Guo	2024-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function get_param_path_clause_serials() is used to get the set of pushed-down clauses enforced within a parameterized Path. Since we don't currently support parameterized MergeAppend paths, and it doesn't look like that is going to change anytime soon (as explained in the comments for generate_orderedappend_paths), we don't need to consider MergeAppendPath in this function. This change won't make any measurable difference in performance; it's just for clarity's sake. Author: Richard Guo Reviewed-by: Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs4_Puie4DQ2ODvjQB_3CxYkUODnrJm8jn_ObMAcrjYNW7Q@mail.gmail.com
*	Reordering DISTINCT keys to match input path's pathkeys	Richard Guo	2024-11-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ordering of DISTINCT items is semantically insignificant, so we can reorder them as needed. In fact, in the parser, we absorb the sorting semantics of the sortClause as much as possible into the distinctClause, ensuring that one clause is a prefix of the other. This can help avoid a possible need to re-sort. In this commit, we attempt to adjust the DISTINCT keys to match the input path's pathkeys. This can likewise help avoid re-sorting, or allow us to use incremental-sort to save efforts. For DISTINCT ON expressions, the parser already ensures that they match the initial ORDER BY expressions. When reordering the DISTINCT keys, we must ensure that the resulting pathkey list matches the initial distinctClause pathkeys. This introduces a new GUC, enable_distinct_reordering, which allows the optimization to be disabled if needed. Author: Richard Guo Reviewed-by: Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48dR26cCcX0f=8bja2JKQPcU64136kHk=xekHT9xschiQ@mail.gmail.com