Commit log, path: root/src

...

* Fix misbehavior of EvalPlanQual checks with multiple result relations. (Tom Lane, 2023-05-19)

The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.)

This logic was not updated when commit 86dc90056 made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode.

Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of 3f7836ff6 and 4b3e37993, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise.

This is only a live bug back to v14 where 86dc90056 came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though because none of the tests applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live.

Per report from Ante Krešić (via Aleksander Alekseev)

Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com

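[Editor's sketch] A hedged SQL illustration of the failure shape (table names and values are invented, not taken from the commit): two READ COMMITTED sessions update a partitioned table, and the EPQ recheck must judge the injected tuple itself rather than rows from sibling result relations.

    CREATE TABLE t (a int, b int) PARTITION BY LIST (a);
    CREATE TABLE t1 PARTITION OF t FOR VALUES IN (1);
    CREATE TABLE t2 PARTITION OF t FOR VALUES IN (2);
    INSERT INTO t VALUES (1, 10), (2, 10);

    -- Session 1:
    BEGIN;
    UPDATE t SET b = 20 WHERE a = 1 AND b = 10;  -- locks the row in t1

    -- Session 2 (READ COMMITTED; blocks on the locked row in t1):
    UPDATE t SET b = 30 WHERE b = 10;

    -- Session 1:
    COMMIT;
    -- Session 2's EPQ recheck must now test the updated tuple (b = 20)
    -- against "b = 10" and skip it; before the fix, a qualifying row
    -- in a sibling result relation such as t2 could make the recheck
    -- pass incorrectly.
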
* Avoid naming conflict between transactions.sql and namespace.sql. (Tom Lane, 2023-05-19)

Commits 681d9e462 et al added a test case in namespace.sql that implicitly relied on there not being a table "public.abc". However, the concurrently-run transactions.sql test creates precisely such a table, so with the right timing you'd get a failure.

Creating a table named as generically as "abc" in a common schema seems like bad practice, so fix this by changing the name of transactions.sql's table. (Compare 2cf8c7aa4.)

Marina Polyakova

Discussion: https://postgr.es/m/80d0201636665d82185942e7112257b4@postgrespro.ru

* Fix handling of empty ranges and NULLs in BRIN (Tomas Vondra, 2023-05-19)

BRIN indexes did not properly distinguish between summaries for empty (no rows) and all-NULL ranges, treating them as essentially the same thing. Summaries were initialized with allnulls=true, and opclasses simply reset allnulls to false when processing the first non-NULL value. This however produces incorrect results if the range starts with a NULL value (or a sequence of NULL values), in which case we forget the range contains NULL values when adding the first non-NULL value.

This happens because the allnulls flag is used for two separate purposes - to mark empty ranges (not representing any rows yet) and ranges containing only NULL values. Opclasses don't know which of these cases applies, and so don't know whether to set hasnulls=true. Setting the flag in both cases would make it correct, but it would also make BRIN indexes useless for queries with IS NULL clauses. All ranges start empty (and thus allnulls=true), so all ranges would end up with either allnulls=true or hasnulls=true.

The severity of the issue is somewhat reduced by the fact that it only happens when adding values to an existing summary with allnulls=true. This can happen e.g. for small tables (because a summary for the first range exists for all BRIN indexes), or for tables with a large fraction of NULL values in the indexed columns.

Bulk summarization (e.g. during CREATE INDEX or automatic summarization) that processes all values at once is not affected by this issue. In this case the flags were updated in a slightly different way, not forgetting the NULL values.

To identify empty ranges we use a new flag, stored in an unused bit in the BRIN tuple header so the on-disk format remains the same. A matching flag is added to BrinMemTuple, into a 3B gap after bt_placeholder. That means there's no risk of ABI breakage, although we don't actually pass the BrinMemTuple to any public API.

We could also skip storing index tuples for empty summaries, but then we'd have to always process such ranges - even if there are no rows in large parts of the table (e.g. after a bulk DELETE), it would still require reading the pages etc. So we store them, but ignore them when building the bitmap.

Backpatch to 11. The issue exists since BRIN indexes were introduced in 9.5, but older releases are already EOL.

Backpatch-through: 11
Reviewed-by: Justin Pryzby, Matthias van de Meent, Alvaro Herrera
Discussion: https://postgr.es/m/402430e4-7d9d-6cf1-09ef-464d80afff3b@enterprisedb.com

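[Editor's sketch] An illustrative reproduction shape under the stated conditions (incremental updates to an existing allnulls=true summary); table and index names are hypothetical:

    CREATE TABLE brin_t (a int);
    CREATE INDEX brin_t_idx ON brin_t USING brin (a);
    -- A summary for the first range exists right after CREATE INDEX,
    -- so these inserts update it incrementally:
    INSERT INTO brin_t VALUES (NULL);  -- range is still allnulls=true
    INSERT INTO brin_t VALUES (1);     -- opclass resets allnulls=false,
                                       -- forgetting the NULL already seen
    SET enable_seqscan = off;
    SELECT * FROM brin_t WHERE a IS NULL;  -- could miss the NULL row
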
* Fix handling of NULLs when merging BRIN summaries (Tomas Vondra, 2023-05-18)

When merging BRIN summaries, union_tuples() did not correctly update the target hasnulls/allnulls flags. When merging an all-NULL summary into a summary without any NULL values, the result had both flags set to false (instead of having hasnulls=true).

This happened because the code only considered the hasnulls flags, ignoring the possibility the source summary has allnulls=true.

Discovered while investigating issues with handling empty BRIN ranges and handling of NULL values, but it's a separate problem (it has nothing to do with empty ranges).

Fixed by considering both flags on the source summary, and updating the hasnulls flag on the target summary.

Backpatch to 11. The bug exists since 9.5 (where BRIN indexes were introduced), but those releases are EOL already.

Discussion: https://postgr.es/m/9d993d0d-e431-2196-9ccc-0554d0e60154%40enterprisedb.com

* Stamp 12.15. (Tom Lane, 2023-05-08) [tag: REL_12_15]

* Handle RLS dependencies in inlined set-returning functions properly. (Tom Lane, 2023-05-08)

If an SRF in the FROM clause references a table having row-level security policies, and we inline that SRF into the calling query, we neglected to mark the plan as potentially dependent on which role is executing it. This could lead to later executions in the same session returning or hiding rows that should have been hidden or returned instead.

Our thanks to Wolfgang Walther for reporting this problem.

Stephen Frost and Tom Lane

Security: CVE-2023-2455

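[Editor's sketch] A hedged illustration of the affected pattern (schema, policy, and function are invented): an inlinable SQL set-returning function over a table with RLS, whose cached plan must be marked role-dependent.

    CREATE TABLE secured (who text, val int);
    ALTER TABLE secured ENABLE ROW LEVEL SECURITY;
    CREATE POLICY p ON secured USING (who = current_user);
    GRANT SELECT ON secured TO PUBLIC;

    -- Inlinable set-returning function referencing the RLS table:
    CREATE FUNCTION fetch_secured() RETURNS SETOF secured
      LANGUAGE sql STABLE AS 'SELECT * FROM secured';

    -- Before the fix, a plan built while running as one role could be
    -- reused after switching roles in the same session, applying the
    -- wrong set of policy rows:
    SELECT * FROM fetch_secured();
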
* Replace last PushOverrideSearchPath() call with set_config_option(). (Noah Misch, 2023-05-08)

The two methods don't cooperate, so set_config_option("search_path", ...) has been ineffective under non-empty overrideStack. This defect enabled an attacker having database-level CREATE privilege to execute arbitrary code as the bootstrap superuser. While that particular attack requires v13+ for the trusted extension attribute, other attacks are feasible in all supported versions.

Standardize on the combination of NewGUCNestLevel() and set_config_option("search_path", ...). It is newer than PushOverrideSearchPath(), more prevalent, and has no known disadvantages. The "override" mechanism remains for now, for compatibility with out-of-tree code. Users should update such code, which likely suffers from the same sort of vulnerability closed here.

Back-patch to v11 (all supported versions).

Alexander Lakhin. Reported by Alexander Lakhin.

Security: CVE-2023-2454

* Translation updates (Peter Eisentraut, 2023-05-08)

Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 438a2f5d29665ae0dd54f5ccd4f73f1360530c82

* Fix prove_installcheck when used with PGXS (Peter Eisentraut, 2023-05-05)

Commit 153e215677 added the portlock directory. This is created in $ENV{top_builddir} if it is set. Under PGXS, top_builddir points into the installation directory, which is not necessarily writable and in any case inappropriate to use by a test suite.

The cause of the problem is that the prove_installcheck target in Makefile.global exports top_builddir, which isn't useful (since no other Perl code actually reads it) and breaks this use case. The reason this code is there is probably that it has been dragged around with various other changes, in particular a0fc813266, but without a real purpose of its own.

By just removing the exporting of top_builddir in prove_installcheck, the portlock directory then ends up under tmp_check in the build directory, which is more suitable.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://www.postgresql.org/message-id/78d1cfa6-0065-865d-584b-cde6d8c18aff@enterprisedb.com

* Move return statements out of PG_TRY blocks. (Nathan Bossart, 2023-05-04)

If we exit a PG_TRY block early via "continue", "break", "goto", or "return", we'll skip unwinding its exception stack. This change moves a couple of such "return" statements in PL/Python out of PG_TRY blocks. This was introduced in d0aa965c0a and affects all supported versions.

We might also be able to add compile-time checks to prevent recurrence, but that is left as a future exercise.

Reported-by: Mikhail Gribkov, Xing Guo
Author: Xing Guo
Reviewed-by: Michael Paquier, Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAMEv5_v5Y%2B-D%3DCO1%2Bqoe16sAmgC4sbbQjz%2BUtcHmB6zcgS%2B5Ew%40mail.gmail.com
Discussion: https://postgr.es/m/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com
Backpatch-through: 11 (all supported versions)

* In array_position()/array_positions(), beware of empty input array. (Tom Lane, 2023-05-04)

These functions incautiously fetched the array's first lower bound even when the array is zero-dimensional, thus fetching the word after the allocated array space. While almost always harmless, with very bad luck this could result in SIGSEGV. Fix by adding an early exit for empty input.

Per bug #17920 from Alexander Lakhin.

Discussion: https://postgr.es/m/17920-f7c228c627b6d02e%40postgresql.org

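[Editor's sketch] A minimal illustration of the empty-array case:

    -- '{}' is a zero-dimensional array; fetching its "first lower
    -- bound" read past the allocated space before the fix.
    SELECT array_position('{}'::int[], 1);   -- no match
    SELECT array_positions('{}'::int[], 1);  -- no match
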
* Tighten array dimensionality checks in Python -> SQL array conversion. (Tom Lane, 2023-05-04)

Like plperl before f47004add, plpython wasn't being sufficiently careful about checking that list-of-list structures represent rectangular arrays, so that it would accept some cases in which different parts of the "array" are nested to different depths. This was exacerbated by Python's weak distinction between sequences and lists, so that in some cases strings could get treated as though they are lists (and burst into individual characters) even though a different ordering of the upper-level list would give a different result.

Some of this behavior was unreachable (without risking a crash) before 81eaaf65e. It seems like a good idea to clean it all up in the same releases, rather than shipping a non-crashing but nonetheless visibly buggy behavior in the name of minimal change. Hence, back-patch.

Per bug #17912 and further testing by Alexander Lakhin.

Discussion: https://postgr.es/m/17912-82ceed78731d9cdc@postgresql.org

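[Editor's sketch] A hedged example of the kind of ragged structure involved (requires the plpython3u extension; the function is invented for illustration):

    CREATE EXTENSION plpython3u;  -- untrusted language
    CREATE FUNCTION ragged() RETURNS int[] LANGUAGE plpython3u AS $$
        # Sub-structures nested to different depths do not form a
        # rectangular array, so this shape is now rejected.
        return [[1, 2], [3, [4, 5]]]
    $$;
    SELECT ragged();  -- now raises an error instead of accepting it
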
* Tighten array dimensionality checks in Perl -> SQL array conversion. (Tom Lane, 2023-04-29)

plperl_array_to_datum() wasn't sufficiently careful about checking that nested lists represent a rectangular array structure; it would accept inputs such as "[1, []]". This is a bit related to the PL/Python bug fixed in commit 81eaaf65e, but it doesn't seem to provide any direct route to a memory stomp. Instead the likely failure mode is for makeMdArrayResult to be passed fewer Datums than the claimed array dimensionality requires, possibly leading to a wild pointer dereference and SIGSEGV.

Per report from Alexander Lakhin. It's been broken for a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/5ebae5e4-d401-fadf-8585-ac3eaf53219c@gmail.com

* Handle zero-length sublist correctly in Python -> SQL array conversion. (Tom Lane, 2023-04-28)

If PLySequence_ToArray came across a zero-length sublist, it'd compute the overall array size as zero, possibly leading to a memory clobber. (This would likely qualify as a security bug, were it not that plpython is an untrusted language already.)

I think there are other corner-case issues in this code as well, notably that the error messages don't match the core code and for some ranges of array sizes you'd get "invalid memory alloc request size" rather than the intended message about array size. Really this code has no business doing its own array size calculation at all, so remove the faulty code in favor of using ArrayGetNItems().

Per bug #17912 from Alexander Lakhin. Bug seems to have come in with commit 94aceed31, so back-patch to all supported branches.

Discussion: https://postgr.es/m/17912-82ceed78731d9cdc@postgresql.org

* Fix crashes with CREATE SCHEMA AUTHORIZATION and schema elements (Michael Paquier, 2023-04-28)

CREATE SCHEMA AUTHORIZATION with appended schema elements can lead to crashes when comparing the schema name of the query with the schemas used in the qualification of some clauses in the elements' queries.

The origin of the problem is that the transformation routine for the elements listed in a CREATE SCHEMA query uses as new, expected, schema name the one listed in CreateSchemaStmt itself. However, depending on the query, CreateSchemaStmt.schemaname may be NULL, being computed instead from the role specification of the query given by the AUTHORIZATION clause, which could be either:
- A user name string, with the new schema name being set to the same value as the role given.
- Guessed from CURRENT_ROLE, SESSION_USER or CURRENT_USER, with a new schema name computed from the security context where CREATE SCHEMA is running.

Regression tests are added for CREATE SCHEMA with some appended elements (some of them with schema qualifications), covering also some role specification patterns.

While on it, this simplifies the context structure used during the transformation of the elements listed in a CREATE SCHEMA query by removing the fields for the role specification and the role type. They were not used, and for the role specification this could be confusing as the schema name may be extracted from that at the beginning of CreateSchemaCommand().

This issue has existed for a long time, so backpatch down to all the versions supported.

Reported-by: Song Hongyu
Author: Michael Paquier
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/17909-f65c12dfc5f0451d@postgresql.org
Backpatch-through: 11

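[Editor's sketch] An illustrative query shape (not the exact reproducer from the report): the schema name comes from the AUTHORIZATION role specification while element queries get transformed against it.

    -- CreateSchemaStmt.schemaname is NULL here; it is computed from
    -- the role given by AUTHORIZATION, and the appended elements are
    -- then transformed against that computed name:
    CREATE SCHEMA AUTHORIZATION CURRENT_USER
      CREATE TABLE sc_tab (a int)
      CREATE VIEW sc_view AS SELECT a FROM sc_tab;
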
* Fix vacuum_cost_delay check for balance calculation. (Daniel Gustafsson, 2023-04-25)

Commit 1021bd6a89 excluded autovacuum workers from cost-limit balance calculations when per-relation options were set. The code checks for limit and cost_delay being greater than zero, but since cost_delay can be set to -1 the test needs to check for greater than or equal to zero.

Backpatch to all supported branches since 1021bd6a89 was backpatched all the way at the time.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAD21AoBS7o6Ljt_vfqPQPf67AhzKu3fR0iqk8B=vVYczMugKMQ@mail.gmail.com
Backpatch-through: v11 (all supported branches)

* Fix memory leakage in plpgsql DO blocks that use cast expressions. (Tom Lane, 2023-04-24)

Commit 04fe805a1 modified plpgsql so that datatype casts make use of expressions cached by plancache.c, in place of older code where these expression trees were managed by plpgsql itself. However, I (tgl) forgot that we use a separate, shorter-lived cast info hashtable in DO blocks. The new mechanism thus resulted in session-lifespan leakage of the plancache data once a DO block containing one or more casts terminated.

To fix, split the cast hash table into two parts, one that tracks only the plancache's CachedExpressions and one that tracks the expression state trees generated from them. DO blocks need their own expression state trees and hence their own version of the second hash table, but there's no reason they can't share the CachedExpressions with regular plpgsql functions.

Per report from Ajit Awekar. Back-patch to v12 where the issue was introduced.

Ajit Awekar and Tom Lane

Discussion: https://postgr.es/m/CAHv6PyrNaqdvyWUspzd3txYQguFTBSnhx+m6tS06TnM+KWc_LQ@mail.gmail.com

* Remove duplicate lines of code (Daniel Gustafsson, 2023-04-24)

Commit 6df7a9698bb accidentally included two identical prototypes for default_multirange_selectivity(), and commit 086cf1458c6 added a break; statement where one was already present, thus duplicating it. While there is no bug caused by this, fix by removing the duplicated lines as they provide no value.

Backpatch the fix for duplicate prototypes to v14 and the duplicate break statement fix to all supported branches to avoid backpatching hazards due to the removal.

Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://postgr.es/m/0e69cb60-0176-f6d0-7e15-6478b7d85724@postgrespro.ru

* Avoid character classification in regex escape parsing. (Jeff Davis, 2023-04-21)

For regex escape sequences, just test directly for the relevant ASCII characters rather than using locale-sensitive character classification. This fixes an assertion failure when a locale considers a non-ASCII character, such as "൧", to be a digit.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs49Q6UoKGeT8pBkMtJGJd+16CBFZaaWUk9Du+2ERE5g_YA@mail.gmail.com
Backpatch-through: 11

* Update time zone data files to tzdata release 2023c. (Tom Lane, 2023-04-18)

DST law changes in Egypt, Greenland, Morocco, and Palestine.

When observing Moscow time, Europe/Kirov and Europe/Volgograd now use the abbreviations MSK/MSD instead of numeric abbreviations, for consistency with other timezones observing Moscow time. Also, America/Yellowknife is no longer distinct from America/Edmonton; this affects some pre-1948 timestamps in that area.

* ecpg: Fix handling of strings in ORACLE compat code with SQLDA (Michael Paquier, 2023-04-18)

When compiled with -C ORACLE, ecpg_get_data() had an off-by-one issue where it would incorrectly store the null terminator byte to str[-1] when varcharsize is 0, which is something that can happen when using SQLDA. This would eat 1 byte from the previous field stored, corrupting the results generated.

All the callers of ecpg_get_data() estimate and allocate enough storage for the data received, and the fix of this commit relies on this assumption. Note that this maps to the case where no padding or truncation is required.

This issue was introduced by 3b7ab43 with the Oracle compatibility option, so backpatch down to v11.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20230410.173500.440060475837236886.horikyota.ntt@gmail.com
Backpatch-through: 11

* Avoid trying to write an empty WAL record in log_newpage_range(). (Tom Lane, 2023-04-17)

If the last few pages in the specified range are empty (all zero), then log_newpage_range() could try to emit an empty WAL record containing no FPIs. This at least upsets an Assert in ReserveXLogInsertLocation, and might perhaps have bad real-world consequences in non-assert builds.

This has been broken since log_newpage_range() was introduced, but the case was hard if not impossible to hit before commit 3d6a98457 decided it was okay to leave VM and FSM pages intentionally zero. Nonetheless, it seems prudent to back-patch. log_newpage_range() was added in v12 but later back-patched, so this affects all supported branches.

Matthias van de Meent, per report from Justin Pryzby

Discussion: https://postgr.es/m/ZD1daibg4RF50IOj@telsasoft.com

* Fix assignment to array of domain over composite, redux. (Tom Lane, 2023-04-15)

Commit 3e310d837 taught isAssignmentIndirectionExpr() to look through CoerceToDomain nodes. That's not sufficient, because since commit 04fe805a1 it's been possible for the planner to simplify CoerceToDomain to RelabelType when the domain has no constraints to enforce. So we need to look through RelabelType too.

Per bug #17897 from Alexander Lakhin. Although 3e310d837 was back-patched to v11, it seems sufficient to apply this change to v12 and later, since 04fe805a1 came in in v12.

Dmitry Dolgov

Discussion: https://postgr.es/m/17897-4216c546c3874044@postgresql.org

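[Editor's sketch] A hedged sketch of the construct involved (all names invented): a constraint-free domain over a composite lets the planner turn CoerceToDomain into RelabelType, which the assignment-indirection check must also see through.

    CREATE TYPE two_ints AS (a int, b int);
    CREATE DOMAIN di AS two_ints;  -- no constraints, so the planner may
                                   -- simplify CoerceToDomain to RelabelType
    CREATE TABLE dt (arr di[]);
    INSERT INTO dt VALUES (ARRAY[ROW(1, 2)::di]);
    UPDATE dt SET arr[1].b = 99;   -- assignment through array of domain
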
* Fix incorrect partition pruning logic for boolean partitioned tables (David Rowley, 2023-04-14)

The partition pruning logic assumed that "b IS NOT true" was exactly the same as "b IS FALSE". This is not the case when considering NULL values. Fix this so we correctly include any partition which could hold NULL values for the NOT case.

Additionally, this fixes a bug in the partition pruning code which handles partitioned tables partitioned like ((NOT boolcol)). This is a seemingly unlikely schema design, and it was untested and also broken. Here we add tests for the ((NOT boolcol)) case and insert some actual data into those tables and verify we do get the correct rows back when running queries. I've also adjusted the existing boolpart tests to include some data and verify we get the correct results too.

Both the bugs being fixed here could lead to incorrect query results with fewer rows being returned than expected. No additional rows could have been returned accidentally.

In passing, remove a needless ternary expression. It's simpler just to pass !is_not_clause to makeBoolConst(). It makes sense to do this so the code is consistent with the bug fix in the "else if" condition just below.

David Kimura did submit a patch to fix the first of the issues here, but that's not what's being committed here.

Reported-by: David Kimura
Reviewed-by: Richard Guo, David Kimura
Discussion: https://postgr.es/m/CAHnPFjQ5qxs6J_p+g8=ww7GQvfn71_JE+Tygj0S7RdRci1uwPw@mail.gmail.com
Backpatch-through: 11, all supported versions

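[Editor's sketch] A minimal illustration of the IS NOT TRUE case (names invented):

    CREATE TABLE bp (b bool) PARTITION BY LIST (b);
    CREATE TABLE bp_true  PARTITION OF bp FOR VALUES IN (true);
    CREATE TABLE bp_false PARTITION OF bp FOR VALUES IN (false);
    CREATE TABLE bp_null  PARTITION OF bp FOR VALUES IN (NULL);
    INSERT INTO bp VALUES (true), (false), (NULL);

    -- "b IS NOT TRUE" matches false AND NULL, so pruning must keep
    -- bp_null as well as bp_false:
    SELECT * FROM bp WHERE b IS NOT TRUE;
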
* Fix parallel-safety marking when moving initplans to another node. (Tom Lane, 2023-04-12)

Our policy since commit ab77a5a45 has been that a plan node having any initplans is automatically not parallel-safe. (This could be relaxed, but not today.) clean_up_removed_plan_level neglected this, and could attach initplans to a parallel-safe child plan node without clearing the plan's parallel-safe flag. That could lead to "subplan was not initialized" errors at runtime, in case an initplan referenced another one and only the referencing one got transmitted to parallel workers.

The fix in clean_up_removed_plan_level is trivial enough. materialize_finished_plan also moves initplans from one node to another, but it's okay because it already copies the source node's parallel_safe flag. The other place that does this kind of thing is standard_planner's hack to inject a top-level Gather when debug_parallel_query is active. But that's actually dead code given that we're correctly enforcing the "initplans aren't parallel safe" rule, so just replace it with an Assert that there are no initplans. Also improve some related comments.

Normally we'd add a regression test case for this sort of bug. The mistake itself is already reached by existing tests, but there is accidentally no visible problem. The only known test case that creates an actual failure seems too indirect and fragile to justify keeping it as a regression test (not least because it fails to fail in v11, though the bug is clearly present there too).

Per report from Justin Pryzby. Back-patch to all supported branches.

Discussion: https://postgr.es/m/ZDVt6MaNWkRDO1LQ@telsasoft.com

* For Kerberos testing, disable DNS lookups (Stephen Frost, 2023-04-07)

Similar to 8dff2f224, this disables DNS lookups by the Kerberos library to look up the KDC and the realm while the Kerberos tests are running. In some environments, these lookups can take a long time and end up timing out and causing tests to fail. Further, since this isn't really our domain, we shouldn't be sending out these DNS requests during our tests.

* For Kerberos testing, disable reverse DNS lookup (Stephen Frost, 2023-04-07)

In our Kerberos test suite, there isn't much need to worry about the normal canonicalization that Kerberos provides by looking up the reverse DNS for the IP address connected to, and in some cases it can actively cause problems (e.g., a captive-portal wifi network where the normally unresolvable localhost address we use ends up being resolved anyway, and not to the domain we are using for testing, causing the entire regression test to fail with errors about not being able to get a TGT for the remote realm for cross-realm trust).

Therefore, disable it by adding rdns = false into the krb5.conf that's generated for the test.

Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/Y/QD2zDkDYQA1GQt@tamriel.snowman.net

* Stabilize just-added regression test cases. (Tom Lane, 2023-04-06)

The tests added by commits 029dea882 et al turn out to produce different output under -DRANDOMIZE_ALLOCATED_MEMORY. This is not a bug exactly: that flag causes coerce_type() to invoke the input function twice when coercing an unknown-type literal to a specific type. So you get tsqueryin's bleat about an empty tsquery twice. Revise the test query to avoid that.

Discussion: https://postgr.es/m/20230406213813.uep7plg6lvcywujo@awork3.anarazel.de

* Fix ts_headline() edge cases for empty query and empty search text. (Tom Lane, 2023-04-06)

tsquery's GETQUERY() macro is only safe to apply to a tsquery that is known non-empty; otherwise it gives a pointer to garbage. Before commit 5a617d75d, ts_headline() avoided this pitfall, but only in a very indirect, nonobvious way. (hlCover could not reach its TS_execute call, because if the query contains no lexemes then hlFirstIndex would surely return -1.) After that commit, it fell into the trap, resulting in weird errors such as "unrecognized operator" and/or valgrind complaints.

In HEAD, fix this by not calling TS_execute_locations() at all for an empty query. In the back branches, add a defensive check to hlCover() --- that's not fixing any live bug, but I judge the code a bit too fragile as-is.

Also, both mark_hl_fragments() and mark_hl_words() were careless about the possibility of empty search text: in the cases where no match has been found, they'd end up telling mark_fragment() to mark from word indexes 0 to 0 inclusive, even when there is no word 0. This is harmless since we over-allocated the prs->words array, but it does annoy valgrind. Fix so that the end index is -1 and thus mark_fragment() will do nothing in such cases.

Bottom line is that this fixes a live bug in HEAD, but in the back branches it's only getting rid of a valgrind nitpick. Back-patch anyway.

Per report from Alexander Lakhin.

Discussion: https://postgr.es/m/c27f642d-020b-01ff-ae61-086af287c4fd@gmail.com

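[Editor's sketch] The two edge cases, as minimal queries (the empty tsquery emits a notice about containing no lexemes but is accepted):

    -- Empty query: GETQUERY() must not be applied to an empty tsquery.
    SELECT ts_headline('some search text', ''::tsquery);
    -- Empty search text: no match, so there is no word 0 to mark.
    SELECT ts_headline('', to_tsquery('foo'));
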
* Reject system columns as elements of foreign keys. (Tom Lane, 2023-03-31)

Up through v11 it was sensible to use the "oid" system column as a foreign key column, but since that was removed there's no visible usefulness in making any of the remaining system columns a foreign key. Moreover, since the TupleTableSlot rewrites in v12, such cases actively fail because of implicit assumptions that only user columns appear in foreign keys. The lack of complaints about that seems like good evidence that no one is trying to do it. Hence, rather than trying to repair those assumptions (of which there are at least two, maybe more), let's just forbid the case up front.

Per this patch, a system column in either the referenced or referencing side of a foreign key will draw this error; however, putting one in the referenced side would have failed later anyway, since we don't allow unique indexes to be made on system columns.

Per bug #17877 from Alexander Lakhin. Back-patch to v12; the case still appears to work in v11, so we shouldn't break it there.

Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org

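[Editor's sketch] A minimal illustration of the now-forbidden case (names invented):

    CREATE TABLE pk (id oid PRIMARY KEY);
    -- Using a system column on the referencing side now draws an
    -- error up front, rather than failing later at runtime:
    CREATE TABLE fk (id int,
                     FOREIGN KEY (tableoid) REFERENCES pk (id));
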
* Ensure acquire_inherited_sample_rows sets its output parameters. (Tom Lane, 2023-03-31)

The totalrows/totaldeadrows outputs were left uninitialized in cases where we find no analyzable child tables of a partitioned table. This could lead to setting the partitioned table's pg_class.reltuples value to garbage. It's not clear that that would have any very bad effects in practice, but fix it anyway because it's making valgrind unhappy.

Reported and diagnosed by Alexander Lakhin (bug #17880).

Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org

* Fix List memory issue in transformColumnDefinition (David Rowley, 2023-03-31)

When calling generateSerialExtraStmts(), we would pass in the constraint->options. In some cases, generateSerialExtraStmts() would modify the referenced List to remove elements from it, but doing so is invalid without assigning the list back to all variables that point to it. In the particular reported problem case, the List became empty, in which case it became NIL, but the passed-in constraint->options didn't get to find out about that and was left pointing to free'd memory.

To fix this, just perform a list_copy() inside generateSerialExtraStmts(). We could do a list_copy() just before we perform the delete from the list, however, that seems less robust. Let's make sure the generated CreateSeqStmt gets a completely different copy of the list to be safe.

Bug: #17879
Reported-by: Fei Changhong
Diagnosed-by: Fei Changhong
Discussion: https://postgr.es/m/17879-b7dfb5debee58ff5@postgresql.org
Backpatch-through: 11, all supported versions

* Fix dereference of dangling pointer in GiST index buffering build. (Tom Lane, 2023-03-29)

gistBuildCallback tried to fetch the size of an index tuple that might have already been freed by gistProcessEmptyingQueue. While this seems to usually be harmless in production builds, in principle it could result in a SIGSEGV, or more likely a bogus value for indtuplesSize leading to poor page-split decisions later in the build.

The memory management here is confusing and could stand to be refactored, but for the moment it seems to be enough to fetch the tuple size sooner. AFAICT the indtuples[Size] totals aren't used in between these places; even if they were, the updated values shouldn't be any worse to use. So just move the incrementing of the totals up.

It's not very clear why our valgrind-using buildfarm animals haven't noticed this problem, because the relevant code path does seem to be exercised according to the code coverage report. I think the reason that we didn't fix this bug after the first report is that I'd wanted to try to understand that better. However, now that it's been re-discovered let's just be pragmatic and fix it already.

Original report by Alexander Lakhin (bug #16329), later rediscovered by Egor Chindyaskin (bug #17874).

Patch by Alexander Lakhin (commentary by Pavel Borisov and me). Back-patch to all supported branches.

Discussion: https://postgr.es/m/16329-7a6aa9b6fa1118a1@postgresql.org
Discussion: https://postgr.es/m/17874-63ca6c7ce42d2103@postgresql.org

* Reject attempts to alter composite types used in indexes. (Tom Lane, 2023-03-27)

find_composite_type_dependencies() ignored indexes, which is a poor decision because an expression index could have a stored column of a composite (or other container) type even when the underlying table does not. Teach it to detect such cases and error out. We have to work a bit harder than for other relations because the pg_depend entry won't identify the specific index column of concern, but it's not much new code.

This does not address bug #17872's original complaint that dropping a column in such a type might lead to violations of the uniqueness property that a unique index is supposed to ensure. That seems of much less concern to me because it won't lead to crashes.

Per bug #17872 from Alexander Lakhin. Back-patch to all supported branches.

Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org

* Fix oversights in array manipulation. (Tom Lane, 2023-03-26)

The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/.

While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it.

Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice.

Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches.

Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org

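[Editor's sketch] The nested-arrays path is reached by ordinary multidimensional ARRAY[] expressions; a NULL element makes the nulls bitmap relevant:

    -- Builds the result from sub-arrays; the trailing bits of the
    -- nulls bitmap are now zeroed (palloc0) like everywhere else.
    SELECT ARRAY[ARRAY[1, NULL], ARRAY[3, 4]];
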
* Ignore generated columns during apply of update/delete. (Amit Kapila, 2023-03-23)

We fail to apply updates and deletes when REPLICA IDENTITY FULL is used for a table having generated columns. Previously, we did not ignore generated columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes.

Author: Onder Kalaci
Reviewed-by: Shi yu, Amit Kapila
Backpatch-through: 12
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com

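[Editor's sketch] The affected table shape, as illustrative DDL (logical replication setup omitted; names invented):

    CREATE TABLE gen_t (
        a int,
        b int GENERATED ALWAYS AS (a * 2) STORED
    );
    ALTER TABLE gen_t REPLICA IDENTITY FULL;
    -- During apply of UPDATE/DELETE with REPLICA IDENTITY FULL, tuple
    -- comparison now skips generated columns such as b.
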
* Ignore dropped columns during apply of update/delete. (Amit Kapila, 2023-03-21)

We fail to apply updates and deletes when REPLICA IDENTITY FULL is used for a table having dropped columns. Previously, we did not ignore dropped columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes.

Author: Onder Kalaci, Shi yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com

* Fix race in parallel hash join batch cleanup, take II. (Thomas Munro, 2023-03-21)

With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE).

An earlier attempt to fix this (commit 3b8981b6, later reverted) missed one special case. When the inner side is empty (the "empty inner optimization"), the build barrier would only make it to PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up.

Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault.

Back-patch to all supported releases.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: David Geier <geidav.pg@gmail.com>
Tested-by: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz

* Fix pg_dump for hash partitioning on enum columns. (Tom Lane, 2023-03-17)

Hash partitioning on an enum is problematic because the hash codes are derived from the OIDs assigned to the enum values, which will almost certainly be different after a dump-and-reload than they were before. This means that some rows probably end up in different partitions than before, causing restore to fail because of partition constraint violations. (pg_upgrade dodges this problem by using hacks to force the enum values to keep the same OIDs, but that's not possible nor desirable for pg_dump.)

Users can work around that by specifying --load-via-partition-root, but since that's a dump-time not restore-time decision, one might find out the need for it far too late. Instead, teach pg_dump to apply that option automatically when dealing with a partitioned table that has hash-on-enum partitioning.

Also deal with a pre-existing issue for --load-via-partition-root mode: in a parallel restore, we try to TRUNCATE target tables just before loading them, in order to enable some backend optimizations. This is bad when using --load-via-partition-root because (a) we're likely to suffer deadlocks from restore jobs trying to restore rows into other partitions than they came from, and (b) if we miss getting a deadlock we might still lose data due to a TRUNCATE removing rows from some already-completed restore job.

The fix for this is conceptually simple: just don't TRUNCATE if we're dealing with a --load-via-partition-root case. The tricky bit is for pg_restore to identify those cases. In dumps using COPY commands we can inspect each COPY command to see if it targets the nominal target table or some ancestor. However, in dumps using INSERT commands it's pretty impractical to examine the INSERTs in advance. To provide a solution for that going forward, modify pg_dump to mark TABLE DATA items that are using --load-via-partition-root with a comment. (This change also responds to a complaint from Robert Haas that the dump output for --load-via-partition-root is pretty confusing.) pg_restore checks for the special comment as well as checking the COPY command if present.

This will fail to identify the combination of --load-via-partition-root and --inserts in pre-existing dump files, but that should be a pretty rare case in the field. If it does happen you will probably get a deadlock failure that you can work around by not using parallel restore, which is the same as before this bug fix.

Having done this, there seems no remaining reason for the alarmism in the pg_dump man page about combining --load-via-partition-root with parallel restore, so remove that warning.

Patch by me; thanks to Julien Rouhaud for review. Back-patch to v11 where hash partitioning was introduced.

Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us

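[Editor's sketch] The kind of schema that now triggers automatic --load-via-partition-root (names invented):

    CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
    CREATE TABLE moods (m mood) PARTITION BY HASH (m);
    CREATE TABLE moods_0 PARTITION OF moods
        FOR VALUES WITH (MODULUS 2, REMAINDER 0);
    CREATE TABLE moods_1 PARTITION OF moods
        FOR VALUES WITH (MODULUS 2, REMAINDER 1);
    -- Hash codes are derived from the enum values' OIDs, which differ
    -- after dump-and-reload, so rows could land in other partitions.
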
* tests: Prevent syslog activity by slapd, take 2 (Andres Freund, 2023-03-16)

Unfortunately it turns out that the logfile-only option added in b9f8d1cbad7 is only available in openldap starting in 2.6. Luckily the option to control the log level (loglevel/-s) has been around for much longer. As it turns out, loglevel/-s only controls what goes into syslog, not what ends up in the file specified with 'logfile' and stderr.

While we currently are specifying 'logfile', nothing ends up in it, as the option only controls debug messages, and we didn't set a debug level. The debug level can only be configured on the commandline and also prevents forking. That'd require larger changes, so this commit doesn't tackle that issue.

Specify the syslog level when starting slapd using -s, as that makes it possible to prevent all syslog messages if one uses '0' instead of 'none', while loglevel doesn't prevent the first message.

Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-

* tests: Minimize syslog activity by slapd (Andres Freund, 2023-03-16)

Until now the tests using slapd spammed syslog for every connection / query. Use logfile-only to prevent syslog activity. Unfortunately that only takes effect after logging the first message, but that's still much better than the prior situation.

Discussion: https://postgr.es/m/20230311233708.3yjdbjkly2q4gq2j@awork3.anarazel.de
Backpatch: 11-

* Small tidyup for commit d41a178b, part II. (Thomas Munro, 2023-03-17)

Further to commit 6a9229da, checking for NULL is now redundant. An "out of memory" error would have been thrown already by palloc() and treated as FATAL, so we can delete a few more lines.

Back-patch to all releases, like those other commits.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/4040668.1679013388%40sss.pgh.pa.us

* Work around spurious compiler warning in inet operators (Andres Freund, 2023-03-16)

gcc 12+ has complaints like the following:

    ../../../../../pgsql/src/backend/utils/adt/network.c: In function 'inetnot':
    ../../../../../pgsql/src/backend/utils/adt/network.c:1893:34: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
     1893 |                 pdst[nb] = ~pip[nb];
          |                 ~~~~~~~~~^~~~~~~~~~
    ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16
       27 |         unsigned char ipaddr[16];   /* up to 128 bits of address */
          |                       ^~~~~~
    ../../../../../pgsql/src/include/utils/inet.h:27:23: note: at offset -1 into destination object 'ipaddr' of size 16

This is due to a compiler bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104986

It has been a year since the bug was reported without getting fixed. As the warnings are verbose and use of gcc 12 is becoming more common, it seems worth working around the bug. Particularly because a simple reformulation of the loop condition fixes the issue and isn't any less readable.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/144536.1648326206@sss.pgh.pa.us
Backpatch: 11-

* Small tidyup for commit d41a178b. (Thomas Munro, 2023-03-17)

A comment was left behind claiming that we needed to use malloc() rather than palloc() because the corresponding free would run in another thread, but that's not true anymore. Remove that comment. And, with the reason being gone, we might as well actually use palloc().

Back-patch to supported releases, like d41a178b.

Discussion: https://postgr.es/m/CA%2BhUKG%2BpdM9v3Jv4tc2BFx2jh_daY3uzUyAGBhtDkotEQDNPYw%40mail.gmail.com

* Fix waitpid() emulation on Windows. (Thomas Munro, 2023-03-15)

Our waitpid() emulation didn't prevent a PID from being recycled by the OS before the call to waitpid(). The postmaster could finish up tracking more than one child process with the same PID, and confuse them.

Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(), so that resources are released synchronously. The process and PID continue to exist until we close the process handle, which only happens once we're ready to adjust our book-keeping of running children.

This seems to explain a couple of failures on CI. It had never been reported before, despite the code being as old as the Windows port. Perhaps Windows started recycling PIDs more rapidly, or perhaps timing changes due to commit 7389aad6 made it more likely to break.

Thanks to Alexander Lakhin for analysis and Andres Freund for tracking down the root cause.

Back-patch to all supported branches.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de

* Fix corner case bug in numeric to_char() some more. (Tom Lane, 2023-03-14)

The band-aid applied in commit f0bedf3e4 turns out to still need some work: it made sure we didn't set Np->last_relevant too small (to the left of the decimal point), but it didn't prevent setting it too large (off the end of the partially-converted string). This could result in fetching data beyond the end of the allocated space, which with very bad luck could cause a SIGSEGV, though I don't see any hazard of interesting memory disclosure.

Per bug #17839 from Thiago Nunes. The bug's pretty ancient, so back-patch to all supported versions.

Discussion: https://postgr.es/m/17839-aada50db24d7b0da@postgresql.org

* Fix JSON error reporting for many cases of erroneous string values. (Tom Lane, 2023-03-13)

The majority of error exit cases in json_lex_string() failed to set lex->token_terminator, causing problems for the error context reporting code: it would see token_terminator less than token_start and do something more or less nuts. In v14 and up the end result could be as bad as a crash in report_json_context(). Older versions accidentally avoided that fate; but all versions produce error context lines that are far less useful than intended, because they'd stop at the end of the prior token instead of continuing to where the actually-bad input is.

To fix, invent some macros that make it less notationally painful to do the right thing. Also add documentation about what the function is actually required to do; and in >= v14, add an assertion in report_json_context about token_terminator being sufficiently far advanced.

Per report from Nikolay Shaplov. Back-patch to all supported versions.

Discussion: https://postgr.es/m/7332649.x5DLKWyVIX@thinkpad-pgpro

* Fix failure to detect some cases of improperly-nested aggregates. (Tom Lane, 2023-03-13)

check_agg_arguments_walker() supposed that it needn't descend into the arguments of a lower-level aggregate function, but this is just wrong in the presence of multiple levels of sub-select. The oversight would lead to executor failures on queries that should be rejected. (Prior to v11, they actually were rejected, thanks to a "redundant" execution-time check.)

Per bug #17835 from Anban Company. Back-patch to all supported branches.

Discussion: https://postgr.es/m/17835-4f29f3098b2d0ba4@postgresql.org

* Fix inconsistent error handling for GSS encryption in PQconnectPoll() (Michael Paquier, 2023-03-13)

The error cases for TLS and GSS encryption were inconsistent. After TLS fails, the connection is marked as dead and follow-up calls of PQconnectPoll() would return immediately, but GSS encryption was not doing that, so the connection would still have been allowed to enter the GSS handling code. This was handled incorrectly when gssencmode was set to "require". "prefer" was working correctly, and this could not happen under "disable" as GSS encryption would not be attempted.

This commit makes the error handling of GSS encryption on par with the TLS portion, fixing the case of gssencmode=require.

Reported-by: Jacob Champion
Author: Michael Paquier
Reviewed-by: Jacob Champion, Stephen Frost
Discussion: https://postgr.es/m/23787477-5fe1-a161-6d2a-e459f74c4713@timescale.com
Backpatch-through: 12

* Mark unsafe_tests module as not runnable with installcheck (Andrew Dunstan, 2023-03-12)

This was an omission in the original creation of the module. Also slightly adjust some wording to avoid a double "is".

Backpatch the non-meson piece of this to release 12, where the module was introduced.

Discussion: https://postgr.es/m/be869e1c-8e3f-4cde-8609-212c899cccf9@dunslane.net