postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
...
*	Rename parser token REF to REF_P to avoid a symbol conflict.	Tom Lane	2022-10-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the latest version of Apple's macOS SDK, <sys/socket.h> fails to compile if "REF" is #define'd as something. Apple may or may not agree that this is a bug, and even if they do accept the bug report I filed, they probably won't fix it very quickly. In the meantime, our back branches will all fail to compile gram.y. v15 and HEAD currently escape the problem thanks to the refactoring done in 98e93a1fc, but that's purely accidental. Moreover, since that patch removed a widely-visible inclusion of <netdb.h>, back-patching it seems too likely to break third-party code. Instead, change the token's code name to REF_P, following our usual convention for naming parser tokens that are likely to have symbol conflicts. The effects of that should be localized to the grammar and immediately surrounding files, so it seems like a safer answer. Per project policy that we want to keep recently-out-of-support branches buildable on modern systems, back-patch all the way to 9.2. Discussion: https://postgr.es/m/1803927.1665938411@sss.pgh.pa.us
*	Use libc's snprintf, not sprintf, for special cases in snprintf.c.	Tom Lane	2022-10-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	snprintf.c has always fallen back on libc's *printf implementation when printing pointers (%p) and floats. When this code originated, we were still supporting some platforms that lacked native snprintf, so we used sprintf for that. That's not actually unsafe in our usage, but nonetheless builds on macOS are starting to complain about sprintf being unconditionally deprecated; and I wouldn't be surprised if other platforms follow suit. There seems little reason to believe that any platform supporting C99 wouldn't have standards-compliant snprintf, so let's just use that instead to suppress such warnings. Back-patch to v12, which is where we started to require C99. It's also where we started to use our snprintf.c everywhere, so this wouldn't be enough to suppress the warning in older branches anyway --- that is, in older branches these aren't necessarily all our usages of libc's sprintf. It is enough in v12+ because any deprecation annotation attached to libc's sprintf won't apply to pg_sprintf. (Whether all our usages of pg_sprintf are adequately safe is not a matter I intend to address here, but perhaps it could do with some review.) Per report from Andres Freund and local testing. Discussion: https://postgr.es/m/20221015211955.q4cwbsfkyk3c4ty3@awork3.anarazel.de
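The bounded-formatting idea in the commit above can be sketched as follows. This is a minimal illustration, not the code in snprintf.c: the helper name `format_double` is hypothetical, and the only library call relied on is C99 `snprintf`.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a double into a caller-supplied buffer by
 * delegating to libc's snprintf rather than sprintf.
 *
 * Returns the number of bytes the complete result needs, not counting the
 * terminating NUL, or a negative value on an encoding error.
 */
int
format_double(char *buf, size_t buflen, double value, int precision)
{
    /* snprintf writes at most buflen bytes, including the terminating NUL. */
    return snprintf(buf, buflen, "%.*f", precision, value);
}
```

Because the return value reports the length that was needed, a caller can detect truncation by comparing it with the buffer size.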
*	Harden pmsignal.c against clobbered shared memory.	Tom Lane	2022-10-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The postmaster is not supposed to do anything that depends fundamentally on shared memory contents, because that creates the risk that a backend crash that trashes shared memory will take the postmaster down with it, preventing automatic recovery. In commit 969d7cd43 I lost sight of this principle and coded AssignPostmasterChildSlot() in such a way that it could fail or even crash if the shared PMSignalState structure became corrupted. Remarkably, we've not seen field reports of such crashes; but I managed to induce one while testing the recent changes around palloc chunk headers. To fix, make a semi-duplicative state array inside the postmaster so that we need consult only local state while choosing a "child slot" for a new backend. Ensure that other postmaster-executed routines in pmsignal.c don't have critical dependencies on the shared state, either. Corruption of PMSignalState might now lead ReleasePostmasterChildSlot() to conclude that backend X failed, when actually backend Y was the one that trashed things. But that doesn't matter, because we'll force a cluster-wide reset regardless. Back-patch to all supported branches, since this is an old bug. Discussion: https://postgr.es/m/3436789.1665187055@sss.pgh.pa.us
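The "semi-duplicative state array" idea described above can be sketched roughly as below. All names here (`MAX_CHILD_SLOTS`, `local_slot_in_use`, `shared_slot_state`, `assign_child_slot`, `release_child_slot`) are hypothetical, and the sketch deliberately simplifies: the real pmsignal.c also examines the shared state on release to decide whether a child exited cleanly.

```c
#include <stdbool.h>

#define MAX_CHILD_SLOTS 8       /* hypothetical fixed limit for the sketch */

/* Private to the parent process in this sketch; never in shared memory. */
static bool local_slot_in_use[MAX_CHILD_SLOTS];

/* Stand-in for the shared array that child processes could clobber. */
static volatile int shared_slot_state[MAX_CHILD_SLOTS];

/*
 * Choose a free slot by consulting only the private array, then publish the
 * choice to the shared copy.  Returns a 1-based slot number, or 0 if full.
 */
int
assign_child_slot(void)
{
    for (int i = 0; i < MAX_CHILD_SLOTS; i++)
    {
        if (!local_slot_in_use[i])
        {
            local_slot_in_use[i] = true;
            shared_slot_state[i] = 1;   /* written here, never trusted later */
            return i + 1;
        }
    }
    return 0;
}

/* Release a slot; only the private array decides whether it was assigned. */
bool
release_child_slot(int slot)
{
    bool        was_in_use;

    if (slot < 1 || slot > MAX_CHILD_SLOTS)
        return false;
    was_in_use = local_slot_in_use[slot - 1];
    local_slot_in_use[slot - 1] = false;
    shared_slot_state[slot - 1] = 0;
    return was_in_use;
}
```

The design point is that garbage in `shared_slot_state` cannot change which slot is chosen, because the choice never reads it.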
*	Yet further fixes for multi-row VALUES lists for updatable views.	Tom Lane	2022-10-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DEFAULT markers appearing in an INSERT on an updatable view could be mis-processed if they were in a multi-row VALUES clause. This would lead to strange errors such as "cache lookup failed for type NNNN", or in older branches even to crashes. The cause is that commit 41531e42d tried to re-use rewriteValuesRTE() to remove any SetToDefault nodes (that hadn't previously been replaced by the view's own default values) appearing in "product" queries, that is DO ALSO queries. That's fundamentally wrong because the DO ALSO queries might not even be INSERTs; and even if they are, their targetlists don't necessarily match the view's column list, so that almost all the logic in rewriteValuesRTE() is inapplicable. What we want is a narrow focus on replacing any such nodes with NULL constants. (That is, in this context we are interpreting the defaults as being strictly those of the view itself; and we already replaced any that aren't NULL.) We could add still more !force_nulls tests to further lobotomize rewriteValuesRTE(); but it seems cleaner to split out this case to a new function, restoring rewriteValuesRTE() to the charter it had before. Per bug #17633 from jiye_sw. Patch by me, but thanks to Richard Guo and Japin Li for initial investigation. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/17633-98cc85e1fa91e905@postgresql.org
*	Ensure all perl test modules are installed	Alvaro Herrera	2022-10-11
\| \| \| \| \| \| \| \| \| \| \| \|	PostgreSQL::Test::Cluster and ::Utils were not being installed. This is very hard to notice, as it only seems to affect external modules that want to run tests from 15 back in earlier versions. Oversight in b235d41d9646. This applies only to branches 14 and back, because 15 had already been made correct in commit b3b4d8e68ae8. Discussion: https://postgr.es/m/20221010093415.poplkyn7pjeiv2y7@alvherre.pgsql
*	Fix self-referencing foreign keys with partitioned tables	Alvaro Herrera	2022-10-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are a number of bugs in this area. Two of them are fixed here, namely: 1. get_relation_idx_constraint_oid does not restrict the type of constraint that's returned, so with sufficient bad luck it can return the OID of a foreign key constraint. This has the effect that a primary key in a partition can end up as a child of a foreign key, which makes no sense (it needs to be the child of the equivalent primary key.) Change the API contract so that only index-backed constraints are returned, mimicking get_constraint_index(). 2. Both CloneFkReferenced and CloneFkReferencing clone a self-referencing foreign key, so the partition ends up with a duplicate foreign key. Change the former function to ignore such constraints. Add some tests to verify that things are better now. (However, these new tests show some additional misbehavior that will be fixed later -- namely that there's a constraint marked NOT VALID.) Backpatch to 12, where these constraints are possible at all. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
*	Avoid improbable PANIC during heap_update, redux.	Tom Lane	2022-09-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 34f581c39 intended to ensure that RelationGetBufferForTuple would acquire a visibility-map page pin in case the otherBuffer's all-visible bit had become set since we last had lock on that page. But I missed a case: when we're extending the relation, VM concerns were dealt with only in the relatively-less-likely case that we fail to conditionally lock the otherBuffer. I think I'd believed that we couldn't need to worry about it if the conditional lock succeeds, which is true for the target buffer; but the otherBuffer was unlocked for awhile so its bit might be set anyway. So we need to do the GetVisibilityMapPins dance, and then also recheck the page's free space, in both cases. Per report from Jaime Casanova. Back-patch to v12 as the previous patch was (although there's still no evidence that the bug is reachable pre-v14). Discussion: https://postgr.es/m/E1lWLjP-00006Y-Ml@gemulon.postgresql.org
*	Change some errdetail() to errdetail_internal()	Alvaro Herrera	2022-09-28
\| \| \| \| \| \| \| \| \| \| \| \|	This prevents marking the argument string for translation for gettext, and it also prevents the given string (which is already translated) from being translated at runtime. Also, mark the strings used as arguments to check_rolespec_name for translation. Backpatch all the way back as appropriate. None of this is caught by any tests (necessarily so), so I verified it manually.
*	Fix tupdesc lifespan bug with AfterTriggersTableData.storeslot.	Tom Lane	2022-09-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 25936fd46 adjusted things so that the "storeslot" we use for remapping trigger tuples would have adequate lifespan, but it neglected to consider the lifespan of the tuple descriptor that the slot depends on. It turns out that in at least some cases, the tupdesc we are passing is a refcounted tupdesc, and the refcount for the slot's reference can get assigned to a resource owner having different lifespan than the slot does. That leads to an error like "tupdesc reference 0x7fdef236a1b8 is not owned by resource owner SubTransaction". Worse, because of a second oversight in the same commit, we'd try to free the same tupdesc refcount again while cleaning up after that error, leading to recursive errors and an "ERRORDATA_STACK_SIZE exceeded" PANIC. To fix the initial problem, let's just make a non-refcounted copy of the tupdesc we're supposed to use. That seems likely to guard against additional problems, since there's no strong reason for this code to assume that what it's given is a refcounted tupdesc; in which case there's an independent hazard of the tupdesc having shorter lifespan than the slot does. (I didn't bother trying to free said copy, since it should go away anyway when the (sub) transaction context is cleaned up.) The other issue can be fixed by making the code added to AfterTriggerFreeQuery work like the rest of that function, ie be sure that it doesn't try to free the same slot twice in the event of recursive error cleanup. While here, also clean up minor stylistic issues in the test case added by 25936fd46: don't use "create or replace function", as any name collision within the tests is likely to have ill effects that that won't mask; and don't use function names as generic as trigger_function1, especially if you're not going to drop them at the end of the test stanza. Per bug #17607 from Thomas Mc Kay. 
Back-patch to v12, as the previous fix was. Discussion: https://postgr.es/m/17607-bd8ccc81226f7f80@postgresql.org
*	Add missing source files to pg_waldump/nls.mk	Alvaro Herrera	2022-09-25
*	Fix race condition where heap_delete() fails to pin VM page.	Jeff Davis	2022-09-22
\| \| \| \| \| \| \| \| \| \|	Similar to 5f12bc94dc, the code must re-check PageIsAllVisible() after buffer lock is re-acquired. Backpatching to the same version, 12. Discussion: https://postgr.es/m/CAEP4nAw9jYQDKd_5Y+-s2E4YiUJq1vqiikFjYGpLShtp-K3gag@mail.gmail.com Reported-by: Robins Tharakan Reviewed-by: Robins Tharakan Backpatch-through: 12
*	Fix thinko in comment.	Etsuro Fujita	2022-09-22
\| \| \| \| \| \| \|	This comment has been wrong since its introduction in commit 0d5f05cde; backpatch to v12 where that came in. Discussion: https://postgr.es/m/CAPmGK14VGf-xQjGQN4o1QyAbXAaxugU5%3DqfcmTDh1iufUDnV_w%40mail.gmail.com
*	Suppress more variable-set-but-not-used warnings from clang 15.	Tom Lane	2022-09-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mop up assorted set-but-not-used warnings in the back branches. This includes back-patching relevant fixes from commit 152c9f7b8 the rest of the way, but there are also several cases that did not appear in HEAD. Some of those we'd fixed in a retail way but not back-patched, and others I think just got rewritten out of existence during nearby refactoring. While here, also back-patch b1980f6d0 (PL/Tcl: Fix compiler warnings with Tcl 8.6) into 9.2, so that that branch compiles warning-free with modern Tcl. Per project policy, this is a candidate for back-patching into out-of-support branches: it suppresses annoying compiler warnings but changes no behavior. Hence, back-patch all the way to 9.2. Discussion: https://postgr.es/m/514615.1663615243@sss.pgh.pa.us
*	Suppress variable-set-but-not-used warnings from clang 15.	Tom Lane	2022-09-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	clang 15+ will issue a set-but-not-used warning when the only use of a variable is in autoincrements (e.g., "foo++;"). That's perfectly sensible, but it detects a few more cases that we'd not noticed before. Silence the warnings with our usual methods, such as PG_USED_FOR_ASSERTS_ONLY, or in one case by actually removing a useless variable. One thing that we can't nicely get rid of is that with %pure-parser, Bison emits "yynerrs" as a local variable that falls foul of this warning. To silence those, I inserted "(void) yynerrs;" in the top-level productions of affected grammars. Per recently-established project policy, this is a candidate for back-patching into out-of-support branches: it suppresses annoying compiler warnings but changes no behavior. Hence, back-patch to 9.5, which is as far as these patches go without issues. (A preliminary check shows that the prior branches need some other set-but-not-used cleanups too, so I'll leave them for another day.) Discussion: https://postgr.es/m/514615.1663615243@sss.pgh.pa.us
*	Future-proof the recursion inside ExecShutdownNode().	Tom Lane	2022-09-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The API contract for planstate_tree_walker() callbacks is that they take a PlanState pointer and a context pointer. Somebody figured they could save a couple lines of code by ignoring that, and passing ExecShutdownNode itself as the walker even though it has but one argument. Somewhat remarkably, we've gotten away with that so far. However, it seems clear that the upcoming C2x standard means to forbid such cases, and compilers that actively break such code likely won't be far behind. So spend the extra few lines of code to do it honestly with a separate walker function. In HEAD, we might as well go further and remove ExecShutdownNode's useless return value. I left that as-is in back branches though, to forestall complaints about ABI breakage. Back-patch, with the thought that this might become of practical importance before our stable branches are all out of service. It doesn't seem to be fixing any live bug on any currently known platform, however. Discussion: https://postgr.es/m/208054.1663534665@sss.pgh.pa.us
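The "separate walker function" fix above can be illustrated with a toy tree. The `Node` type, `tree_walker`, and both function names are hypothetical stand-ins for PlanState, planstate_tree_walker(), and ExecShutdownNode(); the point is only that the callback's signature matches the walker's contract exactly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-child tree node standing in for a plan state node. */
typedef struct Node
{
    struct Node *left;
    struct Node *right;
    bool        shut_down;
} Node;

/* The walker contract: callbacks take a node pointer and a context pointer. */
typedef bool (*tree_walker_fn) (Node *node, void *context);

/* Visit a node's children; stop early if a callback returns true. */
static bool
tree_walker(Node *node, tree_walker_fn walker, void *context)
{
    if (node->left && walker(node->left, context))
        return true;
    if (node->right && walker(node->right, context))
        return true;
    return false;
}

static bool shutdown_node_walker(Node *node, void *context);

/* Public entry point keeps its one-argument signature. */
bool
shutdown_node(Node *node)
{
    return shutdown_node_walker(node, NULL);
}

/* Separate callback whose signature matches tree_walker_fn. */
static bool
shutdown_node_walker(Node *node, void *context)
{
    (void) context;             /* unused in this sketch */

    if (node == NULL)
        return false;
    tree_walker(node, shutdown_node_walker, context);
    node->shut_down = true;
    return false;
}
```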
*	Make check_usermap() parameter names consistent.	Peter Geoghegan	2022-09-17
\| \| \| \| \| \| \| \| \| \| \|	The function has a bool argument named "case_insensitive", but that was spelled "case_sensitive" in the declaration. Make them consistent now to avoid confusion in the future. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com Backpatch: 10-
*	Include c.h instead of postgres.h in src/port/p{read,write}.c	Andres Freund	2022-09-17
\| \| \| \| \| \| \| \|	Frontend code shouldn't include postgres.h. Some files in src/port/ need to include postgres.h/postgres_fe.h, but these files don't. Discussion: https://postgr.es/m/20220915022626.5xx3ccgkzpkqw5mq@awork3.anarazel.de Backpatch: 12-, where 3fd2a7932ef introduced (some) of these files
*	Improve plpgsql's ability to handle arguments declared as RECORD.	Tom Lane	2022-09-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Treat arguments declared as RECORD as if that were a polymorphic type (which it is, sort of), in that we substitute the actual argument type while forming the function cache lookup key. This allows the specific composite type to be known in some cases where it was not before, at the cost of making a separate function cache entry for each named composite type that's passed to the function during a session. The particular symptom discussed in bug #17610 could be solved in other more-efficient ways, but only at the cost of considerable development work, and there are other cases where we'd still fail without this. Per bug #17610 from Martin Jurča. Back-patch to v11 where we first allowed plpgsql functions to be declared as taking type RECORD. Discussion: https://postgr.es/m/17610-fb1eef75bf6c2364@postgresql.org
*	Fix incorrect value for "strategy" with deflateParams() in walmethods.c	Michael Paquier	2022-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The zlib documentation mentions the values supported for the compression strategy, but this code has been using a hardcoded value of 0 rather than Z_DEFAULT_STRATEGY. This commit adjusts the code to use Z_DEFAULT_STRATEGY. Backpatch down to where this code has been added to ease the backport of any future patch touching this area. Reported-by: Tom Lane Discussion: https://postgr.es/m/1400032.1662217889@sss.pgh.pa.us Backpatch-through: 10
*	Expand palloc/pg_malloc API for more type safety	Peter Eisentraut	2022-09-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds additional variants of palloc, pg_malloc, etc. that encapsulate common usage patterns and provide more type safety. Specifically, this adds palloc_object(), palloc_array(), and repalloc_array(), which take the type name of the object to be allocated as its first argument and cast the return as a pointer to that type. There are also palloc0_object() and palloc0_array() variants for initializing with zero, and pg_malloc_*() variants of all of the above. Inspired by the talloc library. This is backpatched from master so that future backpatchable code can make use of these APIs. This patch by itself does not contain any users of these APIs. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/bb755632-2a43-d523-36f8-a1e7a389a907@enterprisedb.com
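The typed-allocation pattern described above can be approximated with plain malloc, as in the sketch below. The `my_*` macro names are hypothetical analogues, not the PostgreSQL macros themselves, and the argument order chosen for the realloc variant is an assumption of this sketch.

```c
#include <stdlib.h>

/*
 * Hypothetical malloc-based analogues of the typed allocation macros.
 * The type name is passed as an argument and the result is cast to a
 * pointer to that type, so a mismatched assignment draws a compiler
 * diagnostic instead of silently compiling.
 */
#define my_malloc_object(type)         ((type *) malloc(sizeof(type)))
#define my_malloc0_object(type)        ((type *) calloc(1, sizeof(type)))
#define my_malloc_array(type, count)   ((type *) malloc(sizeof(type) * (count)))
#define my_malloc0_array(type, count)  ((type *) calloc((count), sizeof(type)))
#define my_realloc_array(ptr, type, count) \
    ((type *) realloc((ptr), sizeof(type) * (count)))

typedef struct Point
{
    double      x;
    double      y;
} Point;

/* Allocate a zero-initialized array of n points; NULL on failure. */
Point *
make_points(size_t n)
{
    return my_malloc0_array(Point, n);
}
```

Writing `int *p = my_malloc_object(Point);` would now produce an incompatible-pointer diagnostic, which a bare `malloc(sizeof(Point))` would not.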
*	Fix NaN comparison in circle_same test	Daniel Gustafsson	2022-09-12
	Commit c4c340088 changed geometric operators to use float4 and float8 functions, and handle NaN's in a better way. The circle sameness test had a typo in the code which resulted in all comparisons with the left circle having a NaN radius considered same.

	postgres=# select '<(0,0),NaN>'::circle ~= '<(0,0),1>'::circle;
	 ?column?
	----------
	 t
	(1 row)

	This fixes the sameness test to consider the radius of both the left and right circle. Backpatch to v12 where this was introduced. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAo8dK=yctg2ZzjJuzV4zgOPBxRU5+Kb+yatFiddtQk6Rw@mail.gmail.com Backpatch-through: v12
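A sameness test that looks at both radii might look like the sketch below. It is an illustration under assumptions: the `Circle` struct and function names are hypothetical, and exact `==` is used where the real geometric code uses its own float comparison helpers.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical circle type: center (x, y) and radius. */
typedef struct Circle
{
    double      x;
    double      y;
    double      radius;
} Circle;

/* Equality that treats two NaNs as equal and NaN as unequal to any number. */
static bool
float_eq_nan_aware(double a, double b)
{
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    return a == b;
}

/* Two circles are "same" only if radius and center agree on BOTH sides. */
bool
circle_same_sketch(const Circle *c1, const Circle *c2)
{
    return float_eq_nan_aware(c1->radius, c2->radius) &&
        float_eq_nan_aware(c1->x, c2->x) &&
        float_eq_nan_aware(c1->y, c2->y);
}
```

The class of bug fixed by the commit corresponds to testing `isnan()` on the left operand twice, which makes any circle with a NaN left radius compare as same.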
*	Fix possible omission of variable storage markers in ECPG.	Tom Lane	2022-09-09
	The ECPG preprocessor converted code such as

	static varchar str1[10], str2[20], str3[30];

	into

	static struct varchar_1 { int len; char arr[ 10 ]; } str1 ;
	struct varchar_2 { int len; char arr[ 20 ]; } str2 ;
	struct varchar_3 { int len; char arr[ 30 ]; } str3 ;

	thus losing the storage attribute for the later variables. Repeat the declaration for each such variable. (Note that this occurred only for variables declared "varchar" or "bytea", which may help explain how it escaped detection for so long.) Andrey Sokolov Discussion: https://postgr.es/m/942241662288242@mail.yandex.ru
*	Choose FK name correctly during partition attachment	Alvaro Herrera	2022-09-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During ALTER TABLE ATTACH PARTITION, if the name of a parent's foreign key constraint is already used on the partition, the code tries to choose another one before the FK attributes list has been populated, so the resulting constraint name was "<relname>__fkey" instead of "<relname>_<attrs>_fkey". Repair, and add a test case. Backpatch to 12. In 11, the code to attach a partition was not smart enough to cope with conflicting constraint names, so the problem doesn't exist there. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20220901184156.738ebee5@karst
*	Further fixes for MULTIEXPR_SUBLINK fix.	Tom Lane	2022-09-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some more things I didn't think about in commits 3f7323cbb et al: MULTIEXPR_SUBLINK subplans might have been converted to initplans instead of regular subplans, in which case they won't show up in the modified targetlist. Fortunately, this would only happen if they have no input parameters, which means that the problem we originally needed to fix can't happen with them. Therefore, there's no need to clone their output parameters, and thus it doesn't hurt that we'll fail to see them in the first pass over the targetlist. Nonetheless, this complicates matters greatly, because now we have to distinguish output Params of initplans (which shouldn't get renumbered) from those of regular subplans (which should). This also breaks the simplistic scheme I used of assuming that the subplans found in the targetlist have consecutive subLinkIds. We really can't avoid the need to know the subplans' subLinkIds in this code. To fix that, add subLinkId as the last field of SubPlan. We can get away with that change in back branches because SubPlan nodes will never be stored in the catalogs, and there's no ABI break for external code that might be looking at the existing fields of SubPlan. Secondly, rewriteTargetListIU might have rolled up multiple FieldStores or SubscriptingRefs into one targetlist entry, breaking the assumption that there's at most one Param to fix per targetlist entry. (That assumption is OK I think in the ruleutils.c code I stole the logic from in 18f51083c, because that only deals with pre-rewrite query trees. But it's definitely not OK here.) Abandon that shortcut and just do a full tree walk on the targetlist to ensure we find all the Params we have to change. Per bug #17606 from Andre Lin. As before, only v10-v13 need the patch. Discussion: https://postgr.es/m/17606-e5c8ad18d31db96a@postgresql.org
*	Backpatch nbtree page deletion hardening.	Peter Geoghegan	2022-09-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Postgres 14 commit 5b861baa taught nbtree VACUUM to tolerate buggy opclasses. VACUUM's inability to locate a to-be-deleted page's downlink in the parent page was logged instead of throwing an error. VACUUM could just press on with vacuuming the index, and vacuuming the table as a whole. There are now anecdotal reports of this error causing problems that were much more disruptive than the underlying index corruption ever could be. Anything that makes VACUUM unable to make forward progress against one table/index ultimately risks making the system enter xidStopLimit mode. There is no good reason to take any chances here, so backpatch the hardening commit. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wzm9HR6Pow=t-iQa57zT8qmX6_M4h14F-pTtb=xFDW5FBA@mail.gmail.com Backpatch: 10-13 (all supported versions that lacked the hardening)
*	Fix oversight in recent MULTIEXPR_SUBLINK fix.	Tom Lane	2022-09-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commits 3f7323cbb et al missed the possibility that the Params they are looking for could be buried under implicit coercions, as well as other stuff that processIndirection() could add to the original targetlist entry. Copy the code in ruleutils.c that deals with such cases. (I thought about refactoring so that there's just one copy; but seeing that we only need this in old back branches, it seems not worth the trouble.) Per off-list report from Andre Lin. As before, only v10-v13 need the patch. Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
*	Fix some possibly latent bugs in slab.c	David Rowley	2022-09-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Primarily, this fixes an incorrect calculation in SlabCheck which was looking in the wrong byte for the sentinel check. The reason that we've never noticed this before in the form of a failing sentinel check is because the pre-check to this always fails because all current core users of slab contexts have a chunk size which is already MAXALIGNed, therefore there's never any space for the sentinel byte. It is possible that an extension needs to use a slab context and if they do with a chunk size that's not MAXALIGNed, then they'll likely get errors about overwritten sentinel bytes. Additionally, this patch changes various calculations which are being done based on the sizeof(SlabBlock). Currently, sizeof(SlabBlock) is a multiple of 8, therefore sizeof(SlabBlock) is the same as MAXALIGN(sizeof(SlabBlock)), however, if we were to ever have to add any fields to that struct as part of a bug fix, then SlabAlloc could end up returning a non-MAXALIGNed pointer. To be safe, let's ensure we always MAXALIGN sizeof(SlabBlock) before using it in any calculations. This patch has already been applied to master in d5ee4db0e. Diagnosed-by: Tomas Vondra, Tom Lane Author: Tomas Vondra, David Rowley Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com Backpatch-through: 10
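The alignment concern above comes down to rounding a header size up before using it as an offset. The sketch below assumes an 8-byte maximum alignment; the `SKETCH_*` names and the `BlockHeader` struct are hypothetical, chosen so that the header size is deliberately not a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXIMUM_ALIGNOF 8    /* assumption: typical 64-bit platform */

/* Round len up to the next multiple of SKETCH_MAXIMUM_ALIGNOF (a power of 2). */
#define SKETCH_MAXALIGN(len) \
    (((uintptr_t) (len) + (SKETCH_MAXIMUM_ALIGNOF - 1)) & \
     ~((uintptr_t) (SKETCH_MAXIMUM_ALIGNOF - 1)))

/* Hypothetical block header whose size (12 bytes) is not a multiple of 8. */
typedef struct BlockHeader
{
    char        tag[12];
} BlockHeader;

/* Offset of the first chunk within a block: the aligned header size. */
size_t
first_chunk_offset(void)
{
    return (size_t) SKETCH_MAXALIGN(sizeof(BlockHeader));
}
```

Using the raw `sizeof(BlockHeader)` here would place the first chunk at offset 12, so pointers handed out from the block would not be 8-byte aligned.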
*	Prevent long-term memory leakage in autovacuum launcher.	Tom Lane	2022-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	get_database_list() failed to restore the caller's memory context, instead leaving current context set to TopMemoryContext which is how CommitTransactionCommand() leaves it. The callers both think they are using short-lived contexts, for the express purpose of not having to worry about cleaning up individual allocations. The net effect therefore is that supposedly short-lived allocations could accumulate indefinitely in the launcher's TopMemoryContext. Although this has been broken for a long time, it seems we didn't have any obvious memory leak here until v15's rearrangement of the stats logic. I (tgl) am not entirely convinced that there's no other leak at all, though, and we're surely at risk of adding one in future back-patched fixes. So back-patch to all supported branches, even though this may be only a latent bug in pre-v15. Reid Thompson Discussion: https://postgr.es/m/972a4e12b68b0f96db514777a150ceef7dcd2e0f.camel@crunchydata.com
*	In the Snowball dictionary, don't try to stem excessively-long words.	Tom Lane	2022-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the input word exceeds 1000 bytes, don't pass it to the stemmer; just return it as-is after case folding. Such an input is surely not a word in any human language, so whatever the stemmer might do to it would be pretty dubious in the first place. Adding this restriction protects us against a known recursion-to-stack-overflow problem in the Turkish stemmer, and it seems like good insurance against any other safety or performance issues that may exist in the Snowball stemmers. (I note, for example, that they contain no CHECK_FOR_INTERRUPTS calls, so we really don't want them running for a long time.) The threshold of 1000 bytes is arbitrary. An alternative definition could have been to treat such words as stopwords, but that seems like a bigger break from the old behavior. Per report from Egor Chindyaskin and Alexander Lakhin. Thanks to Olly Betts for the recommendation to fix it this way. Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
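The length guard described above can be sketched as follows. Only the 1000-byte threshold comes from the commit message; `normalize_word`, `toy_stemmer`, and the ASCII-only case folding are assumptions made for illustration and do not reflect the Snowball API.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_STEM_INPUT_BYTES 1000   /* threshold named in the commit message */

typedef void (*stem_fn) (char *word);

/* Toy stand-in for a real stemmer: strips one trailing 's'. */
void
toy_stemmer(char *word)
{
    size_t      len = strlen(word);

    if (len > 1 && word[len - 1] == 's')
        word[len - 1] = '\0';
}

/*
 * Case-fold the word in place (ASCII only in this sketch), then pass it to
 * the stemmer unless it is longer than the threshold, in which case it is
 * returned as-is after folding.
 */
void
normalize_word(char *word, stem_fn stemmer)
{
    for (char *p = word; *p; p++)
        *p = (char) tolower((unsigned char) *p);

    if (strlen(word) > MAX_STEM_INPUT_BYTES)
        return;                 /* too long to be a real word: skip stemming */

    stemmer(word);
}
```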
*	On NetBSD, force dynamic symbol resolution at postmaster start.	Tom Lane	2022-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default of lazy symbol resolution means that when the postmaster first reaches the select() call in ServerLoop, it'll need to resolve the link to that libc entry point. NetBSD's dynamic loader takes an internal lock while doing that, and if a signal interrupts the operation then there is a risk of self-deadlock should the signal handler do anything that requires that lock, as several of the postmaster signal handlers do. The window for this is pretty narrow, and timing considerations make it unlikely that a signal would arrive right then anyway. But it's semi-repeatable on slow single-CPU machines, and in principle the race could happen with any hardware. The least messy solution to this is to force binding of dynamic symbols at postmaster start, using the "-z now" linker option. While we're at it, also use "-z relro" so as to provide a small security gain. It's not entirely clear whether any other platforms share this issue, but for now we'll assume it's NetBSD-specific. (We might later try to use "-z now" on more platforms for performance reasons, but that would not likely be something to back-patch.) Report and patch by me; the idea to fix it this way is from Andres Freund. Discussion: https://postgr.es/m/3384826.1661802235@sss.pgh.pa.us
*	Prevent WAL corruption after a standby promotion.	Robert Haas	2022-08-29
	When a PostgreSQL instance performing archive recovery but not using standby mode is promoted, and the last WAL segment that it attempted to read ended in a partial record, the previous code would create invalid WAL on the new timeline. The WAL from the previous timeline would be copied to the new timeline up until the end of the last valid record, but instead of beginning to write WAL immediately afterwards, the promoted server would write an overwrite contrecord at the beginning of the next segment. The end of the previous segment would be left as all-zeroes, resulting in failures if anything tried to read WAL from that file. The root of the issue is that ReadRecord() decides whether to set abortedRecPtr and missingContrecPtr based on the value of StandbyMode, but ReadRecord() switches to a new timeline based on the value of ArchiveRecoveryRequested. We shouldn't try to write an overwrite contrecord if we're switching to a new timeline, so change the test in ReadRecord() to check ArchiveRecoveryRequested instead. Code fix by Dilip Kumar. Comments by me incorporating suggested language from Álvaro Herrera. Further review from Kyotaro Horiguchi and Sami Imseih. Discussion: http://postgr.es/m/CAFiTN-t7umki=PK8dT1tcPV=mOUe2vNhHML6b3T7W7qqvvajjg@mail.gmail.com Discussion: http://postgr.es/m/FB0DEA0B-E14E-43A0-811F-C1AE93D00FF3%40amazon.com
*	Repair rare failure of MULTIEXPR_SUBLINK subplans in inherited updates.	Tom Lane	2022-08-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to v14, if we have a MULTIEXPR SubPlan (that is, use of the syntax UPDATE ... SET (c1, ...) = (SELECT ...)) in an UPDATE with an inherited or partitioned target table, inheritance_planner() will clone the targetlist and therefore also the MULTIEXPR SubPlan and the Param nodes referencing it for each child target table. Up to now, we've allowed all the clones to share the underlying subplan as well as the output parameter IDs -- that is, the runtime ParamExecData slots. That technique is borrowed from the far older code that supports initplans, and it works okay in that case because the cloned SubPlan nodes are essentially identical. So it doesn't matter which one of the clones the shared ParamExecData.execPlan field might point to. However, this fails to hold for MULTIEXPR SubPlans, because they can have nonempty "args" lists (values to be passed into the subplan), and those lists could get mutated to different states in the various clones. In the submitted reproducer, as well as the test case added here, one clone contains Vars with varno OUTER_VAR where another has INNER_VAR, because the child tables are respectively on the outer or inner side of the join. Sharing the execPlan pointer can result in trying to evaluate an args list that doesn't match the local execution state, with mayhem ensuing. The result often is to trigger consistency checks in the executor, but I believe this could end in a crash or incorrect updates. To fix, assign new Param IDs to each of the cloned SubPlans, so that they don't share ParamExecData slots at runtime. It still seems fine for the clones to share the underlying subplan, and extra ParamExecData slots are cheap enough that this fix shouldn't cost much. This has been busted since we invented MULTIEXPR SubPlans in 9.5. 
Probably the lack of previous reports is because query plans in which the different clones of a MULTIEXPR mutate to effectively-different states are pretty rare. There's no issue in v14 and later, because without inheritance_planner() there's never a reason to clone MULTIEXPR SubPlans. Per bug #17596 from Andre Lin. Patch v10-v13 only. Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
*	Fix typo in comment.	Etsuro Fujita	2022-08-26
*	Defend against stack overrun in a few more places.	Tom Lane	2022-08-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c, and regex_selectivity_sub() in selectivity estimation could recurse until stack overflow; fix by adding check_stack_depth() calls. So could next() in the regex compiler, but that case is better fixed by converting its tail recursion to a loop. (We probably get better code that way too, since next() can now be inlined into its sole caller.) There remains a reachable stack overrun in the Turkish stemmer, but we'll need some advice from the Snowball people about how to fix that. Per report from Egor Chindyaskin and Alexander Lakhin. These mistakes are old, so back-patch to all supported branches. Richard Guo and Tom Lane Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
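The tail-recursion-to-loop conversion described for next() can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins (they are not the regex compiler's code): the first consumes one stack frame per skipped character unless the compiler happens to optimize the tail call, while the second uses constant stack space regardless of input length.

```c
#include <stddef.h>

/*
 * Hypothetical example, not PostgreSQL code.
 * Tail-recursive form: skips one blank, then calls itself on the rest.
 * Without tail-call optimization, stack use grows with the input.
 */
const char *
demo_skip_blanks_recursive(const char *s)
{
    if (*s == ' ' || *s == '\t')
        return demo_skip_blanks_recursive(s + 1);
    return s;
}

/*
 * Loop form: same result, but stack use no longer depends on the input.
 */
const char *
demo_skip_blanks_loop(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}
```

Functions that recurse in a non-tail position cannot be rewritten this simply, which is presumably why the other cases in this commit add check_stack_depth() calls instead.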
*	Doc: prefer sysctl to /proc/sys in docs and comments.	Tom Lane	2022-08-23
\| \| \| \| \| \| \| \| \|	sysctl is more portable than Linux's /proc/sys file tree, and often easier to use too. That's why most of our docs refer to sysctl when talking about how to adjust kernel parameters. Bring the few stragglers into line. Discussion: https://postgr.es/m/361175.1661187463@sss.pgh.pa.us
*	Add CHECK_FOR_INTERRUPTS while decoding changes.	Amit Kapila	2022-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \|	While decoding changes in a loop, if we skip all the changes there is no CFI making the loop uninterruptible. Reported-by: Whale Song and Andrey Borodin Bug: 17580 Author: Masahiko Sawada Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
*	Fix subtly-incorrect matching of parent and child partitioned indexes.	Tom Lane	2022-08-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When creating a partitioned index, DefineIndex tries to identify any existing indexes on the partitions that match the partitioned index, so that it can absorb those as child indexes instead of building new ones. Part of the matching is to compare IndexInfo structs --- but that wasn't done quite right. We're comparing the IndexInfo built within DefineIndex itself to one made from existing catalog contents by BuildIndexInfo. Notably, while BuildIndexInfo will run index expressions and predicates through expression preprocessing, that has not happened to DefineIndex's struct. The result is failure to match and subsequent creation of duplicate indexes. The easiest and most bulletproof fix is to build a new IndexInfo using BuildIndexInfo, thereby guaranteeing that the processing done is identical. While here, let's also extract the opfamily and collation data from the new partitioned index, removing ad-hoc logic that duplicated knowledge about how those are constructed. Per report from Christophe Pettus. Back-patch to v11 where we invented partitioned indexes. Richard Guo and Tom Lane Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
*	Fix replica identity check for a partitioned table.	Amit Kapila	2022-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current publisher code checks if UPDATE or DELETE can be executed with the replica identity of the table even if it's a partitioned table. We can skip checking the replica identity for partitioned tables because the operations are actually performed on the leaf partitions (not the partitioned table). Reported-by: Brad Nicholson Author: Hou Zhijie Reviewed-by: Peter Smith, Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CAMMnM%3D8i5DohH%3DYKzV0_wYuYSYvuOJoL9F5nzXTc%2ByzsG1f6rg%40mail.gmail.com
*	Add missing bad-PGconn guards in libpq entry points.	Tom Lane	2022-08-15
\| \| \| \| \| \| \| \| \| \| \| \| \|	There's a convention that externally-visible libpq functions should check for a NULL PGconn pointer, and fail gracefully instead of crashing. PQflush() and PQisnonblocking() didn't get that memo though. Also add a similar check to PQdefaultSSLKeyPassHook_OpenSSL; while it's not clear that ordinary usage could reach that with a null conn pointer, it's cheap enough to check, so let's be consistent. Daniele Varrazzo and Tom Lane Discussion: https://postgr.es/m/CA+mi_8Zm_mVVyW1iNFgyMd9Oh0Nv8-F+7Y3-BqwMgTMHuo_h2Q@mail.gmail.com
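The guard convention described above can be sketched as follows. The struct and function names are hypothetical stand-ins for PGconn and the libpq entry points, and the failure return values (0 and -1) are assumptions chosen for illustration, not a statement of what libpq returns.

```c
#include <stddef.h>

/* Hypothetical stand-in for a connection object; not libpq's PGconn. */
typedef struct demo_conn
{
    int nonblocking;
} demo_conn;

/*
 * Externally visible entry point: check for a NULL connection first
 * and fail gracefully instead of dereferencing it.
 */
int
demo_isnonblocking(const demo_conn *conn)
{
    if (conn == NULL)
        return 0;               /* assumed "false" result for a bad conn */
    return conn->nonblocking;
}

/* Same pattern for a function that reports errors with -1. */
int
demo_flush(demo_conn *conn)
{
    if (conn == NULL)
        return -1;              /* assumed failure code for a bad conn */
    return 0;                   /* pretend there was nothing left to send */
}
```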
*	Fix outdated --help message for postgres -f	Michael Paquier	2022-08-15
\| \| \| \| \| \| \| \| \| \|	This option switch supports a total of 8 values, as told by set_plan_disabling_options() and the documentation, but this was not reflected in the output generated by --help. Author: Junwang Zhao Discussion: https://postgr.es/m/CAEG8a3+pT3cWzyjzKs184L1XMNm8NDnoJLiSjAYSO7XqpRh_vA@mail.gmail.com Backpatch-through: 10
*	Preserve memory context of VarStringSortSupport buffers.	Tom Lane	2022-08-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When enlarging the work buffers of a VarStringSortSupport object, varstrfastcmp_locale was careful to keep them in the ssup_cxt memory context; but varstr_abbrev_convert just used palloc(). The latter creates a hazard that the buffers could be freed out from under the VarStringSortSupport object, resulting in stomping on whatever gets allocated in that memory later. In practice, because we only use this code for ICU collations (cf. 3df9c374e), the problem is confined to use of ICU collations. I believe it may have been unreachable before the introduction of incremental sort, too, as traditional sorting usually just uses one context for the duration of the sort. We could fix this by making the broken stanzas in varstr_abbrev_convert match the non-broken ones in varstrfastcmp_locale. However, it seems like a better idea to dodge the issue altogether by replacing the pfree-and-allocate-anew coding with repalloc, which automatically preserves the chunk's memory context. This fix does add a few cycles because repalloc will copy the chunk's content, which the existing coding assumes is useless. However, we don't expect that these buffer enlargement operations are performance-critical. Besides that, it's far from obvious that copying the buffer contents isn't required, since these stanzas make no effort to mark the buffers invalid by resetting last_returned, cache_blob, etc. That seems to be safe upon examination, but it's fragile and could easily get broken in future, which wouldn't get revealed in testing with short-to-moderate-size strings. Per bug #17584 from James Inform. Whether or not the issue is reachable in the older branches, this code has been broken on its own terms from its introduction, so patch all the way back. Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org
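The difference between the two enlargement strategies can be shown with a toy model. Everything below is a hypothetical sketch built on malloc/realloc, not PostgreSQL's memory-context machinery: each chunk records an owning "context", a fresh allocation takes whatever context is current, and an in-place enlargement keeps the owner it already had.

```c
#include <stdlib.h>

/* Hypothetical toy model of a memory context; not PostgreSQL's. */
typedef struct demo_context
{
    const char *name;
} demo_context;

typedef struct demo_chunk
{
    demo_context *owner;        /* context the chunk belongs to */
    size_t      size;
    char        data[];
} demo_chunk;

/* Plays the role of the "current" context used by plain allocation. */
demo_context *demo_current_context = NULL;

/* Like a plain allocation: the chunk lands in the current context. */
demo_chunk *
demo_alloc(size_t size)
{
    demo_chunk *chunk = malloc(sizeof(demo_chunk) + size);

    if (chunk == NULL)
        return NULL;
    chunk->owner = demo_current_context;
    chunk->size = size;
    return chunk;
}

/* Like an in-place enlargement: the owner recorded in the chunk is kept. */
demo_chunk *
demo_realloc(demo_chunk *chunk, size_t size)
{
    demo_chunk *grown = realloc(chunk, sizeof(demo_chunk) + size);

    if (grown == NULL)
        return NULL;
    grown->size = size;
    return grown;
}

/* The hazardous pattern: free, then allocate anew in the current context. */
demo_chunk *
demo_free_and_alloc(demo_chunk *chunk, size_t size)
{
    free(chunk);
    return demo_alloc(size);
}
```

In this model, a buffer created while a long-lived context is current and later enlarged with demo_free_and_alloc() while a short-lived context is current silently changes owner, which mirrors the hazard the commit message describes.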
*	Avoid misbehavior when hash_table_bytes < bucket_size.	Tom Lane	2022-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible to reach this case when work_mem is very small and tupsize is (relatively) very large. In that case ExecChooseHashTableSize would get an assertion failure, or with asserts off it'd compute nbuckets = 0, which'd likely cause misbehavior later (I've not checked). To fix, clamp the number of buckets to be at least 1. This is due to faulty conversion of old my_log2() coding in 28d936031. Back-patch to v13, as that was. Zhang Mingli Discussion: https://postgr.es/m/beb64ca0-91e2-44ac-bf4a-7ea36275ec02@Spark
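A minimal sketch of the clamp, assuming a simplified sizing rule (bytes divided by bucket size, rounded down to a power of two). The function names are hypothetical and the real ExecChooseHashTableSize logic is more involved; the point is only that the quotient can be 0 and must be raised to 1 before any power-of-two rounding.

```c
#include <stddef.h>

/*
 * Hypothetical helper: largest power of two <= n.  Requires n >= 1,
 * which is exactly why the caller must clamp first.
 */
size_t
demo_prev_power2(size_t n)
{
    size_t p = 1;

    while (p <= n / 2)
        p *= 2;
    return p;
}

/*
 * Hypothetical sizing rule; bucket_size must be nonzero.
 * When hash_table_bytes < bucket_size the division yields 0,
 * so clamp to at least one bucket.
 */
size_t
demo_choose_nbuckets(size_t hash_table_bytes, size_t bucket_size)
{
    size_t nbuckets = hash_table_bytes / bucket_size;

    if (nbuckets < 1)
        nbuckets = 1;
    return demo_prev_power2(nbuckets);
}
```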
*	Catch stack overflow when recursing in transformFromClauseItem().	Tom Lane	2022-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most parts of the parser can expect that the stack overflow check in transformExprRecurse() will trigger before things get desperate. However, transformFromClauseItem() can recurse directly to self without having analyzed any expressions, so it's possible to drive it to a stack-overrun crash. Add a check to prevent that. Per bug #17583 from Egor Chindyaskin. Back-patch to all supported branches. Richard Guo Discussion: https://postgr.es/m/17583-33be55b9f981f75c@postgresql.org
*	Add missing fields to _outConstraint()	Peter Eisentraut	2022-08-13
\| \| \| \| \| \| \| \|	As of 897795240cfaaed724af2f53ed2c50c9862f951f, check constraints can be declared invalid. But that patch didn't update _outConstraint() to also show the relevant struct fields (which were only applicable to foreign keys before that). This currently only affects debugging output, so no impact in practice.
*	pg_upgrade: Fix some minor code issues	Peter Eisentraut	2022-08-13
\| \| \| \| \| \| \| \| \| \| \| \|	96ef3b8ff1cf1950e897fd2f766d4bd9ef0d5d56 accidentally copied an inapplicable comment from the float8_pass_by_value code to the data_checksums code. Remove that. 87d3b35a1ca31a9d947a8f919a6006679216dff0 changed pg_upgrade to check the checksum version rather than just the Boolean presence of checksums, but didn't change the field type in its ControlData struct from bool. So this would not work correctly if there ever is a checksum version larger than 1.
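Why a bool field cannot hold a version number can be seen in a short sketch. The struct and function names are hypothetical, not pg_upgrade's: in C99, converting any nonzero value to bool yields 1, so two different nonzero versions compare equal once stored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical structs for illustration; not pg_upgrade's ControlData. */
typedef struct demo_control_bool
{
    bool        data_checksum_version;  /* collapses every nonzero value to 1 */
} demo_control_bool;

typedef struct demo_control_u32
{
    uint32_t    data_checksum_version;  /* preserves the actual version */
} demo_control_u32;

/* Compares two versions after storing them in bool fields. */
bool
demo_versions_match_bool(uint32_t old_version, uint32_t new_version)
{
    demo_control_bool oldctrl = {old_version};
    demo_control_bool newctrl = {new_version};

    return oldctrl.data_checksum_version == newctrl.data_checksum_version;
}

/* Same comparison with an integer field type. */
bool
demo_versions_match_u32(uint32_t old_version, uint32_t new_version)
{
    demo_control_u32 oldctrl = {old_version};
    demo_control_u32 newctrl = {new_version};

    return oldctrl.data_checksum_version == newctrl.data_checksum_version;
}
```

With versions 1 and 2, the bool variant reports a match while the integer variant reports a mismatch, which is the failure mode the commit message anticipates.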
*	Fix _outConstraint() for "identity" constraints	Peter Eisentraut	2022-08-12
\| \| \| \| \| \| \| \| \|	The set of fields printed by _outConstraint() in the CONSTR_IDENTITY case didn't match the set of fields actually used in that case. (The code was probably carelessly copied from the CONSTR_DEFAULT case.) Fix that by using the right set of fields. Since there is no read support for this node type, this is really just for debugging output right now, so it doesn't affect anything important.
*	Back-Patch "Add wait_for_subscription_sync for TAP tests."	Amit Kapila	2022-08-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally done in commit 0c20dd33db for 16 only, to eliminate duplicate code and as an infrastructure that makes it easier to write future tests. However, it has been suggested that it would be good to back-patch this testing infrastructure to aid future tests in back-branches. Backpatch to all supported versions. Author: Masahiko Sawada Reviewed by: Amit Kapila, Shi yu Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com Discussion: https://postgr.es/m/E1oJBIf-0006sw-SA@gemulon.postgresql.org
*	Fix catalog lookup with the wrong snapshot during logical decoding.	Amit Kapila	2022-08-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. 
Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
*	Fix handling of R/W expanded datums that are passed to SQL functions.	Tom Lane	2022-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fmgr_sql must make expanded-datum arguments read-only, because it's possible that the function body will pass the argument to more than one callee function. If one of those functions takes the datum's R/W property as license to scribble on it, then later callees will see an unexpected value, leading to wrong answers. From a performance standpoint, it'd be nice to skip this in the common case that the argument value is passed to only one callee. However, detecting that seems fairly hard, and certainly not something that I care to attempt in a back-patched bug fix. Per report from Adam Mackler. This has been broken since we invented expanded datums, so back-patch to all supported branches. Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
*	Stabilize output of new regression test.	Tom Lane	2022-08-08
\| \| \| \| \| \| \| \| \| \|	Per buildfarm, the output order of \dx+ isn't consistent across locales. Apply NO_LOCALE to force C locale. There might be a more localized way, but I'm not seeing it offhand, and anyway there is nothing in this test module that particularly cares about locales. Security: CVE-2022-2625