postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Add CHECK_FOR_INTERRUPTS while decoding changes.	Amit Kapila	2022-08-23
\| \| \| \| \| \| \| \| \| \| \| \| \|	While decoding changes in a loop, if we skip all the changes there is no CFI making the loop uninterruptible. Reported-by: Whale Song and Andrey Borodin Bug: 17580 Author: Masahiko Sawada Reviwed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru
*	Fix subtly-incorrect matching of parent and child partitioned indexes.	Tom Lane	2022-08-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When creating a partitioned index, DefineIndex tries to identify any existing indexes on the partitions that match the partitioned index, so that it can absorb those as child indexes instead of building new ones. Part of the matching is to compare IndexInfo structs --- but that wasn't done quite right. We're comparing the IndexInfo built within DefineIndex itself to one made from existing catalog contents by BuildIndexInfo. Notably, while BuildIndexInfo will run index expressions and predicates through expression preprocessing, that has not happened to DefineIndex's struct. The result is failure to match and subsequent creation of duplicate indexes. The easiest and most bulletproof fix is to build a new IndexInfo using BuildIndexInfo, thereby guaranteeing that the processing done is identical. While here, let's also extract the opfamily and collation data from the new partitioned index, removing ad-hoc logic that duplicated knowledge about how those are constructed. Per report from Christophe Pettus. Back-patch to v11 where we invented partitioned indexes. Richard Guo and Tom Lane Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
*	Add missing bad-PGconn guards in libpq entry points.	Tom Lane	2022-08-15
\| \| \| \| \| \| \| \| \| \| \| \| \|	There's a convention that externally-visible libpq functions should check for a NULL PGconn pointer, and fail gracefully instead of crashing. PQflush() and PQisnonblocking() didn't get that memo though. Also add a similar check to PQdefaultSSLKeyPassHook_OpenSSL; while it's not clear that ordinary usage could reach that with a null conn pointer, it's cheap enough to check, so let's be consistent. Daniele Varrazzo and Tom Lane Discussion: https://postgr.es/m/CA+mi_8Zm_mVVyW1iNFgyMd9Oh0Nv8-F+7Y3-BqwMgTMHuo_h2Q@mail.gmail.com
*	Fix outdated --help message for postgres -f	Michael Paquier	2022-08-15
\| \| \| \| \| \| \| \| \| \|	This option switch supports a total of 8 values, as told by set_plan_disabling_options() and the documentation, but this was not reflected in the output generated by --help. Author: Junwang Zhao Discussion: https://postgr.es/m/CAEG8a3+pT3cWzyjzKs184L1XMNm8NDnoJLiSjAYSO7XqpRh_vA@mail.gmail.com Backpatch-through: 10
*	Preserve memory context of VarStringSortSupport buffers.	Tom Lane	2022-08-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When enlarging the work buffers of a VarStringSortSupport object, varstrfastcmp_locale was careful to keep them in the ssup_cxt memory context; but varstr_abbrev_convert just used palloc(). The latter creates a hazard that the buffers could be freed out from under the VarStringSortSupport object, resulting in stomping on whatever gets allocated in that memory later. In practice, because we only use this code for ICU collations (cf. 3df9c374e), the problem is confined to use of ICU collations. I believe it may have been unreachable before the introduction of incremental sort, too, as traditional sorting usually just uses one context for the duration of the sort. We could fix this by making the broken stanzas in varstr_abbrev_convert match the non-broken ones in varstrfastcmp_locale. However, it seems like a better idea to dodge the issue altogether by replacing the pfree-and-allocate-anew coding with repalloc, which automatically preserves the chunk's memory context. This fix does add a few cycles because repalloc will copy the chunk's content, which the existing coding assumes is useless. However, we don't expect that these buffer enlargement operations are performance-critical. Besides that, it's far from obvious that copying the buffer contents isn't required, since these stanzas make no effort to mark the buffers invalid by resetting last_returned, cache_blob, etc. That seems to be safe upon examination, but it's fragile and could easily get broken in future, which wouldn't get revealed in testing with short-to-moderate-size strings. Per bug #17584 from James Inform. Whether or not the issue is reachable in the older branches, this code has been broken on its own terms from its introduction, so patch all the way back. Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org
*	Catch stack overflow when recursing in transformFromClauseItem().	Tom Lane	2022-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most parts of the parser can expect that the stack overflow check in transformExprRecurse() will trigger before things get desperate. However, transformFromClauseItem() can recurse directly to self without having analyzed any expressions, so it's possible to drive it to a stack-overrun crash. Add a check to prevent that. Per bug #17583 from Egor Chindyaskin. Back-patch to all supported branches. Richard Guo Discussion: https://postgr.es/m/17583-33be55b9f981f75c@postgresql.org
*	Add missing fields to _outConstraint()	Peter Eisentraut	2022-08-13
\| \| \| \| \| \| \| \|	As of 897795240cfaaed724af2f53ed2c50c9862f951f, check constraints can be declared invalid. But that patch didn't update _outConstraint() to also show the relevant struct fields (which were only applicable to foreign keys before that). This currently only affects debugging output, so no impact in practice.
*	pg_upgrade: Fix some minor code issues	Peter Eisentraut	2022-08-13
\| \| \| \| \| \| \| \| \| \| \| \|	96ef3b8ff1cf1950e897fd2f766d4bd9ef0d5d56 accidentally copied a not applicable comment from the float8_pass_by_value code to the data_checksums code. Remove that. 87d3b35a1ca31a9d947a8f919a6006679216dff0 changed pg_upgrade to checking the checksum version rather than just the Boolean presence of checksums, but didn't change the field type in its ControlData struct from bool. So this would not work correctly if there ever is a checksum version larger than 1.
*	Fix _outConstraint() for "identity" constraints	Peter Eisentraut	2022-08-12
\| \| \| \| \| \| \| \| \|	The set of fields printed by _outConstraint() in the CONSTR_IDENTITY case didn't match the set of fields actually used in that case. (The code was probably uncarefully copied from the CONSTR_DEFAULT case.) Fix that by using the right set of fields. Since there is no read support for this node type, this is really just for debugging output right now, so it doesn't affect anything important.
*	Back-Patch "Add wait_for_subscription_sync for TAP tests."	Amit Kapila	2022-08-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally done in commit 0c20dd33db for 16 only, to eliminate duplicate code and as an infrastructure that makes it easier to write future tests. However, it has been suggested that it would be good to back-patch this testing infrastructure to aid future tests in back-branches. Backpatch to all supported versions. Author: Masahiko Sawada Reviewed by: Amit Kapila, Shi yu Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com Discussion: https://postgr.es/m/E1oJBIf-0006sw-SA@gemulon.postgresql.org
*	Fix catalog lookup with the wrong snapshot during logical decoding.	Amit Kapila	2022-08-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
*	Fix handling of R/W expanded datums that are passed to SQL functions.	Tom Lane	2022-08-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fmgr_sql must make expanded-datum arguments read-only, because it's possible that the function body will pass the argument to more than one callee function. If one of those functions takes the datum's R/W property as license to scribble on it, then later callees will see an unexpected value, leading to wrong answers. From a performance standpoint, it'd be nice to skip this in the common case that the argument value is passed to only one callee. However, detecting that seems fairly hard, and certainly not something that I care to attempt in a back-patched bug fix. Per report from Adam Mackler. This has been broken since we invented expanded datums, so back-patch to all supported branches. Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
*	Stamp 12.12.REL_12_12	Tom Lane	2022-08-08
\|
*	Stabilize output of new regression test.	Tom Lane	2022-08-08
\| \| \| \| \| \| \| \| \| \|	Per buildfarm, the output order of \dx+ isn't consistent across locales. Apply NO_LOCALE to force C locale. There might be a more localized way, but I'm not seeing it offhand, and anyway there is nothing in this test module that particularly cares about locales. Security: CVE-2022-2625
*	In extensions, don't replace objects not belonging to the extension.	Tom Lane	2022-08-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625
*	Translation updates	Alvaro Herrera	2022-08-08
\| \| \| \| \|	Source-Git-URL: ssh://git@git.postgresql.org/pgtranslation/messages.git Source-Git-Hash: efdf4e068bcb504ef277413196f978621726bda5
*	Remove unportable use of timezone in recent test	Alvaro Herrera	2022-08-07
\| \| \| \| \| \|	Per buildfarm member snapper Discussion: https://postgr.es/m/129951.1659812518@sss.pgh.pa.us
*	Improve recently-added test reliability	Alvaro Herrera	2022-08-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 59be1c942a47 already tried to make src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't enough. Try to improve on that by making this use of a replication slot to be more like others. Also, don't drop the slot. Make a few other stylistic changes while at it. It's still quite slow, which is another thing that we need to fix in this script. Backpatch to all supported branches. Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us
*	BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checking	Alvaro Herrera	2022-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That bit is unlogged and therefore it's wrong to consider it in WAL page comparison. Add a test that tickles the case, as branch testing technology allows. This has been a problem ever since wal consistency checking was introduced (commit a507b86900f6 for pg10), so backpatch to all supported branches. Author: 王海洋 (Haiyang Wang) <wanghaiyang.001@bytedance.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com
*	Add HINT for restartpoint race with KeepFileRestoredFromArchive().	Noah Misch	2022-08-05
\| \| \| \| \| \| \| \| \| \|	The five commits ending at cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 closed this race condition for v15+. For v14 through v10, add a HINT to discourage studying the cosmetic problem. Reviewed by Kyotaro Horiguchi and David Steele. Discussion: https://postgr.es/m/20220731061747.GA3692882@rfd.leadboat.com
*	regress: fix test instability	Alvaro Herrera	2022-08-05
\| \| \| \| \| \| \|	Having additional triggers in a test table made the ORDER BY clauses in old queries underspecified. Add another column there for stability. Per sporadic buildfarm pink.
*	Fix ENABLE/DISABLE TRIGGER to handle recursion correctly	Alvaro Herrera	2022-08-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db9b did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that 86f575948c77 had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of 86f575948c77 that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like bbb927b4db9b. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
*	Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.	Tom Lane	2022-08-04
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ordinarily the functions called in this loop ought to have plenty of CFIs themselves; but we've now seen a case where no such CFI is reached, making the loop uninterruptible. Even though that's from a recently-introduced bug, it seems prudent to install a CFI at the loop level in all branches. Per discussion of bug #17558 from Andrew Kesper (an actual fix for that bug will follow). Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
*	Reduce test runtime of src/test/modules/snapshot_too_old.	Tom Lane	2022-08-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sto_using_cursor and sto_using_select tests were coded to exercise every permutation of their test steps, but AFAICS there is no value in exercising more than one. This matters because each permutation costs about six seconds, thanks to the "pg_sleep(6)". Perhaps we could reduce that, but the useless permutations seem worth getting rid of in any case. (Note that sto_using_hash_index got it right already.) While here, clean up some other sloppiness such as an unused table. This doesn't make too much difference in interactive testing, since the wasted time is typically masked by parallelization with other tests. However, the buildfarm runs this as a serial step, which means we can expect to shave ~40 seconds from every buildfarm run. That makes it worth back-patching. Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us
*	Check maximum number of columns in function RTEs, too.	Tom Lane	2022-08-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I thought commit fd96d14d9 had plugged all the holes of this sort, but no, function RTEs could produce oversize tuples too, either via long coldeflists or just from multiple functions in one RTE. (I'm pretty sure the other variants of base RTEs aren't a problem, because they ultimately refer to either a table or a sub-SELECT, whose widths are enforced elsewhere. But we explicitly allow join RTEs to be overwidth, as long as you don't try to form their tuple result.) Per further discussion of bug #17561. As before, patch all branches. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
*	Fix error reporting after ioctl() call with pg_upgrade --clone	Michael Paquier	2022-08-01
\| \| \| \| \| \| \| \| \| \| \| \|	errno was not reported correctly after attempting to clone a file, leading to incorrect error reports. While scanning through the code, I have not noticed any similar mistakes. Error introduced in 3a769d8. Author: Justin Pryzby Discussion: https://postgr.es/m/20220731134135.GY15006@telsasoft.com Backpatch-through: 12
*	Fix new recovery test for log_error_verbosity=verbose case	Andrew Dunstan	2022-07-29
\| \| \| \| \| \| \| \| \|	The new test is from commit 9e4f914b5e. With this setting messages have SQL error numbers included, so that needs to be provided for in the pattern looked for. Backpatch to all live branches like the original.
*	In transformRowExpr(), check for too many columns in the row.	Tom Lane	2022-07-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A RowExpr with more than MaxTupleAttributeNumber columns would fail at execution anyway, since we cannot form a tuple datum with more than that many columns. While heap_form_tuple() has a check for too many columns, it emerges that there are some intermediate bits of code that don't check and can be driven to failure with sufficiently many columns. Checking this at parse time seems like the most appropriate place to install a defense, since we already check SELECT list length there. While at it, make the SELECT-list-length error use the same errcode (TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic PROGRAM_LIMIT_EXCEEDED. Per bug #17561 from Egor Chindyaskin. The given test case crashes in all supported branches (and probably a lot further back), so patch all. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
*	Fix test instability	Alvaro Herrera	2022-07-29
\| \| \| \| \| \| \| \| \| \|	On FreeBSD, the new test fails due to a WAL file being removed before the standby has had the chance to copy it. Fix by adding a replication slot to prevent the removal until after the standby has connected. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reported-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com
*	Fix replay of create database records on standby	Alvaro Herrera	2022-07-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Crash recovery on standby may encounter missing directories when replaying database-creation WAL records. Prior to this patch, the standby would fail to recover in such a case; however, the directories could be legitimately missing. Consider the following sequence of commands: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, crash recovery must be able to continue. A fix for this problem was already attempted in 49d9cfc68bf4, but it was reverted because of design issues. This new version is based on Robert Haas' proposal: any missing tablespaces are created during recovery before reaching consistency. Tablespaces are created as real directories, and should be deleted by later replay. CheckRecoveryConsistency ensures they have disappeared. The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Author: Paul Guo <paulguo@gmail.com> Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions) Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions) Reviewed-by: Michaël Paquier <michael@paquier.xyz> Diagnosed-by: Paul Guo <paulguo@gmail.com> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
*	Allow "in place" tablespaces.	Alvaro Herrera	2022-07-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a backpatch to branches 10-14 of the following commits: 7170f2159fb2 Allow "in place" tablespaces. c6f2f01611d4 Fix pg_basebackup with in-place tablespaces. f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces 7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location() 5344723755bd Remove unnecessary Windows-specific basebackup code. In-place tablespaces were introduced as a testing helper mechanism, but they are going to be used for a bugfix in WAL replay to be backpatched to all stable branches. I (Álvaro) had to adjust some code to account for lack of get_dirent_type() in branches prior to 14. Author: Thomas Munro <thomas.munro@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
*	Force immediate commit after CREATE DATABASE etc in extended protocol.	Tom Lane	2022-07-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
*	Fix ruleutils issues with dropped cols in functions-returning-composite.	Tom Lane	2022-07-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
*	Fix assertion failure and segmentation fault in backup code.	Fujii Masao	2022-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a non-exclusive backup is canceled, do_pg_abort_backup() is called and resets some variables set by pg_backup_start (pg_start_backup in v14 or before). But previously it forgot to reset the session state indicating whether a non-exclusive backup is in progress or not in this session. This issue could cause an assertion failure when the session running BASE_BACKUP is terminated after it executed pg_backup_start and pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause a segmentation fault when pg_backup_stop is called after BASE_BACKUP in the same session is canceled. This commit fixes the issue by making do_pg_abort_backup reset that session state. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
*	Prevent BASE_BACKUP in the middle of another backup in the same session.	Fujii Masao	2022-07-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiple non-exclusive backups are able to be run conrrently in different sessions. But, in the same session, only one non-exclusive backup can be run at the same moment. If pg_backup_start (pg_start_backup in v14 or before) is called in the middle of another non-exclusive backup in the same session, an error is thrown. However, previously, in logical replication walsender mode, even if that walsender session had already called pg_backup_start and started a non-exclusive backup, it could execute BASE_BACKUP command and start another non-exclusive backup. Which caused subsequent pg_backup_stop to throw an error because BASE_BACKUP unexpectedly reset the session state marked by pg_backup_start. This commit prevents BASE_BACKUP command in the middle of another non-exclusive backup in the same session. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
*	Re-add SPICleanup for ABI compatibility in stable branch	Peter Eisentraut	2022-07-18
\| \| \| \| \| \| \| \|	This fixes an ABI break introduced by 293f5c5f496cd8ce87c65b393613da675fc0bb8d. Author: Markus Wanner <markus.wanner@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com
*	Make dsm_impl_posix_resize more future-proof.	Thomas Munro	2022-07-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4518c798 blocks signals for a short region of code, but it assumed that whatever called it had the signal mask set to UnBlockSig on entry. That may be true today (or may even not be, in extensions in the wild), but it would be better not to make that assumption. We should save-and-restore the caller's signal mask. The PG_SETMASK() portability macro couldn't be used for that, which is why it wasn't done before. But... considering that commit a65e0864 established back in 9.6 that supported POSIX systems have sigprocmask(), and that this is POSIX-only code, there is no reason not to use standard sigprocmask() directly to achieve that. Back-patch to all supported releases, like 4518c798 and 80845b7c. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com
*	Don't clobber postmaster sigmask in dsm_impl_resize.	Thomas Munro	2022-07-15
\| \| \| \| \| \| \| \| \| \| \| \|	Commit 4518c798 intended to block signals in regular backends that allocate DSM segments, but dsm_impl_resize() is also reached by dsm_postmaster_startup(). It's not OK to clobber the postmaster's signal mask, so only manipulate the signal mask when under the postmaster. Back-patch to all releases, like 4518c798. Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com
*	Block signals while allocating DSM memory.	Thomas Munro	2022-07-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Linux, we call posix_fallocate() on shm_open()'d memory to avoid later potential SIGBUS (see commit 899bd785). Based on field reports of systems stuck in an EINTR retry loop there, there, we made it possible to break out of that loop via slightly odd coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from the loop (see commit 422952ee). On further reflection, that was not a great choice for at least two reasons: 1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing and the EINTR error would be surfaced to the user. 2. If EINTR was reported but neither QueryCancelPending nor ProcDiePending was set, then we'd dutifully retry, but with a bit more understanding of how posix_fallocate() works, it's now clear that you can get into a loop that never terminates. posix_fallocate() is not a function that can do some of the job and tell you about progress if it's interrupted, it has to undo what it's done so far and report EINTR, and if signals keep arriving faster than it can complete (cf recovery conflict signals), you're stuck. Therefore, for now, we'll simply block most signals to guarantee progress. SIGQUIT is not blocked (see InitPostmasterChild()), because its expected handler doesn't return, and unblockable signals like SIGCONT are not expected to arrive at a high rate. For good measure, we'll include the ftruncate() call in the blocked region, and add a retry loop. Back-patch to all supported releases. Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Nicola Contu <nicola.contu@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
*	Fix lock assertions in dshash.c.	Thomas Munro	2022-07-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dshash.c previously maintained flags to be able to assert that you didn't hold any partition lock. These flags could get out of sync with reality in error scenarios. Get rid of all that, and make assertions about the locks themselves instead. Since LWLockHeldByMe() loops internally, we don't want to put that inside another loop over all partition locks. Introduce a new debugging-only interface LWLockAnyHeldByMe() to avoid that. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system, and later showed up in reality while working on commit 389869af. Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
*	Fix \watch's interaction with libedit on ^C.	Thomas Munro	2022-07-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as well as sending an interrupt signal (depending on stty settings). At least libedit (but maybe also libreadline) is then confused about the current cursor location, and corrupts the display if you try to scroll back. Fix, by moving to a new line before the next prompt is displayed. Back-patch to all supported released. Author: Pavel Stehule <pavel.stehule@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us
*	Fix alias matching in transformLockingClause().	Dean Rasheed	2022-07-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When locking a specific named relation for a FOR [KEY] UPDATE/SHARE clause, transformLockingClause() finds the relation to lock by scanning the rangetable for an RTE with a matching eref->aliasname. However, it failed to account for the visibility rules of a join RTE. If a join RTE doesn't have a user-supplied alias, it will have a generated eref->aliasname of "unnamed_join" that is not visible as a relation name in the parse namespace. Such an RTE needs to be skipped, otherwise it might be found in preference to a regular base relation with a user-supplied alias of "unnamed_join", preventing it from being locked. In addition, if a join RTE doesn't have a user-supplied alias, but does have a join_using_alias, then the RTE needs to be matched using that alias rather than the generated eref->aliasname, otherwise a misleading "relation not found" error will be reported rather than a "join cannot be locked" error. Backpatch all the way, except for the second part which only goes back to 14, where JOIN USING aliases were added. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
*	Remove %error-verbose directive from jsonpath parser	Andrew Dunstan	2022-07-03
\| \| \| \| \| \| \| \| \| \| \|	None of the other bison parsers contains this directive, and it gives rise to some unfortunate and impenetrable messages, so just remove it. Backpatch to release 12, where it was introduced. Per gripe from Erik Rijkers Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl
*	Fix previous commit's ecpg_clocale for ppc Darwin.	Noah Misch	2022-07-02
\| \| \| \| \| \| \| \| \| \|	Per buildfarm member prairiedog, this platform rejects uninitialized global variables in shared libraries. Back-patch to v10, like the addition of the variable. Reviewed by Tom Lane. Discussion: https://postgr.es/m/20220703030619.GB2378460@rfd.leadboat.com
*	ecpglib: call newlocale() once per process.	Noah Misch	2022-07-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ecpglib has been calling it once per SQL query and once per EXEC SQL GET DESCRIPTOR. Instead, if newlocale() has not succeeded before, call it while establishing a connection. This mitigates three problems: - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently proceeded without the intended locale change. - On AIX, each newlocale()+freelocale() cycle leaked memory. - newlocale() CPU usage may have been nontrivial. Fail the connection attempt if newlocale() fails. Rearrange ecpg_do_prologue() to validate the connection before its uselocale(). The sort of program that may regress is one running in an environment where newlocale() fails. If that program establishes connections without running SQL statements, it will stop working in response to this change. I'm betting against the importance of such an ECPG use case. Most SQL execution (any using ECPGdo()) has long required newlocale() success, so there's little a connection could do without newlocale(). Back-patch to v10 (all supported versions). Reviewed by Tom Lane. Reported by Guillaume Lelarge. Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
*	Harden dsm_impl.c against unexpected EEXIST.	Thomas Munro	2022-07-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we trusted the OS not to report EEXIST unless we'd passed in IPC_CREAT \| IPC_EXCL or O_CREAT \| O_EXCL, as appropriate. Solaris's shm_open() can in fact do that, causing us to crash because we didn't ereport and then we blithely assumed the mapping was successful. Let's treat EEXIST just like any other error, unless we're actually trying to create a new segment. This applies to shm_open(), where this behavior has been seen, and also to the equivalent operations for our sysv and mmap modes just on principle. Based on the underlying reason for the error, namely contention on a lock file managed by Solaris librt for each distinct name, this problem is only likely to happen on 15 and later, because the new shared memory stats system produces shm_open() calls for the same path from potentially large numbers of backends concurrently during authentication. Earlier releases only shared memory segments between a small number of parallel workers under one Gather node. You could probably hit it if you tried hard enough though, and we should have been more defensive in the first place. Therefore, back-patch to all supported releases. Per build farm animal margay. This isn't the end of the story, though, it just changes random crashes into random "File exists" errors; more work needed for a green build farm. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
*	Fix visibility check when XID is committed in CLOG but not in procarray.	Heikki Linnakangas	2022-06-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
*	Fix PostgreSQL::Test aliasing for Perl v5.10.1.	Noah Misch	2022-06-25
\| \| \| \| \| \| \| \| \|	This Perl segfaults if a declaration of the to-be-aliased package precedes the aliasing itself. Per buildfarm members lapwing and wrasse. Like commit 20911775de4ab7ac3ecc68bd714cb3ed0fd68b6a, back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220625171533.GA2012493@rfd.leadboat.com
*	CREATE INDEX: use the original userid for more ACL checks.	Noah Misch	2022-06-25
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
*	For PostgreSQL::Test compatibility, alias entire package symbol tables.	Noah Misch	2022-06-25
\| \| \| \| \| \| \| \| \| \| \| \|	Remove the need to edit back-branch-specific code sites when back-patching the addition of a PostgreSQL::Test::Utils symbol. Replace per-symbol, incomplete alias lists. Give old and new package names the same EXPORT and EXPORT_OK semantics. Back-patch to v10 (all supported versions). Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20220622072144.GD4167527@rfd.leadboat.com