aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Add missing fields to _outConstraint()Peter Eisentraut2022-08-13
| | | | | | | | As of 897795240cfaaed724af2f53ed2c50c9862f951f, check constraints can be declared invalid. But that patch didn't update _outConstraint() to also show the relevant struct fields (which were only applicable to foreign keys before that). This currently only affects debugging output, so no impact in practice.
* Fix _outConstraint() for "identity" constraintsPeter Eisentraut2022-08-12
| | | | | | | | | The set of fields printed by _outConstraint() in the CONSTR_IDENTITY case didn't match the set of fields actually used in that case. (The code was probably uncarefully copied from the CONSTR_DEFAULT case.) Fix that by using the right set of fields. Since there is no read support for this node type, this is really just for debugging output right now, so it doesn't affect anything important.
* Fix catalog lookup with the wrong snapshot during logical decoding.Amit Kapila2022-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
* Fix handling of R/W expanded datums that are passed to SQL functions.Tom Lane2022-08-10
| | | | | | | | | | | | | | | | | | | fmgr_sql must make expanded-datum arguments read-only, because it's possible that the function body will pass the argument to more than one callee function. If one of those functions takes the datum's R/W property as license to scribble on it, then later callees will see an unexpected value, leading to wrong answers. From a performance standpoint, it'd be nice to skip this in the common case that the argument value is passed to only one callee. However, detecting that seems fairly hard, and certainly not something that I care to attempt in a back-patched bug fix. Per report from Adam Mackler. This has been broken since we invented expanded datums, so back-patch to all supported branches. Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
* In extensions, don't replace objects not belonging to the extension.Tom Lane2022-08-08
| | | | | | | | | | | | | | | | | | | | | | | Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625
* Translation updatesAlvaro Herrera2022-08-08
| | | | | Source-Git-URL: ssh://git@git.postgresql.org/pgtranslation/messages.git Source-Git-Hash: efdf4e068bcb504ef277413196f978621726bda5
* BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checkingAlvaro Herrera2022-08-05
| | | | | | | | | | | | | | | | That bit is unlogged and therefore it's wrong to consider it in WAL page comparison. Add a test that tickles the case, as branch testing technology allows. This has been a problem ever since wal consistency checking was introduced (commit a507b86900f6 for pg10), so backpatch to all supported branches. Author: 王海洋 (Haiyang Wang) <wanghaiyang.001@bytedance.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com
* Add HINT for restartpoint race with KeepFileRestoredFromArchive().Noah Misch2022-08-05
| | | | | | | | | | The five commits ending at cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 closed this race condition for v15+. For v14 through v10, add a HINT to discourage studying the cosmetic problem. Reviewed by Kyotaro Horiguchi and David Steele. Discussion: https://postgr.es/m/20220731061747.GA3692882@rfd.leadboat.com
* Fix ENABLE/DISABLE TRIGGER to handle recursion correctlyAlvaro Herrera2022-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db9b did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that 86f575948c77 had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of 86f575948c77 that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like bbb927b4db9b. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
* Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.Tom Lane2022-08-04
| | | | | | | | | | | | | Ordinarily the functions called in this loop ought to have plenty of CFIs themselves; but we've now seen a case where no such CFI is reached, making the loop uninterruptible. Even though that's from a recently-introduced bug, it seems prudent to install a CFI at the loop level in all branches. Per discussion of bug #17558 from Andrew Kesper (an actual fix for that bug will follow). Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
* Check maximum number of columns in function RTEs, too.Tom Lane2022-08-01
| | | | | | | | | | | | | | | I thought commit fd96d14d9 had plugged all the holes of this sort, but no, function RTEs could produce oversize tuples too, either via long coldeflists or just from multiple functions in one RTE. (I'm pretty sure the other variants of base RTEs aren't a problem, because they ultimately refer to either a table or a sub-SELECT, whose widths are enforced elsewhere. But we explicitly allow join RTEs to be overwidth, as long as you don't try to form their tuple result.) Per further discussion of bug #17561. As before, patch all branches. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
* In transformRowExpr(), check for too many columns in the row.Tom Lane2022-07-29
| | | | | | | | | | | | | | | | | | | | A RowExpr with more than MaxTupleAttributeNumber columns would fail at execution anyway, since we cannot form a tuple datum with more than that many columns. While heap_form_tuple() has a check for too many columns, it emerges that there are some intermediate bits of code that don't check and can be driven to failure with sufficiently many columns. Checking this at parse time seems like the most appropriate place to install a defense, since we already check SELECT list length there. While at it, make the SELECT-list-length error use the same errcode (TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic PROGRAM_LIMIT_EXCEEDED. Per bug #17561 from Egor Chindyaskin. The given test case crashes in all supported branches (and probably a lot further back), so patch all. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
* Fix replay of create database records on standbyAlvaro Herrera2022-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Crash recovery on standby may encounter missing directories when replaying database-creation WAL records. Prior to this patch, the standby would fail to recover in such a case; however, the directories could be legitimately missing. Consider the following sequence of commands: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, crash recovery must be able to continue. A fix for this problem was already attempted in 49d9cfc68bf4, but it was reverted because of design issues. This new version is based on Robert Haas' proposal: any missing tablespaces are created during recovery before reaching consistency. Tablespaces are created as real directories, and should be deleted by later replay. CheckRecoveryConsistency ensures they have disappeared. The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Author: Paul Guo <paulguo@gmail.com> Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions) Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions) Reviewed-by: Michaël Paquier <michael@paquier.xyz> Diagnosed-by: Paul Guo <paulguo@gmail.com> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
* Allow "in place" tablespaces.Alvaro Herrera2022-07-27
| | | | | | | | | | | | | | | | | | | | | | This is a backpatch to branches 10-14 of the following commits: 7170f2159fb2 Allow "in place" tablespaces. c6f2f01611d4 Fix pg_basebackup with in-place tablespaces. f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces 7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location() 5344723755bd Remove unnecessary Windows-specific basebackup code. In-place tablespaces were introduced as a testing helper mechanism, but they are going to be used for a bugfix in WAL replay to be backpatched to all stable branches. I (Álvaro) had to adjust some code to account for lack of get_dirent_type() in branches prior to 14. Author: Thomas Munro <thomas.munro@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
* Force immediate commit after CREATE DATABASE etc in extended protocol.Tom Lane2022-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
* Fix ruleutils issues with dropped cols in functions-returning-composite.Tom Lane2022-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
* Fix assertion failure and segmentation fault in backup code.Fujii Masao2022-07-20
| | | | | | | | | | | | | | | | | | | | | | When a non-exclusive backup is canceled, do_pg_abort_backup() is called and resets some variables set by pg_backup_start (pg_start_backup in v14 or before). But previously it forgot to reset the session state indicating whether a non-exclusive backup is in progress or not in this session. This issue could cause an assertion failure when the session running BASE_BACKUP is terminated after it executed pg_backup_start and pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause a segmentation fault when pg_backup_stop is called after BASE_BACKUP in the same session is canceled. This commit fixes the issue by making do_pg_abort_backup reset that session state. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
* Prevent BASE_BACKUP in the middle of another backup in the same session.Fujii Masao2022-07-20
| | | | | | | | | | | | | | | | | | | | | | | | Multiple non-exclusive backups are able to be run conrrently in different sessions. But, in the same session, only one non-exclusive backup can be run at the same moment. If pg_backup_start (pg_start_backup in v14 or before) is called in the middle of another non-exclusive backup in the same session, an error is thrown. However, previously, in logical replication walsender mode, even if that walsender session had already called pg_backup_start and started a non-exclusive backup, it could execute BASE_BACKUP command and start another non-exclusive backup. Which caused subsequent pg_backup_stop to throw an error because BASE_BACKUP unexpectedly reset the session state marked by pg_backup_start. This commit prevents BASE_BACKUP command in the middle of another non-exclusive backup in the same session. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
* Re-add SPICleanup for ABI compatibility in stable branchPeter Eisentraut2022-07-18
| | | | | | | | This fixes an ABI break introduced by 293f5c5f496cd8ce87c65b393613da675fc0bb8d. Author: Markus Wanner <markus.wanner@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com
* Make dsm_impl_posix_resize more future-proof.Thomas Munro2022-07-16
| | | | | | | | | | | | | | | | | | | | Commit 4518c798 blocks signals for a short region of code, but it assumed that whatever called it had the signal mask set to UnBlockSig on entry. That may be true today (or may even not be, in extensions in the wild), but it would be better not to make that assumption. We should save-and-restore the caller's signal mask. The PG_SETMASK() portability macro couldn't be used for that, which is why it wasn't done before. But... considering that commit a65e0864 established back in 9.6 that supported POSIX systems have sigprocmask(), and that this is POSIX-only code, there is no reason not to use standard sigprocmask() directly to achieve that. Back-patch to all supported releases, like 4518c798 and 80845b7c. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com
* Don't clobber postmaster sigmask in dsm_impl_resize.Thomas Munro2022-07-15
| | | | | | | | | | | | Commit 4518c798 intended to block signals in regular backends that allocate DSM segments, but dsm_impl_resize() is also reached by dsm_postmaster_startup(). It's not OK to clobber the postmaster's signal mask, so only manipulate the signal mask when under the postmaster. Back-patch to all releases, like 4518c798. Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com
* Block signals while allocating DSM memory.Thomas Munro2022-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Linux, we call posix_fallocate() on shm_open()'d memory to avoid later potential SIGBUS (see commit 899bd785). Based on field reports of systems stuck in an EINTR retry loop there, there, we made it possible to break out of that loop via slightly odd coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from the loop (see commit 422952ee). On further reflection, that was not a great choice for at least two reasons: 1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing and the EINTR error would be surfaced to the user. 2. If EINTR was reported but neither QueryCancelPending nor ProcDiePending was set, then we'd dutifully retry, but with a bit more understanding of how posix_fallocate() works, it's now clear that you can get into a loop that never terminates. posix_fallocate() is not a function that can do some of the job and tell you about progress if it's interrupted, it has to undo what it's done so far and report EINTR, and if signals keep arriving faster than it can complete (cf recovery conflict signals), you're stuck. Therefore, for now, we'll simply block most signals to guarantee progress. SIGQUIT is not blocked (see InitPostmasterChild()), because its expected handler doesn't return, and unblockable signals like SIGCONT are not expected to arrive at a high rate. For good measure, we'll include the ftruncate() call in the blocked region, and add a retry loop. Back-patch to all supported releases. Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Nicola Contu <nicola.contu@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
* Fix lock assertions in dshash.c.Thomas Munro2022-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | dshash.c previously maintained flags to be able to assert that you didn't hold any partition lock. These flags could get out of sync with reality in error scenarios. Get rid of all that, and make assertions about the locks themselves instead. Since LWLockHeldByMe() loops internally, we don't want to put that inside another loop over all partition locks. Introduce a new debugging-only interface LWLockAnyHeldByMe() to avoid that. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system, and later showed up in reality while working on commit 389869af. Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
* Fix alias matching in transformLockingClause().Dean Rasheed2022-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | When locking a specific named relation for a FOR [KEY] UPDATE/SHARE clause, transformLockingClause() finds the relation to lock by scanning the rangetable for an RTE with a matching eref->aliasname. However, it failed to account for the visibility rules of a join RTE. If a join RTE doesn't have a user-supplied alias, it will have a generated eref->aliasname of "unnamed_join" that is not visible as a relation name in the parse namespace. Such an RTE needs to be skipped, otherwise it might be found in preference to a regular base relation with a user-supplied alias of "unnamed_join", preventing it from being locked. In addition, if a join RTE doesn't have a user-supplied alias, but does have a join_using_alias, then the RTE needs to be matched using that alias rather than the generated eref->aliasname, otherwise a misleading "relation not found" error will be reported rather than a "join cannot be locked" error. Backpatch all the way, except for the second part which only goes back to 14, where JOIN USING aliases were added. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
* Remove %error-verbose directive from jsonpath parserAndrew Dunstan2022-07-03
| | | | | | | | | | | None of the other bison parsers contains this directive, and it gives rise to some unfortunate and impenetrable messages, so just remove it. Backpatch to release 12, where it was introduced. Per gripe from Erik Rijkers Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl
* Harden dsm_impl.c against unexpected EEXIST.Thomas Munro2022-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we trusted the OS not to report EEXIST unless we'd passed in IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate. Solaris's shm_open() can in fact do that, causing us to crash because we didn't ereport and then we blithely assumed the mapping was successful. Let's treat EEXIST just like any other error, unless we're actually trying to create a new segment. This applies to shm_open(), where this behavior has been seen, and also to the equivalent operations for our sysv and mmap modes just on principle. Based on the underlying reason for the error, namely contention on a lock file managed by Solaris librt for each distinct name, this problem is only likely to happen on 15 and later, because the new shared memory stats system produces shm_open() calls for the same path from potentially large numbers of backends concurrently during authentication. Earlier releases only shared memory segments between a small number of parallel workers under one Gather node. You could probably hit it if you tried hard enough though, and we should have been more defensive in the first place. Therefore, back-patch to all supported releases. Per build farm animal margay. This isn't the end of the story, though, it just changes random crashes into random "File exists" errors; more work needed for a green build farm. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
* Fix visibility check when XID is committed in CLOG but not in procarray.Heikki Linnakangas2022-06-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
* CREATE INDEX: use the original userid for more ACL checks.Noah Misch2022-06-25
| | | | | | | | | | | | | Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
* Fix memory leak due to LogicalRepRelMapEntry.attrmap.Amit Kapila2022-06-23
| | | | | | | | | | | | | | When rebuilding the relation mapping on subscribers, we were not releasing the attribute mapping's memory which was no longer required. The attribute mapping used in logical tuple conversion was refactored in PG13 (by commit e1551f96e6) but we forgot to update the related code that frees the attribute map. Author: Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila, Shi yu Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Fix SPI's handling of errors during transaction commit.Tom Lane2022-06-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. This is a back-patch of commit 2e517818f. That's now aged long enough to reduce the concerns about whether it will break something, and we do need to ensure that supported branches will work with Python 3.11. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
* Un-break whole-row Vars referencing domain-over-composite types.Tom Lane2022-06-10
| | | | | | | | | | | | | | | | | In commit ec62cb0aa, I foolishly replaced ExecEvalWholeRowVar's lookup_rowtype_tupdesc_domain call with just lookup_rowtype_tupdesc, because I didn't see how a domain could be involved there, and there were no regression test cases to jog my memory. But the existing code was correct, so revert that change and add a test case showing why it's necessary. (Note: per comment in struct DatumTupleFields, it is correct to produce an output tuple that's labeled with the base composite type, not the domain; hence just blindly looking through the domain is correct here.) Per bug #17515 from Dan Kubb. Back-patch to v11 where domains over composites became a thing. Discussion: https://postgr.es/m/17515-a24737438363aca0@postgresql.org
* Fix off-by-one loop termination condition in pg_stat_get_subscription().Tom Lane2022-06-07
| | | | | | | | | | | | | | | | pg_stat_get_subscription scanned one more LogicalRepWorker array entry than is really allocated. In the worst case this could lead to SIGSEGV, if the LogicalRepCtx data structure is near the end of shared memory. That seems quite unlikely though (thanks to the ordering of calls in CreateSharedMemoryAndSemaphores) and we've heard no field reports of it. A more likely misbehavior is one row of garbage data in the function's result, but even that is not real likely because of the check that the pid field matches some live backend. Report and fix by Kuntal Ghosh. This bug is old, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com
* Silence compiler warnings from some older compilers.Tom Lane2022-06-01
| | | | | | | | | | | | Since a117cebd6, some older gcc versions issue "variable may be used uninitialized in this function" complaints for brin_summarize_range. Silence that using the same coding pattern as in bt_index_check_internal; arguably, a117cebd6 had too narrow a view of which compilers might give trouble. Nathan Bossart and Tom Lane. Back-patch as the previous commit was. Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13
* Ensure ParseTzFile() closes the input file after failing.Tom Lane2022-05-31
| | | | | | | | | | | | | | | We hadn't noticed this because (a) few people feed invalid timezone abbreviation files to the server, and (b) in typical scenarios guc.c would throw ereport(ERROR) and then transaction abort handling would silently clean up the leaked file reference. However, it was possible to observe file leakage warnings if one breaks an already-active abbreviation file, because guc.c does not throw ERROR when loading supposedly-validated settings during session start or SIGHUP processing. Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me) Discussion: https://postgr.es/m/20220530.173740.748502979257582392.horikyota.ntt@gmail.com
* Handle NULL for short descriptions of custom GUC variablesMichael Paquier2022-05-28
| | | | | | | | | | | | | | | If a short description is specified as NULL in one of the various DefineCustomXXXVariable() functions available to external modules to define a custom parameter, SHOW ALL would crash. This change teaches SHOW ALL to properly handle NULL short descriptions, as well as any code paths that manipulate it, to gain in flexibility. Note that help_config.c was already able to do that, when describing a set of GUCs for postgres --describe-config. Author: Steve Chavez Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com Backpatch-through: 10
* Remove misguided SSL key file ownership check in libpq.Tom Lane2022-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits a59c79564 et al. tried to sync libpq's SSL key file permissions checks with what we've used for years in the backend. We did not intend to create any new failure cases, but it turns out we did: restricting the key file's ownership breaks cases where the client is allowed to read a key file despite not having the identical UID. In particular a client running as root used to be able to read someone else's key file; and having seen that I suspect that there are other, less-dubious use cases that this restriction breaks on some platforms. We don't really need an ownership check, since if we can read the key file despite its having restricted permissions, it must have the right ownership --- under normal conditions anyway, and the point of this patch is that any additional corner cases where that works should be deemed allowable, as they have been historically. Hence, just drop the ownership check, and rearrange the permissions check to get rid of its faulty assumption that geteuid() can't be zero. (Note that the comparable backend-side code doesn't have to cater for geteuid() == 0, since the server rejects that very early on.) This does have the end result that the permissions safety check used for a root user's private key file is weaker than that used for anyone else's. While odd, root really ought to know what she's doing with file permissions, so I think this is acceptable. Per report from Yogendra Suralkar. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com
* Show 'AS "?column?"' explicitly when it's important.Tom Lane2022-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ruleutils.c was coded to suppress the AS label for a SELECT output expression if the column name is "?column?", which is the parser's fallback if it can't think of something better. This is fine, and avoids ugly clutter, so long as (1) nothing further up in the parse tree relies on that column name or (2) the same fallback would be assigned when the rule or view definition is reloaded. Unfortunately (2) is far from certain, both because ruleutils.c might print the expression in a different form from how it was originally written and because FigureColname's rules might change in future releases. So we shouldn't rely on that. Detecting exactly whether there is any outer-level use of a SELECT column name would be rather expensive. This patch takes the simpler approach of just passing down a flag indicating whether there *could* be any outer use; for example, the output column names of a SubLink are not referenceable, and we also do not care about the names exposed by the right-hand side of a setop. This is sufficient to suppress unwanted clutter in all but one case in the regression tests. That seems like reasonable evidence that it won't be too much in users' faces, while still fixing the cases we need to fix. Per bug #17486 from Nicolas Lutic. This issue is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
* Fix DDL deparse of CREATE OPERATOR CLASSAlvaro Herrera2022-05-20
| | | | | | | | | | | | | When an implicit operator family is created, it wasn't getting reported. Make it do so. This has always been missing. Backpatch to 10. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: Leslie LEMAIRE <leslie.lemaire@developpement-durable.gouv.fr> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquiër <michael@paquier.xyz> Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr
* Check column list length in XMLTABLE/JSON_TABLE aliasAlvaro Herrera2022-05-18
| | | | | | | | | | | | | | | | | We weren't checking the length of the column list in the alias clause of an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was possible to make the server crash by passing an overly long one. Fix it by throwing an error in that case, like the other places that deal with alias lists. In passing, modify the equivalent test used for join RTEs to look like the other ones, which was different for no apparent reason. This bug came in when XMLTABLE was born in version 10; backpatch to all stable versions. Reported-by: Wang Ke <krking@zju.edu.cn> Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org
* Fix control file update done in restartpoints still running after promotionMichael Paquier2022-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a cluster is promoted (aka the control file shows a state different than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still processing, this function could miss an update of the control file for "checkPoint" and "checkPointCopy" but still do the recycling and/or removal of the past WAL segments, assuming that the to-be-updated LSN values should be used as reference points for the cleanup. This causes a follow-up restart attempting crash recovery to fail with a PANIC on a missing checkpoint record if the end-of-recovery checkpoint triggered by the promotion did not complete while the cluster abruptly stopped or crashed before the completion of this checkpoint. The PANIC would be caused by the redo LSN referred in the control file as located in a segment already gone, recycled by the previous restartpoint with "checkPoint" out-of-sync in the control file. This commit fixes the update of the control file during restartpoints so as "checkPoint" and "checkPointCopy" are updated even if the cluster has been promoted while a restartpoint is running, to be on par with the set of WAL segments actually recycled in the end of CreateRestartPoint(). 7863ee4 has fixed this problem already on master, but the release timing of the latest point versions did not let me enough time to study and fix that on all the stable branches. Reported-by: Fujii Masao, Rui Zhao Author: Kyotaro Horiguchi Reviewed-by: Nathan Bossart, Michael Paquier Discussion: https://postgr.es/m/20220316.102444.2193181487576617583.horikyota.ntt@gmail.com Backpatch-through: 10
* Make pull_var_clause() handle GroupingFuncs exactly like Aggrefs.Tom Lane2022-05-12
| | | | | | | | | | | | | | | | | This follows in the footsteps of commit 2591ee8ec by removing one more ill-advised shortcut from planning of GroupingFuncs. It's true that we don't intend to execute the argument expression(s) at runtime, but we still have to process any Vars appearing within them, or we risk failure at setrefs.c time (or more fundamentally, in EXPLAIN trying to print such an expression). Vars in upper plan nodes have to have referents in the next plan level, whether we ever execute 'em or not. Per bug #17479 from Michael J. Sullivan. Back-patch to all supported branches. Richard Guo Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org
* Fix the logical replication timeout during large transactions.Amit Kapila2022-05-11
| | | | | | | | | | | | | | | | | | | The problem is that we don't send keep-alive messages for a long time while processing large transactions during logical replication where we don't send any data of such transactions. This can happen when the table modified in the transaction is not published or because all the changes got filtered. We do try to send the keep_alive if necessary at the end of the transaction (via WalSndWriteData()) but by that time the subscriber-side can timeout and exit. To fix this we try to send the keepalive message if required after processing certain threshold of changes. Reported-by: Fabrice Chapuis Author: Wang wei and Amit Kapila Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda Backpatch-through: 10 Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com
* Fix core dump in transformValuesClause when there are no columns.Tom Lane2022-05-09
| | | | | | | | | | | | The parser code that transformed VALUES from row-oriented to column-oriented lists failed if there were zero columns. You can't write that straightforwardly (though probably you should be able to), but the case can be reached by expanding a "tab.*" reference to a zero-column table. Per bug #17477 from Wang Ke. Back-patch to all supported branches. Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org
* Revert "Disallow infinite endpoints in generate_series() for timestamps."Tom Lane2022-05-09
| | | | | | | | | | | | | | | | | | | This reverts commit eafdf9de06e9b60168f5e47cedcfceecdc6d4b5f and its back-branch counterparts. Corey Huinker pointed out that we'd discussed this exact change back in 2016 and rejected it, on the grounds that there's at least one usage pattern with LIMIT where an infinite endpoint can usefully be used. Perhaps that argument needs to be re-litigated, but there's no time left before our back-branch releases. To keep our options open, restore the status quo ante; if we do end up deciding to change things, waiting one more quarter won't hurt anything. Rather than just doing a straight revert, I added a new test case demonstrating the usage with LIMIT. That'll at least remind us of the issue if we forget again. Discussion: https://postgr.es/m/3603504.1652068977@sss.pgh.pa.us Discussion: https://postgr.es/m/CADkLM=dzw0Pvdqp5yWKxMd+VmNkAMhG=4ku7GnCZxebWnzmz3Q@mail.gmail.com
* In REFRESH MATERIALIZED VIEW, set user ID before running user code.Noah Misch2022-05-09
| | | | | | | | | | It intended to, but did not, achieve this. Adopt the new standard of setting user ID just after locking the relation. Back-patch to v10 (all supported versions). Reviewed by Simon Riggs. Reported by Alvaro Herrera. Security: CVE-2022-1552
* Make relation-enumerating operations be security-restricted operations.Noah Misch2022-05-09
| | | | | | | | | | | | | | | | | | When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552
* Translation updatesPeter Eisentraut2022-05-09
| | | | | Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 4a507135ecc39274887f0f0ce760f964f1725579
* Fix possibility of self-deadlock in ResolveRecoveryConflictWithBufferPin().Andres Freund2022-05-02
| | | | | | | | | | | | | | | | | | | | | The tests added in 9f8a050f68d failed nearly reliably on FreeBSD in CI, and occasionally on the buildfarm. That turns out to be caused not by a bug in the test, but by a longstanding bug in recovery conflict handling. The standby timeout handler, used by ResolveRecoveryConflictWithBufferPin(), executed SendRecoveryConflictWithBufferPin() inside a signal handler. A bad idea, because the deadlock timeout handler (or a spurious latch set) could have interrupted ProcWaitForSignal(). If unlucky that could cause a self-deadlock on ProcArrayLock, if the deadlock check is in SendRecoveryConflictWithBufferPin()->CancelDBBackends(). To fix, set a flag in StandbyTimeoutHandler(), and check the flag in ResolveRecoveryConflictWithBufferPin(). Subsequently the recovery conflict tests will be backpatched. Discussion: https://postgr.es/m/20220413002626.udl7lll7f3o7nre7@alap3.anarazel.de Backpatch: 10-
* Fix typo in comment.Etsuro Fujita2022-05-02
|
* Remove inadequate assertion check in CTE inlining.Tom Lane2022-04-21
| | | | | | | | | | | | | | | | | | | | inline_cte() expected to find exactly as many references to the target CTE as its cterefcount indicates. While that should be accurate for the tree as emitted by the parser, there are some optimizations that occur upstream of here that could falsify it, notably removal of unused subquery output expressions. Trying to make the accounting 100% accurate seems expensive and doomed to future breakage. It's not really worth it, because all this code is protecting is downstream assumptions that every referenced CTE has a plan. Let's convert those assertions to regular test-and-elog just in case there's some actual problem, and then drop the failing assertion. Per report from Tomas Vondra (thanks also to Richard Guo for analysis). Back-patch to v12 where the faulty code came in. Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com