aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor
Commit message (Collapse)AuthorAge
* Remove HeapBitmapScan's skip_fetch optimizationAndres Freund2025-04-02
| | | | | | | | | | | | | | | | | | | | | The optimization does not take the removal of TIDs by a concurrent vacuum into account. The concurrent vacuum can remove dead TIDs and make pages ALL_VISIBLE while those dead TIDs are referenced in the bitmap. This can lead to a skip_fetch scan returning too many tuples. It likely would be possible to implement this optimization safely, but we don't have the necessary infrastructure in place. Nor is it clear that it's worth building that infrastructure, given how limited the skip_fetch optimization is. In the backbranches we just disable the optimization by always passing need_tuples=true to table_beginscan_bm(). We can't perform API/ABI changes in the backbranches and we want to make the change as minimal as possible. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reported-By: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/CAEze2Wg3gXXZTr6_rwC+s4-o2ZVFB5F985uUSgJTsECx6AmGcQ@mail.gmail.com Backpatch-through: 13
* Fix a few more redundant calls of GetLatestSnapshot()Heikki Linnakangas2025-03-10
| | | | | | | | | | Commit 2367503177 fixed this in RelationFindReplTupleByIndex(), but I missed two other similar cases. Per report from Ranier Vilela. Discussion: https://www.postgresql.org/message-id/CAEudQArUT1dE45WN87F-Gb7XMy_hW6x1DFd3sqdhhxP-RMDa0Q@mail.gmail.com Backpatch-through: 13
* Fix snapshot used in logical replication index lookupHeikki Linnakangas2025-03-10
| | | | | | | | | | | | | | | | | | | | | | | | The function calls GetLatestSnapshot() to acquire a fresh snapshot, makes it active, and was meant to pass it to table_tuple_lock(), but instead called GetLatestSnapshot() again to acquire yet another snapshot. It was harmless because the heap AM and all other known table AMs ignore the 'snapshot' argument anyway, but let's be tidy. In the long run, this perhaps should be redesigned so that snapshot was not needed in the first place. The table AM API uses TID + snapshot as the unique identifier for the row version, which is questionable when the row came from an index scan with a Dirty snapshot. You might lock a different row version when you use a different snapshot in the table_tuple_lock() call (a fresh MVCC snapshot) than in the index scan (DirtySnapshot). However, in the heap AM and other AMs where the TID alone identifies the row version, it doesn't matter. So for now, just fix the obvious albeit harmless bug. This has been wrong ever since the table AM API was introduced in commit 5db6df0c01, so backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/83d243d6-ad8d-4307-8b51-2ee5844f6230@iki.fi Backpatch-through: 13
* Fix assertion on dereferenced objectDaniel Gustafsson2025-02-14
| | | | | | | | | | | | | Commit 27cc7cd2bc8a accidentally placed the assertion ensuring that the pointer isn't NULL after it had already been accessed. Fix by moving the pointer dereferencing to after the assertion. Backpatch to all supported branches. Author: Dmitry Koval <d.koval@postgrespro.ru> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/1618848d-cdc7-414b-9c03-08cf4bef4408@postgrespro.ru Backpatch-through: 13
* Fix Assert failure in WITH RECURSIVE UNION queriesDavid Rowley2024-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the non-recursive part of a recursive CTE ended up using TTSOpsBufferHeapTuple as the table slot type, then a duplicate value could cause an Assert failure in CheckOpSlotCompatibility() when checking the hash table for the duplicate value. The expected slot type for the deform step was TTSOpsMinimalTuple so the Assert failed when the TTSOpsBufferHeapTuple slot was used. This is a long-standing bug which we likely didn't notice because it seems much more likely that the non-recursive term would have required projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility is ok with. There doesn't seem to be any harm done here other than the Assert failure. Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types require tuple deformation, so the EEOP_*_FETCHSOME ExprState step would have properly existed in the ExprState. The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops' parameter. This means the ExprState's EEOP_*_FETCHSOME step won't expect a fixed slot type. This makes CheckOpSlotCompatibility() happy as no checking is performed when the ExprEvalStep is not expecting a fixed slot type. Reported-by: Richard Guo Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com Backpatch-through: 13, all supported versions
* Simplify executor's determination of whether to use parallelism.Tom Lane2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our parallel-mode code only works when we are executing a query in full, so ExecutePlan must disable parallel mode when it is asked to do partial execution. The previous logic for this involved passing down a flag (variously named execute_once or run_once) from callers of ExecutorRun or PortalRun. This is overcomplicated, and unsurprisingly some of the callers didn't get it right, since it requires keeping state that not all of them have handy; not to mention that the requirements for it were undocumented. That led to assertion failures in some corner cases. The only state we really need for this is the existing QueryDesc.already_executed flag, so let's just put all the responsibility in ExecutePlan. (It could have been done in ExecutorRun too, leading to a slightly shorter patch -- but if there's ever more than one caller of ExecutePlan, it seems better to have this logic in the subroutine than the callers.) This makes those ExecutorRun/PortalRun parameters unnecessary. In master it seems okay to just remove them, returning the API for those functions to what it was before parallelism. Such an API break is clearly not okay in stable branches, but for them we can just leave the parameters in place after documenting that they do nothing. Per report from Yugo Nagata, who also reviewed and tested this patch. Back-patch to all supported branches. Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
* Fix NULLIF()'s handling of read-write expanded objects.Tom Lane2024-11-25
| | | | | | | | | | | | | | | | | | | | | | | If passed a read-write expanded object pointer, the EEOP_NULLIF code would hand that same pointer to the equality function and then (unless equality was reported) also return the same pointer as its value. This is no good, because a function that receives a read-write expanded object pointer is fully entitled to scribble on or even delete the object, thus corrupting the NULLIF output. (This problem is likely unobservable with the equality functions provided in core Postgres, but it's easy to demonstrate with one coded in plpgsql.) To fix, make sure the pointer passed to the equality function is read-only. We can still return the original read-write pointer as the NULLIF result, allowing optimization of later operations. Per bug #18722 from Alexander Lakhin. This has been wrong since we invented expanded objects, so back-patch to all supported branches. Discussion: https://postgr.es/m/18722-fd9e645448cc78b4@postgresql.org
* Ensure cached plans are correctly marked as dependent on role.Nathan Bossart2024-11-11
| | | | | | | | | | | | | | If a CTE, subquery, sublink, security invoker view, or coercion projection references a table with row-level security policies, we neglected to mark the plan as potentially dependent on which role is executing it. This could lead to later executions in the same session returning or hiding rows that should have been hidden or returned instead. Reported-by: Wolfgang Walther Reviewed-by: Noah Misch Security: CVE-2024-10976 Backpatch-through: 12
* Improve fix for not entering parallel mode when holding interrupts.Tom Lane2024-11-08
| | | | | | | | | | | | | | | | | | | | | | | Commit ac04aa84a put the shutoff for this into the planner, which is not ideal because it doesn't prevent us from re-using a previously made parallel plan. Revert the planner change and instead put the shutoff into InitializeParallelDSM, modeling it on the existing code there for recovering from failure to allocate a DSM segment. However, that code path is mostly untested, and testing a bit harder showed there's at least one bug: ExecHashJoinReInitializeDSM is not prepared for us to have skipped doing parallel DSM setup. I also thought the Assert in ReinitializeParallelWorkers is pretty ill-advised, and replaced it with a silent Min() operation. The existing test case added by ac04aa84a serves fine to test this version of the fix, so no change needed there. Patch by me, but thanks to Noah Misch for the core idea that we could shut off worker creation when !INTERRUPTS_CAN_BE_PROCESSED. Back-patch to v12, as ac04aa84a was. Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
* Remove unnecessary word in a commentAmit Langote2024-10-23
| | | | | | | | | Relations opened by the executor are only closed once in ExecCloseRangeTableRelations(), so the word "again" in the comment for ExecGetRangeTableRelation() is misleading and unnecessary. Discussion: https://postgr.es/m/CA+HiwqHnw-zR+u060i3jp4ky5UR0CjByRFQz50oZ05de7wUg=Q@mail.gmail.com Backpatch-through: 12
* Fix extreme skew detection in Parallel Hash Join.Thomas Munro2024-10-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After repartitioning the inner side of a hash join that would have exceeded the allowed size, we check if all the tuples from a parent partition moved to one child partition. That is evidence that it contains duplicate keys and later attempts to repartition will also fail, so we should give up trying to limit memory (for lack of a better fallback strategy). A thinko prevented the check from working correctly in partition 0 (the one that is partially loaded into memory already). After repartitioning, we should check for extreme skew if the *parent* partition's space_exhausted flag was set, not the child partition's. The consequence was repeated futile repartitioning until per-partition data exceeded various limits including "ERROR: invalid DSA memory alloc request size 1811939328", OS allocation failure, or temporary disk space errors. (We could also do something about some of those symptoms, but that's material for separate patches.) This problem only became likely when PostgreSQL 16 introduced support for Parallel Hash Right/Full Join, allowing NULL keys into the hash table. Repartitioning always leaves NULL in partition 0, no matter how many times you do it, because the hash value is all zero bits. That's unlikely for other hashed values, but they might still have caused wasted extra effort before giving up. Back-patch to all supported releases. Reported-by: Craig Milhiser <craig@milhiser.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CA%2BwnhO1OfgXbmXgC4fv_uu%3DOxcDQuHvfoQ4k0DFeB0Qqd-X-rQ%40mail.gmail.com
* Further refine _SPI_execute_plan's rule for atomic execution.Tom Lane2024-10-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2dc1deaea turns out to have been still a brick shy of a load, because CALL statements executing within a plpgsql exception block could still pass the wrong snapshot to stable functions within the CALL's argument list. That happened because standard_ProcessUtility forces isAtomicContext to true if IsTransactionBlock is true, which it always will be inside a subtransaction. Then ExecuteCallStmt would think it does not need to push a new snapshot --- but _SPI_execute_plan didn't do so either, since it thought it was in nonatomic mode. The best fix for this seems to be for _SPI_execute_plan to operate in atomic execution mode if IsSubTransaction() is true, even when the SPI context as a whole is non-atomic. This makes _SPI_execute_plan have the same rules about when non-atomic execution is allowed as _SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed, which seems appropriately symmetric. (If anyone ever tries to allow COMMIT/ROLLBACK inside a subtransaction, this would all need to be rethought ... but I'm unconvinced that such a thing could be logically consistent at all.) For further consistency, also check IsSubTransaction() in SPI_inside_nonatomic_context. That does not matter for its one present-day caller StartTransaction, which can't be reached inside a subtransaction. But if any other callers ever arise, they'd presumably want this definition. Per bug #18656 from Alexander Alehin. Back-patch to all supported branches, like previous fixes in this area. Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org
* For inplace update durability, make heap_update() callers wait.Noah Misch2024-09-24
| | | | | | | | | | | | | | | | | | | The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
* Expand comments and add an assertion in nodeModifyTable.c.Noah Misch2024-06-27
| | | | | | | | | | | Most comments concern RELKIND_VIEW. One addresses the ExecUpdate() "tupleid" parameter. A later commit will rely on these facts, but they hold already. Back-patch to v12 (all supported versions), the plan for that commit. Reviewed (in an earlier version) by Robert Haas. Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
* Fix behavior of stable functions called from a CALL's argument list.Tom Lane2024-06-07
| | | | | | | | | | | | | | | | | | | | | | | | | | If the CALL is within an atomic context (e.g. there's an outer transaction block), _SPI_execute_plan should acquire a fresh snapshot to execute any such functions with. We failed to do that and instead passed them the Portal snapshot, which had been acquired at the start of the current SQL command. This'd lead to seeing stale values of rows modified since the start of the command. This is arguably a bug in 84f5c2908: I failed to see that "are we in non-atomic mode" needs to be defined the same way as it is further down in _SPI_execute_plan, i.e. check !_SPI_current->atomic not just options->allow_nonatomic. Alternatively the blame could be laid on plpgsql, which is unconditionally passing allow_nonatomic = true for CALL/DO even when it knows it's in an atomic context. However, fixing it in spi.c seems like a better idea since that will also fix the problem for any extensions that may have copied plpgsql's coding pattern. While here, update an obsolete comment about _SPI_execute_plan's snapshot management. Per report from Victor Yegorov. Back-patch to all supported versions. Discussion: https://postgr.es/m/CAGnEboiRe+fG2QxuBO2390F7P8e2MQ6UyBjZSL_w1Cej+E4=Vw@mail.gmail.com
* Ensure we allocate NAMEDATALEN bytes for names in Index Only ScansDavid Rowley2024-05-01
| | | | | | | | | | | | | | | As an optimization, we store "name" columns as cstrings in btree indexes. Here we modify it so that Index Only Scans convert these cstrings back to names with NAMEDATALEN bytes rather than storing the cstring in the tuple slot, as was happening previously. Bug: #17855 Reported-by: Alexander Lakhin Reviewed-by: Alexander Lakhin, Tom Lane Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org Backpatch-through: 12, all supported versions
* Fix unnecessary use of moving-aggregate mode with non-moving frame.Tom Lane2024-03-27
| | | | | | | | | | | | | | | | | | | | When a plain aggregate is used as a window function, and the window frame start is specified as UNBOUNDED PRECEDING, the frame's head cannot move so we do not need to use moving-aggregate mode. The check for that was put into initialize_peragg(), failing to notice that ExecInitWindowAgg() calls that function before it's filled in winstate->frameOptions. Since makeNode() would have zeroed the field, this didn't provoke uninitialized-value complaints, nor would the erroneous decision have resulted in more than a little inefficiency. Still, it's wrong, so move the initialization of winstate->frameOptions earlier to make it work properly. While here, also fix a thinko in a comment. Both errors crept in in commit a9d9acbf2 which introduced the moving-aggregate mode. Spotted by Vallimaharajan G. Back-patch to all supported branches. Discussion: https://postgr.es/m/18e7f2a5167.fe36253866818.977923893562469143@zohocorp.com
* Fix EXPLAIN Bitmap heap scan to count pages with no visible tuplesHeikki Linnakangas2024-03-18
| | | | | | | | | | | | | | | | | Previously, bitmap heap scans only counted lossy and exact pages for explain when there was at least one visible tuple on the page. heapam_scan_bitmap_next_block() returned true only if there was a "valid" page with tuples to be processed. However, the lossy and exact page counters in EXPLAIN should count the number of pages represented in a lossy or non-lossy way in the constructed bitmap, regardless of whether or not the pages ultimately contained visible tuples. Backpatch to all supported versions. Author: Melanie Plageman Discussion: https://www.postgresql.org/message-id/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAAKRu_bxrXeZ2rCnY8LyeC2Ls88KpjWrQ%2BopUrXDRXdcfwFZGA@mail.gmail.com
* Fix confusion about the return rowtype of SQL-language procedures.Tom Lane2024-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a very ancient hack in check_sql_fn_retval that allows a single SELECT targetlist entry of composite type to be taken as supplying all the output columns of a function returning composite. (This is grotty and fundamentally ambiguous, but it's really hard to do nested composite-returning functions without it.) As far as I know, that doesn't cause any problems in ordinary functions. It's disastrous for procedures however. All procedures that have any output parameters are labeled with prorettype RECORD, and the CALL code expects it will get back a record with one column per output parameter, regardless of whether any of those parameters is composite. Doing something else leads to an assertion failure or core dump. This is simple enough to fix: we just need to not apply that rule when considering procedures. However, that requires adding another argument to check_sql_fn_retval, which at least in principle might be getting called by external callers. Therefore, in the back branches convert check_sql_fn_retval into an ABI-preserving wrapper around a new function check_sql_fn_retval_ext. Per report from Yahor Yuzefovich. This has been broken since we implemented procedures, so back-patch to all supported branches. Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com
* Fix type-checking of RECORD-returning functions in FROM.Tom Lane2024-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the corner case where a function returning RECORD has been simplified to a RECORD constant or an inlined ROW() expression, ExecInitFunctionScan failed to cross-check the function's result rowtype against the coldeflist provided by the calling query. That happened because get_expr_result_type is able to extract a tupdesc from such expressions, which led ExecInitFunctionScan to ignore the coldeflist. (Instead, it used the extracted tupdesc to check the function's output, which of course always succeeds.) I have not been able to demonstrate any really serious consequences from this, because if some column of the result is of the wrong type and is directly referenced by a Var of the calling query, CheckVarSlotCompatibility will catch it. However, we definitely do fail to report the case where the function returns more columns than the coldeflist expects, and in the converse case where it returns fewer columns, we get an assert failure (but, seemingly, no worse results in non-assert builds). To fix, always build the expected tupdesc from the coldeflist if there is one, and consult get_expr_result_type only when there isn't one. Also remove the failing Assert, even though it is no longer reached after this fix. It doesn't seem to be adding anything useful, since later checking will deal with cases with the wrong number of columns. The only other place I could find that is doing something similar is inline_set_returning_function. There's no live bug there because we cannot be looking at a Const or RowExpr, but for consistency change that code to agree with ExecInitFunctionScan. Per report from PetSerAl. After some debate I've concluded that this should be back-patched. There is a small risk that somebody has been relying on such a case not throwing an error, but I judge this outweighed by the risk that I've missed some way in which the failure to cross-check has worse consequences than sketched above. Discussion: https://postgr.es/m/CAKygsHSerA1eXsJHR9wft3Gn3wfHQ5RfP8XHBzF70_qcrrRvEg@mail.gmail.com
* Fix indentation in ExecParallelHashIncreaseNumBatches()Alexander Korotkov2024-01-08
| | | | Backpatch-through: 12
* Fix oversized memory allocation in Parallel Hash JoinAlexander Korotkov2024-01-07
| | | | | | | | | | | | During the calculations of the maximum for the number of buckets, take into account that later we round that to the next power of 2. Reported-by: Karen Talarico Bug: #16925 Discussion: https://postgr.es/m/16925-ec96d83529d0d629%40postgresql.org Author: Thomas Munro, Andrei Lepikhov, Alexander Korotkov Reviewed-by: Alena Rybakina Backpatch-through: 12
* Fix intra-query memory leak when a SRF returns zero rows.Tom Lane2023-10-28
| | | | | | | | | | | | | | | | | | When looping around after finding that the set-returning function returned zero rows for the current input tuple, ExecProjectSet neglected to reset either of the two memory contexts it's responsible for cleaning out. Typically this wouldn't cause much problem, because once the SRF does return at least one row, the contexts would get reset on the next call. However, if the SRF returns no rows for many input tuples in succession, quite a lot of memory could be transiently consumed. To fix, make sure we reset both contexts while looping around. Per bug #18172 from Sergei Kornilov. Back-patch to all supported branches. Discussion: https://postgr.es/m/18172-9b8c5fc1d676ded3@postgresql.org
* Fix runtime partition pruning for HASH partitioned tablesDavid Rowley2023-10-13
| | | | | | | | | | | | | | | | | | | | | | | This could only affect HASH partitioned tables with at least 2 partition key columns. If partition pruning was delayed until execution and the query contained an IS NULL qual on one of the partitioned keys, and some subsequent partitioned key was being compared to a non-Const, then this could result in a crash due to the incorrect keyno being used to calculate the stateidx for the expression evaluation code. Here we fix this by properly skipping partitioned keys which have a nullkey set. Effectively, this must be the same as what's going on inside perform_pruning_base_step(). Sergei Glukhov also provided a patch, but that's not what's being used here. Reported-by: Sergei Glukhov Reviewed-by: tender wang, Sergei Glukhov Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru Backpatch-through: 11, where runtime partition pruning was added.
* Fix memory leak in Incremental Sort rescansTomas Vondra2023-07-02
| | | | | | | | | | | | | | | | | | | | | | | | The Incremental Sort had a couple issues, resulting in leaking memory during rescans, possibly triggering OOM. The code had a couple of related flaws: 1. During rescans, the sort states were reset but then also set to NULL (despite the comment saying otherwise). ExecIncrementalSort then sees NULL and initializes a new sort state, leaking the memory used by the old one. 2. Initializing the sort state also automatically rebuilt the info about presorted keys, leaking the already initialized info. presorted_keys was also unnecessarily reset to NULL. Patch by James Coleman, based on patches by Laurenz Albe and Tom Lane. Backpatch to 13, where Incremental Sort was introduced. Author: James Coleman, Laurenz Albe, Tom Lane Reported-by: Laurenz Albe, Zu-Ming Jiang Backpatch-through: 13 Discussion: https://postgr.es/m/b2bd02dff61af15e3526293e2771f874cf2a3be7.camel%40cybertec.at Discussion: https://postgr.es/m/db03c582-086d-e7cd-d4a1-3bc722f81765%40inf.ethz.ch
* Fix order of operations in ExecEvalFieldStoreDeForm().Tom Lane2023-06-29
| | | | | | | | | | | | | | | | | | | | | | | | | If the given composite datum is toasted out-of-line, DatumGetHeapTupleHeader will perform database accesses to detoast it. That can invalidate the result of get_cached_rowtype, as documented (perhaps not plainly enough) in that function's API spec; which leads to strange errors or crashes when we try to use the TupleDesc to read the tuple. In short then, trying to update a field of a composite column could fail intermittently if the overall column value is wide enough to require toasting. We can fix the bug at no cost by just changing the order of operations, since we don't need the TupleDesc until after detoasting. (Other callers of get_cached_rowtype appear to get this right already, so there's only one bug.) Note that the added regression test case reveals this bug reliably only with debug_discard_caches/CLOBBER_CACHE_ALWAYS. Per bug #17994 from Alexander Lakhin. Sadly, this patch does not fix the missing-values issue revealed in the bug discussion; we'll need some more work to cover that. Discussion: https://postgr.es/m/17994-5c7100b51b4790e9@postgresql.org
* Use per-tuple context in ExecGetAllUpdatedColsTomas Vondra2023-06-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fc22b6623b (generated columns) replaced ExecGetUpdatedCols() with ExecGetAllUpdatedCols() in a couple places handling UPDATE (triggers and lock mode). However, ExecGetUpdatedCols() did exec_rt_fetch() while ExecGetAllUpdatedCols() also allocates memory through bms_union() without paying attention to the memory context and happened to use the long-lived ExecutorState, leaking the memory until the end of the query. The amount of leaked memory is proportional to the number of (updated) attributes, types of UPDATE triggers, and the number of processed rows (which for UPDATE ... FROM ... may be much higher than updated rows). Fixed by switching to the per-tuple context in GetAllUpdatedColumns(). This is fine for all in-core callers, but external callers may need to copy the result. But we're not aware of any such callers. Note the issue was introduced by fc22b6623b, but the macros were later renamed by f50e888990. Backpatch to 12, where the issue was introduced. Reported-by: Tomas Vondra Reviewed-by: Andres Freund, Tom Lane, Jakub Wartak Backpatch-through: 12 Discussion: https://postgr.es/m/222a3442-7f7d-246c-ed9b-a76209d19239@enterprisedb.com
* Fix assignment to array of domain over composite, redux.Tom Lane2023-04-15
| | | | | | | | | | | | | | | | Commit 3e310d837 taught isAssignmentIndirectionExpr() to look through CoerceToDomain nodes. That's not sufficient, because since commit 04fe805a1 it's been possible for the planner to simplify CoerceToDomain to RelabelType when the domain has no constraints to enforce. So we need to look through RelabelType too. Per bug #17897 from Alexander Lakhin. Although 3e310d837 was back-patched to v11, it seems sufficient to apply this change to v12 and later, since 04fe805a1 came in in v12. Dmitry Dolgov Discussion: https://postgr.es/m/17897-4216c546c3874044@postgresql.org
* Fix oversights in array manipulation.Tom Lane2023-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/. While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it. Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice. Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches. Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
* Ignore generated columns during apply of update/delete.Amit Kapila2023-03-23
| | | | | | | | | | | | We fail to apply updates and deletes when the REPLICA IDENTITY FULL is used for the table having generated columns. We didn't use to ignore generated columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes. Author: Onder Kalaci Reviewed-by: Shi yu, Amit Kapila Backpatch-through: 12 Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
* Ignore dropped columns during apply of update/delete.Amit Kapila2023-03-21
| | | | | | | | | | | We fail to apply updates and deletes when the REPLICA IDENTITY FULL is used for the table having dropped columns. We didn't use to ignore dropped columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes. Author: Onder Kalaci, Shi yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
* Fix race in parallel hash join batch cleanup, take II.Thomas Munro2023-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE). An earlier attempt to fix this (commit 3b8981b6, later reverted) missed one special case. When the inner side is empty (the "empty inner optimization), the build barrier would only make it to PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up. Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault. Back-patch to all supported releases. Author: Thomas Munro <thomas.munro@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: David Geier <geidav.pg@gmail.com> Tested-by: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
* Fix some more cases of missed GENERATED-column updates.Tom Lane2023-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If UPDATE is forced to retry after an EvalPlanQual check, it neglected to repeat GENERATED-column computations, even though those might well have changed since we're dealing with a different tuple than before. Fixing this is mostly a matter of looping back a bit further when we retry. In v15 and HEAD that's most easily done by altering the API of ExecUpdateAct so that it includes computing GENERATED expressions. Also, if an UPDATE in a partitioned table turns into a cross-partition INSERT operation, we failed to recompute GENERATED columns. That's a bug since 8bf6ec3ba allowed partitions to have different generation expressions; although it seems to have no ill effects before that. Fixing this is messier because we can now have situations where the same query needs both the UPDATE-aligned set of GENERATED columns and the INSERT-aligned set, and it's unclear which set will be generated first (else we could hack things by forcing the INSERT-aligned set to be generated, which is indeed how fe9e658f4 made it work for MERGE). The best fix seems to be to build and store separate sets of expressions for the INSERT and UPDATE cases. That would create ABI issues in the back branches, but so far it seems we can leave this alone in the back branches. Per bug #17823 from Hisahiro Kauchi. The first part of this affects all branches back to v12 where GENERATED columns were added. Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
* Fix MULTIEXPR_SUBLINK with partitioned target tables, yet again.Tom Lane2023-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already tried to fix this in commits 3f7323cbb et al (and follow-on fixes), but now it emerges that there are still unfixed cases; moreover, these cases affect all branches not only pre-v14. I thought we had eliminated all cases of making multiple clones of an UPDATE's target list when we nuked inheritance_planner. But it turns out we still do that in some partitioned-UPDATE cases, notably including INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks it's okay to clone and modify the parent's targetlist. This fix is based on a suggestion from Andres Freund: let's stop abusing the ParamExecData.execPlan mechanism, which was only ever meant to handle initplans, and instead solve the execution timing problem by having the expression compiler move MULTIEXPR_SUBLINK steps to the front of their expression step lists. This is feasible because (a) all branches still in support compile the entire targetlist of an UPDATE into a single ExprState, and (b) we know that all MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried inside a CASE, for example. There is a minor semantics change concerning the order of execution of the MULTIEXPR's subquery versus other parts of the parent targetlist, but that seems like something we can get away with. By doing that, we no longer need to worry about whether different clones of a MULTIEXPR_SUBLINK share output Params; their usage of that data structure won't overlap. Per bug #17800 from Alexander Lakhin. Back-patch to all supported branches. In v13 and earlier, we can revert 3f7323cbb and follow-on fixes; however, I chose to keep the SubPlan.subLinkId field added in ccbb54c72. We don't need that anymore in the core code, but it's cheap enough to fill, and removing a plan node field in a minor release seems like it'd be asking for trouble. Andres Freund and Tom Lane Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org
* Add missing support for the latest SPI status codes.Dean Rasheed2023-02-22
| | | | | | | | | | | | | | | | | | | SPI_result_code_string() was missing support for SPI_OK_TD_REGISTER, and in v15 and later, it was missing support for SPI_OK_MERGE, as was pltcl_process_SPI_result(). The last of those would trigger an error if a MERGE was executed from PL/Tcl. The others seem fairly innocuous, but worth fixing. Back-patch to all supported branches. Before v15, this is just adding SPI_OK_TD_REGISTER to SPI_result_code_string(), which is unlikely to be seen by anyone, but seems worth doing for completeness. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUg8V%2BK%2BGcafOPqymxk84Y_prXgfe64PDoopjLFH6Z0Aw%40mail.gmail.com https://postgr.es/m/CAEZATCUMe%2B_KedPMM9AxKqm%3DSZogSxjUcrMe%2BsakusZh3BFcQw%40mail.gmail.com
* Disable WindowAgg inverse transitions when subplans are presentDavid Rowley2023-02-13
| | | | | | | | | | | | | | | | | | | | | | | | | When an aggregate function is used as a WindowFunc and a tuple transitions out of the window frame, we ordinarily try to make use of the aggregate function's inverse transition function to "unaggregate" the exiting tuple. This optimization is disabled for various cases, including when the aggregate contains a volatile function. In such a case we'd be unable to ensure that the transition value was calculated to the same value during transitions and inverse transitions. Unfortunately, we did this check by calling contain_volatile_functions() which does not recursively search SubPlans for volatile functions. If the aggregate function's arguments or its FILTER clause contained a subplan with volatile functions then we'd fail to notice this. Here we fix this by just disabling the optimization when the WindowFunc contains any subplans. Volatile functions are not the only reason that a subplan may have nonrepeatable results. Bug: #17777 Reported-by: Anban Company Discussion: https://postgr.es/m/17777-860b739b6efde977%40postgresql.org Reviewed-by: Tom Lane Backpatch-through: 11
* Make new GENERATED-expressions code more bulletproof.Tom Lane2023-01-15
| | | | | | | | | | | | | | | | | In commit 8bf6ec3ba I assumed that no code path could reach ExecGetExtraUpdatedCols without having gone through ExecInitStoredGenerated. That turns out not to be the case in logical replication: if there's an ON UPDATE trigger on the target table, trigger.c will call this code before anybody has set up its generated columns. Having seen that, I don't have a lot of faith in there not being other such paths. ExecGetExtraUpdatedCols can call ExecInitStoredGenerated for itself, as long as we are willing to assume that it is only called in CMD_UPDATE operations, which on the whole seems like a safer leap of faith. Per report from Vitaly Davydov. Discussion: https://postgr.es/m/d259d69652b8c2ff50e14cda3c236c7f@postgrespro.ru
* Fix calculation of which GENERATED columns need to be updated.Tom Lane2023-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were identifying the updatable generated columns of inheritance children by transposing the calculation made for their parent. However, there's nothing that says a traditional-inheritance child can't have generated columns that aren't there in its parent, or that have different dependencies than are in the parent's expression. (At present it seems that we don't enforce that for partitioning either, which is likely wrong to some degree or other; but the case clearly needs to be handled with traditional inheritance.) Hence, drop the very-klugy-anyway "extraUpdatedCols" RTE field in favor of identifying which generated columns depend on updated columns during executor startup. In HEAD we can remove extraUpdatedCols altogether; in back branches, it's still there but always empty. Another difference between the HEAD and back-branch versions of this patch is that in HEAD we can add the new bitmap field to ResultRelInfo, but that would cause an ABI break in back branches. Like 4b3e37993, add a List field at the end of struct EState instead. Back-patch to v13. The bogus calculation is also being made in v12, but it doesn't have the same visible effect because we don't use it to decide which generated columns to recalculate; as a consequence of which the patch doesn't apply easily. I think that there might still be a demonstrable bug associated with trigger firing conditions, but that's such a weird corner-case usage that I'm content to leave it unfixed in v12. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqFshLKNvQUd1DgwJ-7tsTp=dwv7KZqXC4j2wYBV1aCDUA@mail.gmail.com Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
* Avoid reference to nonexistent array element in ExecInitAgg().Tom Lane2023-01-02
| | | | | | | | | | | | | | | | When considering an empty grouping set, we fetched phasedata->eqfunctions[-1]. Because the eqfunctions array is palloc'd, that would always be an aset pointer in released versions, and thus the code accidentally failed to malfunction (since it would do nothing unless it found a null pointer). Nonetheless this seems like trouble waiting to happen, so add a check for length == 0. It's depressing that our valgrind testing did not catch this. Maybe we should reconsider the choice to not mark that word NOACCESS? Richard Guo Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com
* Fix copy-and-pasteo in comment.Etsuro Fujita2022-11-02
|
* Future-proof the recursion inside ExecShutdownNode().Tom Lane2022-09-19
| | | | | | | | | | | | | | | | | | | | | | | The API contract for planstate_tree_walker() callbacks is that they take a PlanState pointer and a context pointer. Somebody figured they could save a couple lines of code by ignoring that, and passing ExecShutdownNode itself as the walker even though it has but one argument. Somewhat remarkably, we've gotten away with that so far. However, it seems clear that the upcoming C2x standard means to forbid such cases, and compilers that actively break such code likely won't be far behind. So spend the extra few lines of code to do it honestly with a separate walker function. In HEAD, we might as well go further and remove ExecShutdownNode's useless return value. I left that as-is in back branches though, to forestall complaints about ABI breakage. Back-patch, with the thought that this might become of practical importance before our stable branches are all out of service. It doesn't seem to be fixing any live bug on any currently known platform, however. Discussion: https://postgr.es/m/208054.1663534665@sss.pgh.pa.us
* Repair rare failure of MULTIEXPR_SUBLINK subplans in inherited updates.Tom Lane2022-08-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to v14, if we have a MULTIEXPR SubPlan (that is, use of the syntax UPDATE ... SET (c1, ...) = (SELECT ...)) in an UPDATE with an inherited or partitioned target table, inheritance_planner() will clone the targetlist and therefore also the MULTIEXPR SubPlan and the Param nodes referencing it for each child target table. Up to now, we've allowed all the clones to share the underlying subplan as well as the output parameter IDs -- that is, the runtime ParamExecData slots. That technique is borrowed from the far older code that supports initplans, and it works okay in that case because the cloned SubPlan nodes are essentially identical. So it doesn't matter which one of the clones the shared ParamExecData.execPlan field might point to. However, this fails to hold for MULTIEXPR SubPlans, because they can have nonempty "args" lists (values to be passed into the subplan), and those lists could get mutated to different states in the various clones. In the submitted reproducer, as well as the test case added here, one clone contains Vars with varno OUTER_VAR where another has INNER_VAR, because the child tables are respectively on the outer or inner side of the join. Sharing the execPlan pointer can result in trying to evaluate an args list that doesn't match the local execution state, with mayhem ensuing. The result often is to trigger consistency checks in the executor, but I believe this could end in a crash or incorrect updates. To fix, assign new Param IDs to each of the cloned SubPlans, so that they don't share ParamExecData slots at runtime. It still seems fine for the clones to share the underlying subplan, and extra ParamExecData slots are cheap enough that this fix shouldn't cost much. This has been busted since we invented MULTIEXPR SubPlans in 9.5. Probably the lack of previous reports is because query plans in which the different clones of a MULTIEXPR mutate to effectively-different states are pretty rare. There's no issue in v14 and later, because without inheritance_planner() there's never a reason to clone MULTIEXPR SubPlans. Per bug #17596 from Andre Lin. Patch v10-v13 only. Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
* Fix replica identity check for a partitioned table.Amit Kapila2022-08-16
| | | | | | | | | | | | | | The current publisher code checks if UPDATE or DELETE can be executed with the replica identity of the table even if it's a partitioned table. We can skip checking the replica identity for partitioned tables because the operations are actually performed on the leaf partitions (not the partitioned table). Reported-by: Brad Nicholson Author: Hou Zhijie Reviewed-by: Peter Smith, Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CAMMnM%3D8i5DohH%3DYKzV0_wYuYSYvuOJoL9F5nzXTc%2ByzsG1f6rg%40mail.gmail.com
* Avoid misbehavior when hash_table_bytes < bucket_size.Tom Lane2022-08-13
| | | | | | | | | | | | | | | It's possible to reach this case when work_mem is very small and tupsize is (relatively) very large. In that case ExecChooseHashTableSize would get an assertion failure, or with asserts off it'd compute nbuckets = 0, which'd likely cause misbehavior later (I've not checked). To fix, clamp the number of buckets to be at least 1. This is due to faulty conversion of old my_log2() coding in 28d936031. Back-patch to v13, as that was. Zhang Mingli Discussion: https://postgr.es/m/beb64ca0-91e2-44ac-bf4a-7ea36275ec02@Spark
* Fix handling of R/W expanded datums that are passed to SQL functions.Tom Lane2022-08-10
| | | | | | | | | | | | | | | | | | | fmgr_sql must make expanded-datum arguments read-only, because it's possible that the function body will pass the argument to more than one callee function. If one of those functions takes the datum's R/W property as license to scribble on it, then later callees will see an unexpected value, leading to wrong answers. From a performance standpoint, it'd be nice to skip this in the common case that the argument value is passed to only one callee. However, detecting that seems fairly hard, and certainly not something that I care to attempt in a back-patched bug fix. Per report from Adam Mackler. This has been broken since we invented expanded datums, so back-patch to all supported branches. Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
* Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.Tom Lane2022-08-04
| | | | | | | | | | | | | Ordinarily the functions called in this loop ought to have plenty of CFIs themselves; but we've now seen a case where no such CFI is reached, making the loop uninterruptible. Even though that's from a recently-introduced bug, it seems prudent to install a CFI at the loop level in all branches. Per discussion of bug #17558 from Andrew Kesper (an actual fix for that bug will follow). Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
* Re-add SPICleanup for ABI compatibility in stable branchPeter Eisentraut2022-07-18
| | | | | | | | This fixes an ABI break introduced by cfc86f987349372dbbfc0391f9f519c0a7b27b84. Author: Markus Wanner <markus.wanner@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com
* Fix SPI's handling of errors during transaction commit.Tom Lane2022-06-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. This is a back-patch of commit 2e517818f. That's now aged long enough to reduce the concerns about whether it will break something, and we do need to ensure that supported branches will work with Python 3.11. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
* Un-break whole-row Vars referencing domain-over-composite types.Tom Lane2022-06-10
| | | | | | | | | | | | | | | | | In commit ec62cb0aa, I foolishly replaced ExecEvalWholeRowVar's lookup_rowtype_tupdesc_domain call with just lookup_rowtype_tupdesc, because I didn't see how a domain could be involved there, and there were no regression test cases to jog my memory. But the existing code was correct, so revert that change and add a test case showing why it's necessary. (Note: per comment in struct DatumTupleFields, it is correct to produce an output tuple that's labeled with the base composite type, not the domain; hence just blindly looking through the domain is correct here.) Per bug #17515 from Dan Kubb. Back-patch to v11 where domains over composites became a thing. Discussion: https://postgr.es/m/17515-a24737438363aca0@postgresql.org
* Revert applying column aliases to the output of whole-row Vars.Tom Lane2022-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit bf7ca1587, I had the bright idea that we could make the result of a whole-row Var (that is, foo.*) track any column aliases that had been applied to the FROM entry the Var refers to. However, that's not terribly logically consistent, because now the output of the Var is no longer of the named composite type that the Var claims to emit. bf7ca1587 tried to handle that by changing the output tuple values to be labeled with a blessed RECORD type, but that's really pretty disastrous: we can wind up storing such tuples onto disk, whereupon they're not readable by other sessions. The only practical fix I can see is to give up on what bf7ca1587 tried to do, and say that the column names of tuples produced by a whole-row Var are always those of the underlying named composite type, query aliases or no. While this introduces some inconsistencies, it removes others, so it's not that awful in the abstract. What *is* kind of awful is to make such a behavioral change in a back-patched bug fix. But corrupt data is worse, so back-patched it will be. (A workaround available to anyone who's unhappy about this is to introduce an extra level of sub-SELECT, so that the whole-row Var is referring to the sub-SELECT's output and not to a named table type. Then the Var is of type RECORD to begin with and there's no issue.) Per report from Miles Delahunty. The faulty commit dates to 9.5, so back-patch to all supported branches. Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us