aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor
Commit message (Collapse)AuthorAge
* Remove Value node structPeter Eisentraut2021-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Value node struct is a weird construct. It is its own node type, but most of the time, it actually has a node type of Integer, Float, String, or BitString. As a consequence, the struct name and the node type don't match most of the time, and so it has to be treated specially a lot. There doesn't seem to be any value in the special construct. There is very little code that wants to accept all Value variants but nothing else (and even if it did, this doesn't provide any convenient way to check it), and most code wants either just one particular node type (usually String), or it accepts a broader set of node types besides just Value. This change removes the Value struct and node type and replaces them by separate Integer, Float, String, and BitString node types that are proper node types and structs of their own and behave mostly like normal node types. Also, this removes the T_Null node tag, which was previously also a possible variant of Value but wasn't actually used outside of the Value contained in A_Const. Replace that by an isnull field in A_Const. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
* Clean up some code using "(expr) ? true : false"Michael Paquier2021-09-08
| | | | | | | | | | All the code paths simplified here were already using a boolean or used an expression that led to zero or one, making the extra bits unnecessary. Author: Justin Pryzby Reviewed-by: Tom Lane, Michael Paquier, Peter Smith Discussion: https://postgr.es/m/20210428182936.GE27406@telsasoft.com
* Fix missing words in comment.Heikki Linnakangas2021-09-07
| | | | | | | Introduced by commit c3928b467a, backpatch to v14 like that one. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA+HiwqFQgNLS6VGntMcuJV6erBFV425xA6wBVnY=41GK4zC0Bw@mail.gmail.com
* Improve error messages about misuse of SELECT INTO.Tom Lane2021-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | Improve two places in plpgsql, and one in spi.c, where an error message would confusingly tell you that you couldn't use a SELECT query, when what you had written *was* a SELECT query. The actual problem is that you can't use SELECT ... INTO in these contexts, but the messages failed to make that apparent. Special-case SELECT INTO to make these errors more helpful. Also, fix the same spots in plpgsql, as well as several messages in exec_eval_expr(), to not quote the entire complained-of query or expression in the primary error message. That behavior very easily led to violating our message style guideline about keeping the primary error message short and single-line. Also, since the important part of the message was after the inserted text, it could make the real problem very hard to see. We can report the query or expression as the first line of errcontext instead. Per complaint from Roger Mason. Back-patch to v14, since (a) some of these messages are new in v14 and (b) v14's translatable strings are still somewhat in flux. The problem's older than that of course, but I'm hesitant to change the behavior further back. Discussion: https://postgr.es/m/1914708.1629474624@sss.pgh.pa.us
* Use appropriate tuple descriptor in FDW batchingTomas Vondra2021-08-12
| | | | | | | | | | | | | | | The FDW batching code was using the same tuple descriptor both for all slots (regular and plan slots), but that's incorrect - the subplan may use a different descriptor. Currently this is benign, because batching is used only for INSERTs, and in that case the descriptors always match. But that would change if we allow batching UPDATEs. Fix by copying the appropriate tuple descriptor. Backpatch to 14, where the FDW batching was implemented. Author: Amit Langote Backpatch-through: 14, where FDW batching was added Discussion: https://postgr.es/m/CA%2BHiwqEWd5B0-e-RvixGGUrNvGkjH2s4m95%3DJcwUnyV%3Df0rAKQ%40mail.gmail.com
* Fix segfault during EvalPlanQual with mix of local and foreign partitions.Heikki Linnakangas2021-08-12
| | | | | | | | | | | | | | | | | | | | | It's not sensible to re-evaluate a direct-modify Foreign Update or Delete during EvalPlanQual. However, ExecInitForeignScan() can still get called if a table mixes local and foreign partitions. EvalPlanQualStart() left the es_result_relations array uninitialized in the child EPQ EState, but ExecInitForeignScan() still expected to find it. That caused a segfault. Fix by skipping the es_result_relations lookup during EvalPlanQual processing. To make things a bit more robust, also skip the BeginDirectModify calls, and add a runtime check that ExecForeignScan() is not called on direct-modify foreign scans during EvalPlanQual processing. This is new in v14, commit 1375422c782. Before that, EvalPlanQualStart() copied the whole ResultRelInfo array to the EPQ EState. Backpatch to v14. Report and diagnosis by Andrey Lepikhov. Discussion: https://www.postgresql.org/message-id/cb2b808d-cbaa-4772-76ee-c8809bafcf3d%40postgrespro.ru
* Change SeqScan node to contain Scan nodePeter Eisentraut2021-08-08
| | | | | | | This makes the structure of all Scan-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
* Fix oversight in commit 1ec7fca8592178281cd5cdada0f27a340fb813fc.Etsuro Fujita2021-08-02
| | | | | | | | | | | | | | | | | I failed to account for the possibility that when ExecAppendAsyncEventWait() notifies multiple async-capable nodes using postgres_fdw, a preceding node might invoke process_pending_request() to process a pending asynchronous request made by a succeeding node. In that case the succeeding node should produce a tuple to return to the parent Append node from tuples fetched by process_pending_request() when notified. Repair. Per buildfarm via Michael Paquier. Back-patch to v14, like the previous commit. Thanks to Tom Lane for testing. Discussion: https://postgr.es/m/YQP0UPT8KmPiHTMs%40paquier.xyz
* postgres_fdw: Fix handling of pending asynchronous requests.Etsuro Fujita2021-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A pending asynchronous request is handled by process_pending_request(), which previously not only processed an in-progress remote query but performed ExecForeignScan() to produce a tuple to return to the local server asynchronously from the result of the remote query. But that led to a server crash when executing a query or led to an "InstrStartNode called twice in a row" or "InstrEndLoop called on running node" failure when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it contained multiple async-capable nodes accessing the same initplan/subplan that contained multiple async-capable nodes scanning the same foreign tables as for the parent async-capable nodes, as reported by Andrey Lepikhov. The reason is that the second step in process_pending_request() invoked when executing the initplan/subplan for one of the parent async-capable nodes caused recursive execution of the initplan/subplan for another of the parent async-capable nodes. To fix, split process_pending_request() into the two steps and postpone the second step until ForeignAsyncConfigureWait() is called for each of the pending asynchronous requests. Also, in ExecAppendAsyncEventWait() we assumed that FDWs would register at least one wait event in a WaitEventSet created there when they were called from ForeignAsyncConfigureWait() in that function, but allow FDWs to register zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait() to just return in that case. Oversight in commit 27e1f1456. Back-patch to v14 where that commit went in. Andrey Lepikhov and Etsuro Fujita Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru
* Get rid of artificial restriction on hash table sizes on Windows.Tom Lane2021-07-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The point of introducing the hash_mem_multiplier GUC was to let users reproduce the old behavior of hash aggregation, i.e. that it could use more than work_mem at need. However, the implementation failed to get the job done on Win64, where work_mem is clamped to 2GB to protect various places that calculate memory sizes using "long int". As written, the same clamp was applied to hash_mem. This resulted in severe performance regressions for queries requiring a bit more than 2GB for hash aggregation, as they now spill to disk and there's no way to stop that. Getting rid of the work_mem restriction seems like a good idea, but it's a big job and could not conceivably be back-patched. However, there's only a fairly small number of places that are concerned with the hash_mem value, and it turns out to be possible to remove the restriction there without too much code churn or any ABI breaks. So, let's do that for now to fix the regression, and leave the larger task for another day. This patch does introduce a bit more infrastructure that should help with the larger task, namely pg_bitutils.h support for working with size_t values. Per gripe from Laurent Hasson. Back-patch to v13 where the behavior change came in. Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com
* Make nodeSort.c use Datum sorts for single column sortsDavid Rowley2021-07-22
| | | | | | | | | | | | | | | Datum sorts can be significantly faster than tuple sorts, especially when the data type being sorted is a pass-by-value type. Something in the region of 50-70% performance improvements appear to be possible. Just in case there's any confusion; the Datum sort is only used when the targetlist of the Sort node contains a single column, not when there's a single column in the sort key and multiple items in the target list. Author: Ronan Dunklau Reviewed-by: James Coleman, David Rowley, Ranier Vilela, Hou Zhijie Tested-by: John Naylor Discussion: https://postgr.es/m/3177670.itZtoPt7T5@aivenronan
* Rename NodeTag of ExprStatePeter Eisentraut2021-07-21
| | | | | | Rename from tag to type, for consistency with all other node structs. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com
* More improvements of error messages about mismatching relkindPeter Eisentraut2021-07-21
| | | | | | | | | | | Follow-up to 2ed532ee8c474e9767e76e1f3251cc3a0224358c, a few error messages in the logical replication area currently only deal with tables, but if we're anticipating more relkinds such as sequences being handled, then these messages also fall into the category affected by the previous patch, so adjust them too. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/c9ba5c6a-4bd5-e12c-1b3c-edbcaedbf392@enterprisedb.com
* Use l*_node() family of functions where appropriatePeter Eisentraut2021-07-19
| | | | | | | Instead of castNode(…, lfoo(…)) Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/87eecahraj.fsf@wibble.ilmari.org
* Change the name of the Result Cache node to MemoizeDavid Rowley2021-07-14
| | | | | | | | | | | "Result Cache" was never a great name for this node, but nobody managed to come up with another name that anyone liked enough. That was until David Johnston mentioned "Node Memoization", which Tom Lane revised to just "Memoize". People seem to like "Memoize", so let's do the rename. Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/20210708165145.GG1176@momjian.us Backpatch-through: 14, where Result Cache was introduced
* Use a hash table to speed up NOT IN(values)David Rowley2021-07-07
| | | | | | | | | | | | | | | | | | | | | Similar to 50e17ad28, which allowed hash tables to be used for IN clauses with a set of constants, here we add the same feature for NOT IN clauses. NOT IN evaluates the same as: WHERE a <> v1 AND a <> v2 AND a <> v3. Obviously, if we're using a hash table we must be exactly equivalent to that and return the same result taking into account that either side of the condition could contain a NULL. This requires a little bit of special handling to make work with the hash table version. When processing NOT IN, the ScalarArrayOpExpr's operator will be the <> operator. To be able to build and lookup a hash table we must use the <>'s negator operator. The planner checks if that exists and is hashable and sets the relevant fields in ScalarArrayOpExpr to instruct the executor to use hashing. Author: David Rowley, James Coleman Reviewed-by: James Coleman, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvoF1mum_FRk6D621edcB6KSHBi2+GAgWmioj5AhOu2vwQ@mail.gmail.com
* Allow CustomScan providers to say whether they support projections.Tom Lane2021-07-06
| | | | | | | | | | | | | | | | | Previously, all CustomScan providers had to support projections, but there may be cases where this is inconvenient. Add a flag bit to say if it's supported. Important item for the release notes: this is non-backwards-compatible since the default is now to assume that CustomScan providers can't project, instead of assuming that they can. It's fail-soft, but could result in visible performance penalties due to adding unnecessary Result nodes. Sven Klemm, reviewed by Aleksander Alekseev; some cosmetic fiddling by me. Discussion: https://postgr.es/m/CAMCrgp1kyakOz6c8aKhNDJXjhQ1dEjEnp+6KNT3KxPrjNtsrDg@mail.gmail.com
* Cleanup some aggregate code in the executorDavid Rowley2021-07-04
| | | | | | | | | | | | | | | | | | | | | Here we alter the code that calls build_pertrans_for_aggref() so that the function no longer needs to special-case whether it's dealing with an aggtransfn or an aggcombinefn. This allows us to reuse the build_aggregate_transfn_expr() function and just get rid of the build_aggregate_combinefn_expr() completely. All of the special case code that was in build_pertrans_for_aggref() has been moved up to the calling functions. This saves about a dozen lines of code in nodeAgg.c and a few dozen more in parse_agg.c Also, rename a few variables in nodeAgg.c to try to make it more clear that we're working with either a aggtransfn or an aggcombinefn. Some of the old names would have you believe that we were always working with an aggtransfn. Discussion: https://postgr.es/m/CAApHDvptMQ9FmF0D67zC_w88yVnoNVR2+kkOQGUrCmdxWxLULQ@mail.gmail.com
* Pre branch pgindent / pgperltidy runAndrew Dunstan2021-06-28
| | | | | Along the way make a slight adjustment to src/include/utils/queryjumble.h to avoid an unused typedef.
* Centralize the logic for protective copying of utility statements.Tom Lane2021-06-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the "simple Query" code path, it's fine for parse analysis or execution of a utility statement to scribble on the statement's node tree, since that'll just be thrown away afterwards. However it's not fine if the node tree is in the plan cache, as then it'd be corrupted for subsequent executions. Up to now we've dealt with that by having individual utility-statement functions apply copyObject() if they were going to modify the tree. But that's prone to errors of omission. Bug #17053 from Charles Samborski shows that CREATE/ALTER DOMAIN didn't get this memo, and can crash if executed repeatedly from plan cache. In the back branches, we'll just apply a narrow band-aid for that, but in HEAD it seems prudent to have a more principled fix that will close off the possibility of other similar bugs in future. Hence, let's hoist the responsibility for doing copyObject up into ProcessUtility from its children, thus ensuring that it happens for all utility statement types. Also, modify ProcessUtility's API so that its callers can tell it whether a copy step is necessary. It turns out that in all cases, the immediate caller knows whether the node tree is transient, so this doesn't involve a huge amount of code thrashing. In this way, while we lose a little bit in the execute-from-cache code path due to sometimes copying node trees that wouldn't be mutated anyway, we gain something in the simple-Query code path by not copying throwaway node trees. Statements that are complex enough to be expensive to copy are almost certainly ones that would have to be copied anyway, so the loss in the cache code path shouldn't be much. (Note that this whole problem applies only to utility statements. Optimizable statements don't have the issue because we long ago made the executor treat Plan trees as read-only. Perhaps someday we will make utility statement execution act likewise, but I'm not holding my breath.) Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
* Fix copying data into slots with FDW batchingTomas Vondra2021-06-16
| | | | | | | | | | | | | | | | Commit b676ac443b optimized handling of tuple slots with bulk inserts into foreign tables, so that the slots are initialized only once and reused for all batches. The data was however copied into the slots only after the initialization, inserting duplicate values when the slot gets reused. Fixed by moving the ExecCopySlot outside the init branch. The existing postgres_fdw tests failed to catch this due to inserting data into foreign tables without unique indexes, and then checking only the number of inserted rows. This adds a new test with both a unique index and a check of inserted values. Reported-by: Alexander Pyhalov Discussion: https://postgr.es/m/7a8cf8d56b3d18e5c0bccd6cd42d04ac%40postgrespro.ru
* Optimize creation of slots for FDW bulk insertsTomas Vondra2021-06-11
| | | | | | | | | | | | | | | | | | | | | | Commit b663a41363 introduced bulk inserts for FDW, but the handling of tuple slots turned out to be problematic for two reasons. Firstly, the slots were re-created for each individual batch. Secondly, all slots referenced the same tuple descriptor - with reasonably small batches this is not an issue, but with large batches this triggers O(N^2) behavior in the resource owner code. These two issues work against each other - to reduce the number of times a slot has to be created/dropped, larger batches are needed. However, the larger the batch, the more expensive the resource owner gets. For practical batch sizes (100 - 1000) this would not be a big problem, as the benefits (latency savings) greatly exceed the resource owner costs. But for extremely large batches it might be much worse, possibly even losing with non-batching mode. Fixed by initializing tuple slots only once (and reusing them across batches) and by using a new tuple descriptor copy for each slot. Discussion: https://postgr.es/m/ebbbcc7d-4286-8c28-0272-61b4753af761%40enterprisedb.com
* Use the correct article for abbreviationsDavid Rowley2021-06-11
| | | | | | | | | | | | | | | | | | | | | | | We've accumulated quite a mix of instances of "an SQL" and "a SQL" in the documents. It would be good to be a bit more consistent with these. The most recent version of the SQL standard I looked at seems to prefer "an SQL". That seems like a good lead to follow, so here we change all instances of "a SQL" to become "an SQL". Most instances correctly use "an SQL" already, so it also makes sense to use the dominant variation in order to minimise churn. Additionally, there were some other abbreviations that needed to be adjusted. FSM, SSPI, SRF and a few others. Also fix some pronounceable, abbreviations to use "a" instead of "an". For example, "a SASL" instead of "an SASL". Here I've only adjusted the documents and error messages. Many others still exist in source code comments. Translator hint comments seem to be the biggest culprit. It currently does not seem worth the churn to change these. Discussion: https://postgr.es/m/CAApHDvpML27UqFXnrYO1MJddsKVMQoiZisPvsAGhKE_tsKXquw%40mail.gmail.com
* Reconsider the handling of procedure OUT parameters.Tom Lane2021-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2453ea142 redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
* Shut down EvalPlanQual machinery when LockRows node reaches the end.Tom Lane2021-06-10
| | | | | | | | | | | | | | | | | | | | | Previously, we left the EPQ sub-executor alone until ExecEndLockRows. This caused any buffer pins or other resources that it might hold to remain held until ExecutorEnd, which in some code paths means that they are held till the Portal is closed. That can cause user-visible problems, such as blocking VACUUM; and it's unlike the behavior of ordinary table-scanning nodes, which will have released all buffer pins by the time they return an EOF indication. We can make LockRows work more like other plan nodes by calling EvalPlanQualEnd just before returning NULL. We still need to call it in ExecEndLockRows in case the node was not run to completion, but in the normal case the second call does nothing and costs little. Per report from Yura Sokolov. In principle this is a longstanding bug, but in view of the lack of other complaints and the low severity of the consequences, I chose not to back-patch. Discussion: https://postgr.es/m/4aa370cb91ecf2f9885d98b80ad1109c@postgrespro.ru
* Fix rescanning of async-aware Append nodes.Etsuro Fujita2021-06-07
| | | | | | | | | | | | | | | | | | | | | In cases where run-time pruning isn't required, the synchronous and asynchronous subplans for an async-aware Append node determined using classify_matching_subplans() should be re-used when rescanning the node, but the previous code re-determined them using that function repeatedly each time when rescanning the node, leading to incorrect results in a normal build and an Assert failure in an Assert-enabled build as that function doesn't assume that it's called repeatedly in such cases. Fix the code as mentioned above. My oversight in commit 27e1f1456. While at it, initialize async-related pointers/variables to NULL/zero explicitly in ExecInitAppend() and ExecReScanAppend(), just to be sure. (The variables would have been set to zero before we get to the latter function, but let's do so.) Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/CAPmGK16Q4B2_KY%2BJH7rb7wQbw54AUprp7TMekGTd2T1B62yysQ%40mail.gmail.com
* Fix planner's use of Result Cache with unique joinsDavid Rowley2021-05-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the planner considered using a Result Cache node to cache results from the inner side of a Nested Loop Join, it failed to consider that the inner path's parameterization may not be the entire join condition. If the join was marked as inner_unique then we may accidentally put the cache in singlerow mode. This meant that entries would be marked as complete after caching the first row. That was wrong as if only part of the join condition was parameterized then the uniqueness of the unique join was not guaranteed at the Result Cache's level. The uniqueness is only guaranteed after Nested Loop applies the join filter. If subsequent rows were found, this would lead to: ERROR: cache entry already complete This could have been fixed by only putting the cache in singlerow mode if the entire join condition was parameterized. However, Nested Loop will only read its inner side so far as the first matching row when the join is unique, so that might mean we never get an opportunity to mark cache entries as complete. Since non-complete cache entries are useless for subsequent lookups, we just don't bother considering a Result Cache path in this case. In passing, remove the XXX comment that claimed the above ERROR might be better suited to be an Assert. After there being an actual case which triggered it, it seems better to keep it an ERROR. Reported-by: David Christensen Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com
* Fix usage of "tableoid" in GENERATED expressions.Tom Lane2021-05-21
| | | | | | | | | | | | | | | We consider this supported (though I've got my doubts that it's a good idea, because tableoid is not immutable). However, several code paths failed to fill the field in soon enough, causing such a GENERATED expression to see zero or the wrong value. This occurred when ALTER TABLE adds a new GENERATED column to a table with existing rows, and during regular INSERT or UPDATE on a foreign table with GENERATED columns. Noted during investigation of a report from Vitaly Ustinov. Back-patch to v12 where GENERATED came in. Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
* Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.Tom Lane2021-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit 2ecfeda3e into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
* Convert misleading while loop into an if conditionDavid Rowley2021-05-14
| | | | | | | | | | | | | | This seems to be leftover from ea15e1867 and from when we used to evaluate SRFs at each node. Since there is an unconditional "return" at the end of the loop body, only 1 loop is ever possible, so we can just change this into an if condition. There is no actual bug being fixed here so no back-patch. It seems fine to just fix this anomaly in master only. Author: Greg Nancarrow Discussion: https://postgr.es/m/CAJcOf-d7T1q0az-D8evWXnsuBZjigT04WkV5hCAOEJQZRWy28w@mail.gmail.com
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Fix EXPLAIN ANALYZE for async-capable nodes.Etsuro Fujita2021-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | EXPLAIN ANALYZE for an async-capable ForeignScan node associated with postgres_fdw is done just by using instrumentation for ExecProcNode() called from the node's callbacks, causing the following problems: 1) If the remote table to scan is empty, the node is incorrectly considered as "never executed" by the command even if the node is executed, as ExecProcNode() isn't called from the node's callbacks at all in that case. 2) The command fails to collect timings for things other than ExecProcNode() done in the node, such as creating a cursor for the node's remote query. To fix these problems, add instrumentation for async-capable nodes, and modify postgres_fdw accordingly. My oversight in commit 27e1f1456. While at it, update a comment for the AsyncRequest struct in execnodes.h and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix typos in comments in nodeAppend.c. Per report from Andrey Lepikhov, though I didn't use his patch. Reviewed-by: Andrey Lepikhov Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
* Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists.Tom Lane2021-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present. If it happens, the ON CONFLICT UPDATE code path would end up storing tuples that include the values of the extra resjunk columns. That's fairly harmless in the short run, but if new columns are added to the table then the values would become accessible, possibly leading to malfunctions if they don't match the datatypes of the new columns. This had escaped notice through a confluence of missing sanity checks, including * There's no cross-check that a tuple presented to heap_insert or heap_update matches the table rowtype. While it's difficult to check that fully at reasonable cost, we can easily add assertions that there aren't too many columns. * The output-column-assignment cases in execExprInterp.c lacked any sanity checks on the output column numbers, which seems like an oversight considering there are plenty of assertion checks on input column numbers. Add assertions there too. * We failed to apply nodeModifyTable's ExecCheckPlanOutput() to the ON CONFLICT UPDATE tlist. That wouldn't have caught this specific error, since that function is chartered to ignore resjunk columns; but it sure seems like a bad omission now that we've seen this bug. In HEAD, the right way to fix this is to make the processing of ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists now do, that is don't add "SET x = x" entries, and use ExecBuildUpdateProjection to evaluate the tlist and combine it with old values of the not-set columns. This adds a little complication to ExecBuildUpdateProjection, but allows removal of a comparable amount of now-dead code from the planner. In the back branches, the most expedient solution seems to be to (a) use an output slot for the ON CONFLICT UPDATE projection that actually matches the target table, and then (b) invent a variant of ExecBuildProjectionInfo that can be told to not store values resulting from resjunk columns, so it doesn't try to store into nonexistent columns of the output slot. (We can't simply ignore the resjunk columns altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.) This works back to v10. In 9.6, projections work much differently and we can't cheaply give them such an option. The 9.6 version of this patch works by inserting a JunkFilter when it's necessary to get rid of resjunk columns. In addition, v11 and up have the reverse problem when trying to perform ON CONFLICT UPDATE on a partitioned table. Through a further oversight, adjust_partition_tlist() discarded resjunk columns when re-ordering the ON CONFLICT UPDATE tlist to match a partition. This accidentally prevented the storing-bogus-tuples problem, but at the cost that MULTIEXPR_SUBLINK cases didn't work, typically crashing if more than one row has to be updated. Fix by preserving resjunk columns in that routine. (I failed to resist the temptation to add more assertions there too, and to do some minor code beautification.) Per report from Andres Freund. Back-patch to all supported branches. Security: CVE-2021-32028
* Prevent integer overflows in array subscripting calculations.Tom Lane2021-05-10
| | | | | | | | | | | | | | | | | | | | | | | | While we were (mostly) careful about ensuring that the dimensions of arrays aren't large enough to cause integer overflow, the lower bound values were generally not checked. This allows situations where lower_bound + dimension overflows an integer. It seems that that's harmless so far as array reading is concerned, except that array elements with subscripts notionally exceeding INT_MAX are inaccessible. However, it confuses various array-assignment logic, resulting in a potential for memory stomps. Fix by adding checks that array lower bounds aren't large enough to cause lower_bound + dimension to overflow. (Note: this results in disallowing cases where the last subscript position would be exactly INT_MAX. In principle we could probably allow that, but there's a lot of code that computes lower_bound + dimension and would need adjustment. It seems doubtful that it's worth the trouble/risk to allow it.) Somewhat independently of that, array_set_element() was careless about possible overflow when checking the subscript of a fixed-length array, creating a different route to memory stomps. Fix that too. Security: CVE-2021-32027
* Move memory accounting Asserts for Result Cache codeDavid Rowley2021-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 9eacee2e6, I included some code to verify the cache's memory tracking is correct by counting up the number of entries and the memory they use each time we evict something from the cache. Those values are then compared to the expected values using Assert. The problem is that this requires looping over the entire cache hash table each time we evict an entry from the cache. That can be pretty expensive, as noted by Pavel Stehule. Here we move this memory accounting checking code so that we only verify it on cassert builds once when shutting down the Result Cache node. Aside from the performance increase, this has two distinct advantages: 1) We do the memory checks at the last possible moment before destroying the cache. This means we'll now catch accounting problems that might sneak in after a cache eviction. 2) We now do the memory Assert checks when there were no cache evictions. This increases the coverage. One small disadvantage is that we'll now miss any memory tracking issues that somehow managed to resolve themselves by the end of execution. However, it seems to me that such a memory tracking problem would be quite unlikely, and likely somewhat less harmful if one were to exist. In passing, adjust the loop over the hash table to use the standard simplehash.h method of iteration. Reported-by: Pavel Stehule Discussion: https://postgr.es/m/CAFj8pRAzgoSkdEiqrKbT=7yG9FA5fjUAP3jmJywuDqYq6Ki5ug@mail.gmail.com
* Fix come comments in execMain.cMichael Paquier2021-04-24
| | | | | | | | | 1375422 has refactored this area of the executor code, and some comments went out-of-sync. Author: Yukun Wang Reviewed-by: Amul Sul Discussion: https://postgr.es/m/OS0PR01MB60033394FCAEF79B98F078F5B4459@OS0PR01MB6003.jpnprd01.prod.outlook.com
* Minor code cleanup in asynchronous execution support.Etsuro Fujita2021-04-23
| | | | | | | | | | | | | | | This is cleanup for commit 27e1f1456: * ExecAppendAsyncEventWait(), which was modified a bit further by commit a8af856d3, duplicated the same nevents calculation. Simplify the code a little bit to avoid the duplication. Update comments there. * Add an assertion to ExecAppendAsyncRequest(). * Update a comment about merging the async_capable options from input relations in merge_fdw_options(), per complaint from Kyotaro Horiguchi. * Add a comment for fetch_more_data_begin(). Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK1637W30Wx3MnrReewhafn6F_0J76mrJGoFXFnpPq4QfvA%40mail.gmail.com
* Don't crash on reference to an un-available system column.Tom Lane2021-04-22
| | | | | | | | | | | | | | | | Adopt a more consistent policy about what slot-type-specific getsysattr functions should do when system attributes are not available. To wit, they should all throw the same user-oriented error, rather than variously crashing or emitting developer-oriented messages. This closes a identifiable problem in commits a71cfc56b and 3fb93103a (in v13 and v12), so back-patch into those branches, along with a test case to try to ensure we don't break it again. It is not known that any of the former crash cases are reachable in HEAD, but this seems like a good safety improvement in any case. Discussion: https://postgr.es/m/141051591267657@mail.yandex.ru
* Fix relcache inconsistency hazard in partition detachAlvaro Herrera2021-04-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During queries coming from ri_triggers.c, we need to omit partitions that are marked pending detach -- otherwise, the RI query is tricked into allowing a row into the referencing table whose corresponding row is in the detached partition. Which is bogus: once the detach operation completes, the row becomes an orphan. However, the code was not doing that in repeatable-read transactions, because relcache kept a copy of the partition descriptor that included the partition, and used it in the RI query. This commit changes the partdesc cache code to only keep descriptors that aren't dependent on a snapshot (namely: those where no detached partition exist, and those where detached partitions are included). When a partdesc-without- detached-partitions is requested, we create one afresh each time; also, those partdescs are stored in PortalContext instead of CacheMemoryContext. find_inheritance_children gets a new output *detached_exist boolean, which indicates whether any partition marked pending-detach is found. Its "include_detached" input flag is changed to "omit_detached", because that name captures desired the semantics more naturally. CreatePartitionDirectory() and RelationGetPartitionDesc() arguments are identically renamed. This was noticed because a buildfarm member that runs with relcache clobbering, which would not keep the improperly cached partdesc, broke one test, which led us to realize that the expected output of that test was bogus. This commit also corrects that expected output. Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3269784.1617215412@sss.pgh.pa.us
* doc: Improve hyphenation consistencyPeter Eisentraut2021-04-21
|
* adjust query id feature to use pg_stat_activity.query_idBruce Momjian2021-04-20
| | | | | | | | | | | Previously, it was pg_stat_activity.queryid to match the pg_stat_statements queryid column. This is an adjustment to patch 4f0b0966c8. This also adjusts some of the internal function calls to match. Catversion bumped. Reported-by: Álvaro Herrera, Julien Rouhaud Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql
* Fix typos and grammar in comments and docsMichael Paquier2021-04-19
| | | | | Author: Justin Pryzby Discussion: https://postgr.es/m/20210416070310.GG3315@telsasoft.com
* Revert "Cope with NULL query string in ExecInitParallelPlan()."Tom Lane2021-04-15
| | | | | | | | This reverts commit b3ee4c503872f3d0a5d6a7cbde48815f555af15b. We don't need it in the wake of the preceding commit, which added an upstream check that the querystring isn't null. Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us
* Undo decision to allow pg_proc.prosrc to be NULL.Tom Lane2021-04-15
| | | | | | | | | | | | | | | | | | | | | | | | Commit e717a9a18 changed the longstanding rule that prosrc is NOT NULL because when a SQL-language function is written in SQL-standard style, we don't currently have anything useful to put there. This seems a poor decision though, as it could easily have negative impacts on external PLs (opening them to crashes they didn't use to have, for instance). SQL-function-related code can just as easily test "is prosqlbody not null" as "is prosrc null", so there's no real gain there either. Hence, revert the NOT NULL marking removal and adjust related logic. For now, we just put an empty string into prosrc for SQL-standard functions. Maybe we'll have a better idea later, although the history of things like pg_attrdef.adsrc suggests that it's not easy to maintain a string equivalent of a node tree. This also adds an assertion that queryDesc->sourceText != NULL to standard_ExecutorStart. We'd been silently relying on that for awhile, so let's make it less silent. Also fix some overlooked documentation and test cases. Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us
* Redesign the caching done by get_cached_rowtype().Tom Lane2021-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, get_cached_rowtype() cached a pointer to a reference-counted tuple descriptor from the typcache, relying on the ExprContextCallback mechanism to release the tupdesc refcount when the expression tree using the tupdesc was destroyed. This worked fine when it was designed, but the introduction of within-DO-block COMMITs broke it. The refcount is logged in a transaction-lifespan resource owner, but plpgsql won't destroy simple expressions made within the DO block (before its first commit) until the DO block is exited. That results in a warning about a leaked tupdesc refcount when the COMMIT destroys the original resource owner, and then an error about the active resource owner not holding a matching refcount when the expression is destroyed. To fix, get rid of the need to have a shutdown callback at all, by instead caching a pointer to the relevant typcache entry. Those survive for the life of the backend, so we needn't worry about the pointer becoming stale. (For registered RECORD types, we can still cache a pointer to the tupdesc, knowing that it won't change for the life of the backend.) This mechanism has been in use in plpgsql and expandedrecord.c since commit 4b93f5799, and seems to work well. This change requires modifying the ExprEvalStep structs used by the relevant expression step types, which is slightly worrisome for back-patching. However, there seems no good reason for extensions to be familiar with the details of these particular sub-structs. Per report from Rohit Bhogate. Back-patch to v11 where within-DO-block COMMITs became a thing. Discussion: https://postgr.es/m/CAAV6ZkQRCVBh8qAY+SZiHnz+U+FqAGBBDaDTjF2yiKa2nJSLKg@mail.gmail.com
* Fixes for query_id featureBruce Momjian2021-04-08
| | | | | | | | | | | | | | | | | | Ignore parallel workers in pg_stat_statements Oversight in 4f0b0966c8 which exposed queryid in parallel workers. Counters are aggregated by the main backend process so parallel workers would report duplicated activity, and could also report activity for the wrong entry as they are only aware of the top level queryid. Fix thinko in pg_stat_get_activity when retrieving the queryid. Remove unnecessary call to pgstat_report_queryid(). Reported-by: Amit Kapila, Andres Freund, Thomas Munro Discussion: https://postgr.es/m/20210408051735.lfbdzun5zdlax5gd@alap3.anarazel.de p634GTSOqnDW86Owrn6qDAVosC5dJjXjp7BMfc5Gz1Q@mail.gmail.com Author: Julien Rouhaud
* Speedup ScalarArrayOpExpr evaluationDavid Rowley2021-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ScalarArrayOpExprs with "useOr=true" and a set of Consts on the righthand side have traditionally been evaluated by using a linear search over the array. When these arrays contain large numbers of elements then this linear search could become a significant part of execution time. Here we add a new method of evaluating ScalarArrayOpExpr expressions to allow them to be evaluated by first building a hash table containing each element, then on subsequent evaluations, we just probe that hash table to determine if there is a match. The planner is in charge of determining when this optimization is possible and it enables it by setting hashfuncid in the ScalarArrayOpExpr. The executor will only perform the hash table evaluation when the hashfuncid is set. This means that not all cases are optimized. For example CHECK constraints containing an IN clause won't go through the planner, so won't get the hashfuncid set. We could maybe do something about that at some later date. The reason we're not doing it now is from fear that we may slow down cases where the expression is evaluated only once. Those cases can be common, for example, a single row INSERT to a table with a CHECK constraint containing an IN clause. In the planner, we enable this when there are suitable hash functions for the ScalarArrayOpExpr's operator and only when there is at least MIN_ARRAY_SIZE_FOR_HASHED_SAOP elements in the array. The threshold is currently set to 9. Author: James Coleman, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/CAAaqYe8x62+=wn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA@mail.gmail.com
* Cope with NULL query string in ExecInitParallelPlan().Andres Freund2021-04-07
| | | | | | | | | It's far from clear that this is the right approach - but a good portion of the buildfarm has been red for a few hours, on the last day of the CF. And this fixes at least the obvious crash. So let's go with that for now. Discussion: https://postgr.es/m/20210407225806.majgznh4lk34hjvu%40alap3.anarazel.de
* SQL-standard function bodyPeter Eisentraut2021-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
* Make use of in-core query id added by commit 5fd9dfa5f5Bruce Momjian2021-04-07
| | | | | | | | | | | | | | | | | | | | | | | Use the in-core query id computation for pg_stat_activity, log_line_prefix, and EXPLAIN VERBOSE. Similar to other fields in pg_stat_activity, only the queryid from the top level statements are exposed, and if the backends status isn't active then the queryid from the last executed statements is displayed. Add a %Q placeholder to include the queryid in log_line_prefix, which will also only expose top level statements. For EXPLAIN VERBOSE, if a query identifier has been computed, either by enabling compute_query_id or using a third-party module, display it. Bump catalog version. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu