aboutsummaryrefslogtreecommitdiff
path: root/src/include/executor
Commit message (Collapse)AuthorAge
...
* Add local_blk_{read|write}_time I/O timing statistics for local blocksMichael Paquier2023-10-19
| | | | | | | | | | | | | There was no I/O timing statistics for counting read and write timings on local blocks, contrary to the counterparts for temp and shared blocks. This information is available when track_io_timing is enabled. The output of EXPLAIN is updated to show this information. An update of pg_stat_statements is planned next. Author: Nazir Bilal Yavuz Reviewed-by: Robert Haas, Melanie Plageman Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
* Rename I/O timing statistics columns to shared_blk_{read|write}_timeMichael Paquier2023-10-19
| | | | | | | | | | | | | | | | | | | | These two counters, defined in BufferUsage to track respectively the time spent while reading and writing blocks have historically only tracked data related to shared buffers, when track_io_timing is enabled. An upcoming patch to add specific counters for local buffers will take advantage of this rename as it has come up that no data is currently tracked for local buffers, and tracking local and shared buffers using the same fields would be inconsistent with the treatment done for temp buffers. Renaming the existing fields clarifies what the block type of each stats field is. pg_stat_statement is updated to reflect the rename. No extension version bump is required as 5a3423ad8ee17 has done one, affecting v17~. Author: Nazir Bilal Yavuz Reviewed-by: Robert Haas, Melanie Plageman Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
* Add const to values and nulls argumentsPeter Eisentraut2023-10-10
| | | | | | | This excludes any changes that would change the external AM APIs. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/14c31f4a-0347-0805-dce8-93a9072c05a5%40eisentraut.org
* Revert "Add soft error handling to some expression nodes"Amit Langote2023-10-02
| | | | | | This reverts commit 7fbc75b26ed8ec70c729c5e7f8233896c54c900f. Looks like the LLVM additions may not be totally correct.
* Add soft error handling to some expression nodesAmit Langote2023-10-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | This adjusts the expression evaluation code for CoerceViaIO and CoerceToDomain to handle errors softly if needed. For CoerceViaIo, this means using InputFunctionCallSafe(), which provides the option to handle errors softly, instead of calling the type input function directly. For CoerceToDomain, this simply entails replacing the ereport() in ExecEvalConstraintCheck() by errsave(). In both cases, the ErrorSaveContext to be used when evaluating the expression is stored by ExecInitExprRec() in the expression's struct in the expression's ExprEvalStep. The ErrorSaveContext is passed by setting ExprState.escontext to point to it when calling ExecInitExprRec() on the expression whose errors are to be handled softly. Note that no call site of ExecInitExprRec() has been changed in this commit, so there's no functional change. This is intended for implementing new SQL/JSON expression nodes in future commits that will use to it suppress errors that may occur during type coercions. Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* Remove obsolete executor cleanup codeAmit Langote2023-09-28
| | | | | | | | | | | | | | | | | | | This commit removes unnecessary ExecExprFreeContext() calls in ExecEnd* routines because the actual cleanup is managed by FreeExecutorState(). With no callers remaining for ExecExprFreeContext(), this commit also removes the function. This commit also drops redundant ExecClearTuple() calls, because ExecResetTupleTable() in ExecEndPlan() already takes care of resetting and dropping all TupleTableSlots initialized with ExecInitScanTupleSlot() and ExecInitExtraTupleSlot(). After these modifications, the ExecEnd*() routines for ValuesScan, NamedTuplestoreScan, and WorkTableScan became redundant. So, this commit removes them. Reviewed-by: Robert Haas Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
* Allow the use of a hash index on the subscriber during replication.Amit Kapila2023-07-14
| | | | | | | | | | | | | | | Commit 89e46da5e5 allowed using BTREE indexes that are neither PRIMARY KEY nor REPLICA IDENTITY on the subscriber during apply of update/delete. This patch extends that functionality to also allow HASH indexes. We explored supporting other index access methods as well but they don't have a fixed strategy for equality operation which is required by the current infrastructure in logical replication to scan the indexes. Author: Kuroda Hayato Reviewed-by: Peter Smith, Onder Kalaci, Amit Kapila Discussion: https://postgr.es/m/TYAPR01MB58669D7414E59664E17A5827F522A@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Remove obsolete commentPeter Eisentraut2023-06-05
| | | | OIDs are no longer system columns, since 578b229718.
* Pre-beta mechanical code beautification.Tom Lane2023-05-19
| | | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
* Fix misbehavior of EvalPlanQual checks with multiple result relations.Tom Lane2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.) This logic was not updated when commit 86dc90056 made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode. Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of 3f7836ff6 and 4b3e37993, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise. This is only a live bug back to v14 where 86dc90056 came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though because none of the test applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live. Per report from Ante Krešić (via Aleksander Alekseev) Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com
* Allocate hash join files in a separate memory contextTomas Vondra2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Should a hash join exceed memory limit, the hashtable is split up into multiple batches. The number of batches is doubled each time a given batch is determined not to fit in memory. Each batch file is allocated with a block-sized buffer for buffering tuples and parallel hash join has additional sharedtuplestore accessor buffers. In some pathological cases requiring a lot of batches, often with skewed data, bad stats, or very large datasets, users can run out-of-memory solely from the memory overhead of all the batch files' buffers. Batch files were allocated in the ExecutorState memory context, making it very hard to identify when this batch explosion was the source of an OOM. This commit allocates the batch files in a dedicated memory context, making it easier to identify the cause of an OOM and work to avoid it. Based on initial draft by Tomas Vondra, with significant reworks and improvements by Jehan-Guillaume de Rorthais. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Author: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20190421114618.z3mpgmimc3rmubi4@development Discussion: https://postgr.es/m/20230504193006.1b5b9622%40karst#273020ff4061fc7a2fbb1ba96b281f17
* Add back SQLValueFunction for SQL keywordsMichael Paquier2023-05-17
| | | | | | | | | | | | | | | | | | | | | | | | This is equivalent to a revert of f193883 and fb32748, with the addition that the declaration of the SQLValueFunction node needs to gain a couple of node_attr for query jumbling. The performance impact of removing the function call inlining is proving to be too huge for some workloads where these are used. A worst-case test case of involving only simple SELECT queries with a SQL keyword is proving to lead to a reduction of 10% in TPS via pgbench and prepared queries on a high-end machine. None of the tests I ran back for this set of changes saw such a huge gap, but Alexander Lakhin and Andres Freund have found that this can be noticeable. Keeping the older performance would mean to do more inlining in the executor when using COERCE_SQL_SYNTAX for a function expression, similarly to what SQLValueFunction does. This requires more redesign work and there is little time until 16beta1 is released, so for now reverting the change is the best way forward, bringing back the previous performance. Bump catalog version. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com
* Revert "Move PartitionPruneInfo out of plan nodes into PlannedStmt"Alvaro Herrera2023-05-04
| | | | | | | | | | | | | This reverts commit ec386948948c and its fixup 589bb816499e. This change was intended to support query planning avoiding acquisition of locks on partitions that were going to be pruned; however, the overall project took a different direction at [1] and this bit is no longer needed. Put things back the way they were as agreed in [2], to avoid unnecessary complexity. Discussion: [1] https://postgr.es/m/4191508.1674157166@sss.pgh.pa.us Discussion: [2] https://postgr.es/m/20230502175409.kcoirxczpdha26wt@alvherre.pgsql
* Rename ExecAggTransReparent, and improve its documentation.Tom Lane2023-04-24
| | | | | | | | | | | | | The name of this function suggests that it ought to reparent R/W expanded objects to be children of the persistent aggcontext, instead of copying them. In fact it does no such thing, and if you try to make it do so you will see multiple regression failures. Rename it to the less-misleading ExecAggCopyTransValue, and add commentary about why that attractive-sounding optimization won't work. Also adjust comments at call sites, some of which were describing logic that has since been moved into ExecAggCopyTransValue. Discussion: https://postgr.es/m/3004282.1681930251@sss.pgh.pa.us
* Revert 11470f544eAlexander Korotkov2023-04-03
| | | | Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
* SQL/JSON: support the IS JSON predicateAlvaro Herrera2023-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the SQL standard IS JSON predicate. It operates on text and bytea values representing JSON, as well as on the json and jsonb types. Each test has IS and IS NOT variants and supports a WITH UNIQUE KEYS flag. The tests are: IS JSON [VALUE] IS JSON ARRAY IS JSON OBJECT IS JSON SCALAR These should be self-explanatory. The WITH UNIQUE KEYS flag makes these return false when duplicate keys exist in any object within the value, not necessarily directly contained in the outermost object. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
* Move ExecEvalJsonConstructor new function to a more natural placeAlvaro Herrera2023-03-31
| | | | | Commit 7081ac46ace8 put it at the end of the file, but that doesn't look very nice.
* Parallel Hash Full Join.Thomas Munro2023-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | Full and right outer joins were not supported in the initial implementation of Parallel Hash Join because of deadlock hazards (see discussion). Therefore FULL JOIN inhibited parallelism, as the other join strategies can't do that in parallel either. Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on the inner side of one batch's hash table. For now, sidestep the deadlock problem by terminating parallelism there. The last process to arrive at that phase emits the unmatched tuples, while others detach and are free to go and work on other batches, if there are any, but otherwise they finish the join early. That unfairness is considered acceptable for now, because it's better than no parallelism at all. The build and probe phases are run in parallel, and the new scan-for-unmatched phase, while serial, is usually applied to the smaller of the two relations and is either limited by some multiple of work_mem, or it's too big and is partitioned into batches and then the situation is improved by batch-level parallelism. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
* Fix outdated comments regarding TupleTableSlotsDavid Rowley2023-03-30
| | | | | | | | | | | | The tts_flag is named TTS_FLAG_SHOULDFREE, so use that instead of TTS_SHOULDFREE, which is the name of the macro that checks for that flag. Additionally, 4da597edf got rid of the TupleTableSlot.tts_tuple field but forgot to update a comment which referenced that field. Fix that. Reported-by: Zhen Mingyang <zhenmingyang@yeah.net> Reported-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1a96696c.9d3.187193989c3.Coremail.zhenmingyang@yeah.net
* SQL/JSON: add standard JSON constructor functionsAlvaro Herrera2023-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces the SQL/JSON standard-conforming constructors for JSON types: JSON_ARRAY() JSON_ARRAYAGG() JSON_OBJECT() JSON_OBJECTAGG() Most of the functionality was already present in PostgreSQL-specific functions, but these include some new functionality such as the ability to skip or include NULL values, and to allow duplicate keys or throw error when they are found, as well as the standard specified syntax to specify output type and format. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
* Invent GENERIC_PLAN option for EXPLAIN.Tom Lane2023-03-24
| | | | | | | | | | | | | | | | | This provides a very simple way to see the generic plan for a parameterized query. Without this, it's necessary to define a prepared statement and temporarily change plan_cache_mode, which is a bit tedious. One thing that's a bit of a hack perhaps is that we disable execution-time partition pruning when the GENERIC_PLAN option is given. That's because the pruning code may attempt to fetch the value of one of the parameters, which would fail. Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg, Michel Pelletier, Jim Jones, and myself Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
* Improve the naming of Parallel Hash Join phases.Thomas Munro2023-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | * Commit 3048898e dropped -ING from PHJ wait event names. Update the corresponding barrier phases names to match. * Rename the "DONE" phases to "FREE". That's symmetrical with "ALLOCATE", and names the activity that actually happens in that phase (as we do for the other phases) rather than a state. The bug fixed by commit 8d578b9b might have been more obvious with this name. * Rename the batch/bucket growth barriers' "ALLOCATE" phases to "REALLOCATE", a better description of what they do. * Update the high level comments about phases to highlight phases are executed by a single process with an asterisk (mostly memory management phases). No behavior change, as this is just improving internal identifiers. The only user-visible sign of this is that a couple of wait events' display names change from "...Allocate" to "...Reallocate" in pg_stat_activity, to stay in sync with the internal names. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
* Allow locking updated tuples in tuple_update() and tuple_delete()Alexander Korotkov2023-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, in read committed transaction isolation mode (default), we have the following sequence of actions when tuple_update()/tuple_delete() finds the tuple updated by concurrent transaction. 1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which returns TM_Updated. 2. Lock tuple with tuple_lock(). 3. Re-evaluate plan qual (recheck if we still need to update/delete and calculate the new tuple for update). 4. Second attempt to update/delete tuple with tuple_update()/tuple_delete(). This attempt should be successful, since the tuple was previously locked. This patch eliminates step 2 by taking the lock during first tuple_update()/tuple_delete() call. Heap table access method saves some efforts by checking the updated tuple once instead of twice. Future undo-based table access methods, which will start from the latest row version, can immediately place a lock there. The code in nodeModifyTable.c is simplified by removing the nested switch/case. Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp Reviewed-by: Andres Freund, Chris Travers
* Fix race in parallel hash join batch cleanup, take II.Thomas Munro2023-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE). An earlier attempt to fix this (commit 3b8981b6, later reverted) missed one special case. When the inner side is empty (the "empty inner optimization), the build barrier would only make it to PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up. Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault. Back-patch to all supported releases. Author: Thomas Munro <thomas.munro@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: David Geier <geidav.pg@gmail.com> Tested-by: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
* Ignore BRIN indexes when checking for HOT updatesTomas Vondra2023-03-20
| | | | | | | | | | | | | | | | | | | | | | | | When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed by block summarizing indexes without references to individual tuples that need to be cleaned up. A new type TU_UpdateIndexes provides a signal to the executor to determine which indexes to update - no indexes, all indexes, or only the summarizing indexes. This also removes rd_indexattr list, and replaces it with rd_attrsvalid flag. The list was not used anywhere, and a simple flag is sufficient. This was originally committed as 5753d4ee32, but then got reverted by e3fcca0d0d because of correctness issues. Original patch by Josef Simanek, various fixes and improvements by Tomas Vondra and me. Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra Reviewed-by: Tomas Vondra, Alvaro Herrera Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
* Fill EState.es_rteperminfos more systematically.Tom Lane2023-03-06
| | | | | | | | | | | | | | | | | | While testing a fix for bug #17823, I discovered that EvalPlanQualStart failed to copy es_rteperminfos from the parent EState, resulting in failure if anything in EPQ execution wanted to consult that information. This led me to conclude that commit a61b1f748 had been too haphazard about where to fill es_rteperminfos, and that we need to be sure that that happens exactly where es_range_table gets filled. So I changed the signature of ExecInitRangeTable to help ensure that this new requirement doesn't get missed. (Indeed, pgoutput.c was also failing to fill it. Maybe we don't ever need it there, but I wouldn't bet on that.) No test case yet; one will arrive with the fix for #17823. But that needs to be back-patched, while this fix is HEAD-only. Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org
* Add static assertion ensuring sizeof(ExprEvalStep) <= 64 bytesAndres Freund2023-02-22
| | | | | | | | | | | | This was previously only documented in a comment. Given the size of the struct, it's not hard to miss that comment. As evidenced by the commits leading up to fe3caa14393, 67b26703b41. It's possible, but not likely, that we might have to weaken these assertions on a less commonly used architecture. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/295606.1677101684@sss.pgh.pa.us
* Make new GENERATED-expressions code more bulletproof.Tom Lane2023-01-15
| | | | | | | | | | | | | | | | | In commit 8bf6ec3ba I assumed that no code path could reach ExecGetExtraUpdatedCols without having gone through ExecInitStoredGenerated. That turns out not to be the case in logical replication: if there's an ON UPDATE trigger on the target table, trigger.c will call this code before anybody has set up its generated columns. Having seen that, I don't have a lot of faith in there not being other such paths. ExecGetExtraUpdatedCols can call ExecInitStoredGenerated for itself, as long as we are willing to assume that it is only called in CMD_UPDATE operations, which on the whole seems like a safer leap of faith. Per report from Vitaly Davydov. Discussion: https://postgr.es/m/d259d69652b8c2ff50e14cda3c236c7f@postgrespro.ru
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Rework query relation permission checkingAlvaro Herrera2022-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
* Generalize ri_RootToPartitionMap to use for non-partition childrenAlvaro Herrera2022-12-02
| | | | | | | | | | | | | | | | | | | | | | | ri_RootToPartitionMap is currently only initialized for tuple routing target partitions, though a future commit will need the ability to use it even for the non-partition child tables, so make adjustments to the decouple it from the partitioning code. Also, make it lazily initialized via ExecGetRootToChildMap(), making that function its preferred access path. Existing third-party code accessing it directly should no longer do so; consequently, it's been renamed to ri_RootToChildMap, which also makes it consistent with ri_ChildToRootMap. ExecGetRootToChildMap() houses the logic of setting the map appropriately depending on whether a given child relation is partition or not. To support this, also add a separate entry point for TupleConversionMap creation that receives an AttrMap. No new code here, just split an existing function in two. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqEYUhDXSK5BTvG_xk=eaAEJCD4GS3C6uH7ybBvv+Z_Tmg@mail.gmail.com
* Move PartitioPruneInfo out of plan nodes into PlannedStmtAlvaro Herrera2022-12-01
| | | | | | | | | | | | | | | | | | | | The planner will now add a given PartitioPruneInfo to PlannedStmt.partPruneInfos instead of directly to the Append/MergeAppend plan node. What gets set instead in the latter is an index field which points to the list element of PlannedStmt.partPruneInfos containing the PartitioPruneInfo belonging to the plan node. A later commit will make AcquireExecutorLocks() do the initial partition pruning to determine a minimal set of partitions to be locked when validating a plan tree and it will need to consult the PartitioPruneInfos referenced therein to do so. It would be better for the PartitioPruneInfos to be accessible directly than requiring a walk of the plan tree to find them, which is easier when it can be done by simply iterating over PlannedStmt.partPruneInfos. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
* Replace SQLValueFunction by COERCE_SQL_SYNTAXMichael Paquier2022-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This switch impacts 9 patterns related to a SQL-mandated special syntax for function calls: - LOCALTIME [ ( typmod ) ] - LOCALTIMESTAMP [ ( typmod ) ] - CURRENT_TIME [ ( typmod ) ] - CURRENT_TIMESTAMP [ ( typmod ) ] - CURRENT_DATE Five new entries are added to pg_proc to compensate the removal of SQLValueFunction to provide backward-compatibility and making this change transparent for the end-user (for example for the attribute generated when a keyword is specified in a SELECT or in a FROM clause without an alias, or when specifying something else than an Iconst to the parser). The parser included a set of checks coming from the files in charge of holding the C functions used for the SQLValueFunction calls (as of transformSQLValueFunction()), which are now moved within each function's execution path, so this reduces the dependencies between the execution and the parsing steps. As of this change, all the SQL keywords use the same paths for their work, relying only on COERCE_SQL_SYNTAX. Like fb32748, no performance difference has been noticed, while the perf profiles get reduced with ExecEvalSQLValueFunction() gone. Bump catalog version. Reviewed-by: Corey Huinker, Ted Yu Discussion: https://postgr.es/m/YzaG3MoryCguUOym@paquier.xyz
* Remove AssertArg and AssertStatePeter Eisentraut2022-10-28
| | | | | | | | | These don't offer anything over plain Assert, and their usage had already been declared obsolescent. Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20221009210148.GA900071@nathanxps13
* Harmonize parameter names in storage and AM code.Peter Geoghegan2022-09-19
| | | | | | | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
* Future-proof the recursion inside ExecShutdownNode().Tom Lane2022-09-19
| | | | | | | | | | | | | | | | | | | | | | | The API contract for planstate_tree_walker() callbacks is that they take a PlanState pointer and a context pointer. Somebody figured they could save a couple lines of code by ignoring that, and passing ExecShutdownNode itself as the walker even though it has but one argument. Somewhat remarkably, we've gotten away with that so far. However, it seems clear that the upcoming C2x standard means to forbid such cases, and compilers that actively break such code likely won't be far behind. So spend the extra few lines of code to do it honestly with a separate walker function. In HEAD, we might as well go further and remove ExecShutdownNode's useless return value. I left that as-is in back branches though, to forestall complaints about ABI breakage. Back-patch, with the thought that this might become of practical importance before our stable branches are all out of service. It doesn't seem to be fixing any live bug on any currently known platform, however. Discussion: https://postgr.es/m/208054.1663534665@sss.pgh.pa.us
* Revert SQL/JSON featuresAndrew Dunstan2022-09-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
* Remove unused fields from ExprEvalStepDavid Rowley2022-08-03
| | | | | | | These were added recently by 1349d2790. Reported-by: Zhihong Yu Discussion: https://postgr.es/m/CALNJ-vTi+YDuAWKp4Z_Dv=mrz=aq81qTg0D7wzc8y7rS_+i_cw@mail.gmail.com
* Improve performance of ORDER BY / DISTINCT aggregatesDavid Rowley2022-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ORDER BY / DISTINCT aggreagtes have, since implemented in Postgres, been executed by always performing a sort in nodeAgg.c to sort the tuples in the current group into the correct order before calling the transition function on the sorted tuples. This was not great as often there might be an index that could have provided pre-sorted input and allowed the transition functions to be called as the rows come in, rather than having to store them in a tuplestore in order to sort them once all the tuples for the group have arrived. Here we change the planner so it requests a path with a sort order which supports the most amount of ORDER BY / DISTINCT aggregate functions and add new code to the executor to allow it to support the processing of ORDER BY / DISTINCT aggregates where the tuples are already sorted in the correct order. Since there can be many ORDER BY / DISTINCT aggregates in any given query level, it's very possible that we can't find an order that suits all of these aggregates. The sort order that the planner chooses is simply the one that suits the most aggregate functions. We take the most strictly sorted variation of each order and see how many aggregate functions can use that, then we try again with the order of the remaining aggregates to see if another order would suit more aggregate functions. For example: SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ... would request the sort order to be {a, b} because {a} is a subset of the sort order of {a,b}, but; SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ... would just pick a plan ordered by {a} (we give precedence to aggregates which are earlier in the targetlist). SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ... would choose to order by {b} since two aggregates suit that vs just one that requires input ordered by {a}. Author: David Rowley Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com
* Automatically generate node support functionsPeter Eisentraut2022-07-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions. For each of the four node support files, it creates two include files, e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main file. All the scaffolding of the main file stays in place. I have tried to mostly make the coverage of the output match what is currently there. For example, one could now do out/read coverage of utility statement nodes, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review. Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions. (In this patch, I have only ifdef'ed out the code to could be removed, mainly so that it won't constantly have merge conflicts. It will be deleted in a separate patch. All the code comments that are worth keeping from those sections have already been moved to the header files where the structs are defined.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
* Remove size increase in ExprEvalStep caused by hashed saopsDavid Rowley2022-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 50e17ad28 increased the size of ExprEvalStep from 64 bytes up to 88 bytes. Lots of effort was spent during the development of the current expression evaluation code to make an instance of this struct as small as possible. Making this struct larger than needed reduces CPU cache efficiency during expression evaluation which causes noticeable slowdowns during query execution. In order to reduce the size of the struct, here we remove the fn_addr field. The values from this field can be obtained via fcinfo, just with some extra pointer dereferencing. The extra indirection does not seem to cause any noticeable slowdowns. Various other fields have been moved into the ScalarArrayOpExprHashTable struct. These fields are only used when the ScalarArrayOpExprHashTable pointer has already been dereferenced, so no additional pointer dereferences occur for these. Here we also make hash_fcinfo_data the last field in ScalarArrayOpExprHashTable so that we can avoid a further pointer dereference to get the FunctionCallInfoBaseData. This also saves a call to palloc(). 50e17ad28 was added in 14, but it's too late to adjust the size of the ExprEvalStep in that version, so here we just backpatch to 15, which is currently in beta. Author: Andres Freund, David Rowley Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Backpatch-through: 15
* expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size.Andres Freund2022-07-05
| | | | | | | | | | | | | | The new expression step types increased the size of ExprEvalStep by ~4 for all types of expression steps, slowing down expression evaluation noticeably. Move them out of line. There's other issues with these expression steps, but addressing them is largely independent of this aspect. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Backpatch: 15-
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)Michael Paquier2022-04-08
| | | | | | | | | | | | | | | | | | | | Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O timing spent reading and writing shared and local buffers. This commit adds on top of that the I/O timing for temporary buffers in the output of EXPLAIN (for spilled external sorts, hashes, materialization. etc). This can be helpful for users in cases where the I/O related to temporary buffers is the bottleneck. Like its cousin, this information is available only when track_io_timing is enabled. Playing the patch, this is showing an extra overhead of up to 1% even when using gettimeofday() as implementation for interval timings, which is slightly within the usual range noise still that's measurable. Author: Masahiko Sawada Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud, Ranier Vilela Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
* Revert "Rewrite some RI code to avoid using SPI"Alvaro Herrera2022-04-07
| | | | | | | This reverts commit 99392cdd78b788295e52b9f4942fa11992fd5ba9. We'd rather rewrite ri_triggers.c as a whole rather than piecemeal. Discussion: https://postgr.es/m/E1ncXX2-000mFt-Pe@gemulon.postgresql.org
* Rewrite some RI code to avoid using SPIAlvaro Herrera2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify the subroutines called by RI trigger functions that want to check if a given referenced value exists in the referenced relation to simply scan the foreign key constraint's unique index, instead of using SPI to execute SELECT 1 FROM referenced_relation WHERE ref_key = $1 This saves a lot of work, especially when inserting into or updating a referencing relation. This rewrite allows to fix a PK row visibility bug caused by a partition descriptor hack which requires ActiveSnapshot to be set to come up with the correct set of partitions for the RI query running under REPEATABLE READ isolation. We now set that snapshot indepedently of the snapshot to be used by the PK index scan, so the two no longer interfere. The buggy output in src/test/isolation/expected/fk-snapshot.out of the relevant test case added by commit 00cb86e75d6d has been corrected. (The bug still exists in branch 14, however, but this fix is too invasive to backpatch.) Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Li Japin <japinli@hotmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
* Refactor and cleanup runtime partition prune code a littleAlvaro Herrera2022-04-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Move the execution pruning initialization steps that are common between both ExecInitAppend() and ExecInitMergeAppend() into a new function ExecInitPartitionPruning() defined in execPartition.c. Those steps include creation of a PartitionPruneState to be used for all instances of pruning and determining the minimal set of child subplans that need to be initialized by performing initial pruning if needed, and finally adjusting the subplan_map arrays in the PartitionPruneState to reflect the new set of subplans remaining after initial pruning if it was indeed performed. ExecCreatePartitionPruneState() is no longer exported out of execPartition.c and has been renamed to CreatePartitionPruneState() as a local sub-routine of ExecInitPartitionPruning(). * Likewise, ExecFindInitialMatchingSubPlans() that was in charge of performing initial pruning no longer needs to be exported. In fact, since it would now have the same body as the more generally named ExecFindMatchingSubPlans(), except differing in the value of initial_prune passed to the common subroutine find_matching_subplans_recurse(), it seems better to remove it and add an initial_prune argument to ExecFindMatchingSubPlans(). * Add an ExprContext field to PartitionPruneContext to remove the implicit assumption in the runtime pruning code that the ExprContext to use to compute pruning expressions that need one can always rely on the PlanState providing it. A future patch will allow runtime pruning (at least the initial pruning steps) to be performed without the corresponding PlanState yet having been created, so this will help. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqEYCpEqh2LMDOp9mT+4-QoVe8HgFMKBjntEMCTZLpcCCA@mail.gmail.com
* JSON_TABLEAndrew Dunstan2022-04-04
| | | | | | | | | | | | | | | | This feature allows jsonb data to be treated as a table and thus used in a FROM clause like other tabular data. Data can be selected from the jsonb using jsonpath expressions, and hoisted out of nested structures in the jsonb to form multiple rows, more or less like an outer join. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu (whose name I previously misspelled), Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
* SQL JSON functionsAndrew Dunstan2022-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | This Patch introduces three SQL standard JSON functions: JSON() (incorrectly mentioned in my commit message for f4fb45d15c) JSON_SCALAR() JSON_SERIALIZE() JSON() produces json values from text, bytea, json or jsonb values, and has facilitites for handling duplicate keys. JSON_SCALAR() produces a json value from any scalar sql value, including json and jsonb. JSON_SERIALIZE() produces text or bytea from input which containis or represents json or jsonb; For the most part these functions don't add any significant new capabilities, but they will be of use to users wanting standard compliant JSON handling. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
* SQL/JSON query functionsAndrew Dunstan2022-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces the SQL/JSON functions for querying JSON data using jsonpath expressions. The functions are: JSON_EXISTS() JSON_QUERY() JSON_VALUE() All of these functions only operate on jsonb. The workaround for now is to cast the argument to jsonb. JSON_EXISTS() tests if the jsonpath expression applied to the jsonb value yields any values. JSON_VALUE() must return a single value, and an error occurs if it tries to return multiple values. JSON_QUERY() must return a json object or array, and there are various WRAPPER options for handling scalar or multi-value results. Both these functions have options for handling EMPTY and ERROR conditions. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru