aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor
Commit message (Collapse)AuthorAge
* Fix RLS policy usage in MERGE.Dean Rasheed2023-08-07
| | | | | | | | | | | | | | | | | | | | | | | | | If MERGE executes an UPDATE action on a table with row-level security, the code incorrectly applied the WITH CHECK clauses from the target table's INSERT policies to new rows, instead of the clauses from the table's UPDATE policies. In addition, it failed to check new rows against the target table's SELECT policies, if SELECT permissions were required (likely to always be the case). In addition, if MERGE executes a DO NOTHING action for matched rows, the code incorrectly applied the USING clauses from the target table's DELETE policies to existing target tuples. These policies were applied as checks that would throw an error, if they did not pass. Fix this, so that a MERGE UPDATE action applies the same RLS policies as a plain UPDATE query with a WHERE clause, and a DO NOTHING action does not apply any RLS checks (other than adding clauses from SELECT policies to the join). Back-patch to v15, where MERGE was introduced. Dean Rasheed, reviewed by Stephen Frost. Security: CVE-2023-39418
* Add more SQL/JSON constructor functionsAmit Langote2023-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This Patch introduces three SQL standard JSON functions: JSON() JSON_SCALAR() JSON_SERIALIZE() JSON() produces json values from text, bytea, json or jsonb values, and has facilitites for handling duplicate keys. JSON_SCALAR() produces a json value from any scalar sql value, including json and jsonb. JSON_SERIALIZE() produces text or bytea from input which containis or represents json or jsonb; For the most part these functions don't add any significant new capabilities, but they will be of use to users wanting standard compliant JSON handling. Catversion bumped as this changes ruleutils.c. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Peter Eisentraut Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* Remove unnecessary checks for indexes for REPLICA IDENTITY FULL tables.Masahiko Sawada2023-07-25
| | | | | | | | | | | | | | | | | | | | | | Previously, when selecting an usable index for update/delete for the REPLICA IDENTITY FULL table, in IsIndexOnlyExpression(), we used to check if all index fields are not expressions. However, it was not necessary, because it is enough to check if only the leftmost index field is not an expression (and references the remote table column) and this check has already been done by RemoteRelContainsLeftMostColumnOnIdx(). This commit removes IsIndexOnlyExpression() and RemoteRelContainsLeftMostColumnOnIdx() and all checks for usable indexes for REPLICA IDENTITY FULL tables are now performed by IsIndexUsableForReplicaIdentityFull(). Backpatch this to remain the code consistent. Reported-by: Peter Smith Reviewed-by: Amit Kapila, Önder Kalacı Discussion: https://postgr.es/m/CAHut%2BPsGRE5WSsY0jcLHJEoA17MrbP9yy8FxdjC_ZOAACxbt%2BQ%40mail.gmail.com Backpatch-through: 16
* Allow the use of a hash index on the subscriber during replication.Amit Kapila2023-07-14
| | | | | | | | | | | | | | | Commit 89e46da5e5 allowed using BTREE indexes that are neither PRIMARY KEY nor REPLICA IDENTITY on the subscriber during apply of update/delete. This patch extends that functionality to also allow HASH indexes. We explored supporting other index access methods as well but they don't have a fixed strategy for equality operation which is required by the current infrastructure in logical replication to scan the indexes. Author: Kuroda Hayato Reviewed-by: Peter Smith, Onder Kalaci, Amit Kapila Discussion: https://postgr.es/m/TYAPR01MB58669D7414E59664E17A5827F522A@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Doc: clarify the conditions of usable indexes for REPLICA IDENTITY FULL tables.Masahiko Sawada2023-07-13
| | | | | | | | | | | | | | Commit 89e46da5e allowed REPLICA IDENTITY FULL tables to use an index on the subscriber during apply of update/delete. This commit clarifies in the documentation that the leftmost field of candidate indexes must be a column (not an expression) that references the published relation column. The source code comments are also updated accordingly. Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com Backpatch-through: 16
* Don't include CaseTestExpr in JsonValueExpr.formatted_exprAmit Langote2023-07-13
| | | | | | | | | | | | | | | | | | | | | | | | A CaseTestExpr is currently being put into JsonValueExpr.formatted_expr as placeholder for the result of evaluating JsonValueExpr.raw_expr, which in turn is evaluated separately. Though, there's no need for this indirection if raw_expr itself can be embedded into formatted_expr and evaluated as part of evaluating the latter, especially as there is no special reason to evaluate it separately. So this commit makes it so. As a result, JsonValueExpr.raw_expr no longer needs to be evaluated in ExecInterpExpr(), eval_const_exprs_mutator() etc. and is now only used for displaying the original "unformatted" expression in ruleutils.c. While at it, this also removes the function makeCaseTestExpr(), because the code in makeJsonConstructorExpr() looks more readable without it IMO and isn't used by anyone else either. Finally, a note is added in the comment above CaseTestExpr's definition that JsonConstructorExpr is also using it. Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* Fix memory leak in Incremental Sort rescansTomas Vondra2023-07-02
| | | | | | | | | | | | | | | | | | | | | | | | The Incremental Sort had a couple issues, resulting in leaking memory during rescans, possibly triggering OOM. The code had a couple of related flaws: 1. During rescans, the sort states were reset but then also set to NULL (despite the comment saying otherwise). ExecIncrementalSort then sees NULL and initializes a new sort state, leaking the memory used by the old one. 2. Initializing the sort state also automatically rebuilt the info about presorted keys, leaking the already initialized info. presorted_keys was also unnecessarily reset to NULL. Patch by James Coleman, based on patches by Laurenz Albe and Tom Lane. Backpatch to 13, where Incremental Sort was introduced. Author: James Coleman, Laurenz Albe, Tom Lane Reported-by: Laurenz Albe, Zu-Ming Jiang Backpatch-through: 13 Discussion: https://postgr.es/m/b2bd02dff61af15e3526293e2771f874cf2a3be7.camel%40cybertec.at Discussion: https://postgr.es/m/db03c582-086d-e7cd-d4a1-3bc722f81765%40inf.ethz.ch
* Fix order of operations in ExecEvalFieldStoreDeForm().Tom Lane2023-06-29
| | | | | | | | | | | | | | | | | | | | | | | | | If the given composite datum is toasted out-of-line, DatumGetHeapTupleHeader will perform database accesses to detoast it. That can invalidate the result of get_cached_rowtype, as documented (perhaps not plainly enough) in that function's API spec; which leads to strange errors or crashes when we try to use the TupleDesc to read the tuple. In short then, trying to update a field of a composite column could fail intermittently if the overall column value is wide enough to require toasting. We can fix the bug at no cost by just changing the order of operations, since we don't need the TupleDesc until after detoasting. (Other callers of get_cached_rowtype appear to get this right already, so there's only one bug.) Note that the added regression test case reveals this bug reliably only with debug_discard_caches/CLOBBER_CACHE_ALWAYS. Per bug #17994 from Alexander Lakhin. Sadly, this patch does not fix the missing-values issue revealed in the bug discussion; we'll need some more work to cover that. Discussion: https://postgr.es/m/17994-5c7100b51b4790e9@postgresql.org
* Pre-beta2 mechanical code beautification.Tom Lane2023-06-20
| | | | | | | | | Run pgindent and pgperltidy. It seems we're still some ways away from all committers doing this automatically. Now that we have a buildfarm animal that will whine about poorly-indented code, we'll try to keep the tree more tidy. Discussion: https://postgr.es/m/3156045.1687208823@sss.pgh.pa.us
* Retain relkind too in RTE_SUBQUERY entries for views.Amit Langote2023-06-14
| | | | | | | | | | | | | | | | | | 47bb9db75 modified the ApplyRetrieveRule()'s conversion of a view's original RTE_RELATION entry into an RTE_SUBQUERY one to retain relid, rellockmode, and perminfoindex so that the executor can lock the view and check its permissions. It seems better to also retain relkind for cross-checking that the exception of an RTE_SUBQUERY entry being allowed to carry relation details only applies to views, so do so. Bump catversion because this changes the output format of RTE_SUBQUERY RTEs. Suggested-by: David Steele <david@pgmasters.net> Reviewed-by: David Steele <david@pgmasters.net> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3953179e-9540-e5d1-a743-4bef368785b0%40pgmasters.net
* Use per-tuple context in ExecGetAllUpdatedColsTomas Vondra2023-06-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fc22b6623b (generated columns) replaced ExecGetUpdatedCols() with ExecGetAllUpdatedCols() in a couple places handling UPDATE (triggers and lock mode). However, ExecGetUpdatedCols() did exec_rt_fetch() while ExecGetAllUpdatedCols() also allocates memory through bms_union() without paying attention to the memory context and happened to use the long-lived ExecutorState, leaking the memory until the end of the query. The amount of leaked memory is proportional to the number of (updated) attributes, types of UPDATE triggers, and the number of processed rows (which for UPDATE ... FROM ... may be much higher than updated rows). Fixed by switching to the per-tuple context in GetAllUpdatedColumns(). This is fine for all in-core callers, but external callers may need to copy the result. But we're not aware of any such callers. Note the issue was introduced by fc22b6623b, but the macros were later renamed by f50e888990. Backpatch to 12, where the issue was introduced. Reported-by: Tomas Vondra Reviewed-by: Andres Freund, Tom Lane, Jakub Wartak Backpatch-through: 12 Discussion: https://postgr.es/m/222a3442-7f7d-246c-ed9b-a76209d19239@enterprisedb.com
* Pre-beta mechanical code beautification.Tom Lane2023-05-19
| | | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
* Fix misbehavior of EvalPlanQual checks with multiple result relations.Tom Lane2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.) This logic was not updated when commit 86dc90056 made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode. Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of 3f7836ff6 and 4b3e37993, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise. This is only a live bug back to v14 where 86dc90056 came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though because none of the test applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live. Per report from Ante Krešić (via Aleksander Alekseev) Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com
* Allocate hash join files in a separate memory contextTomas Vondra2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Should a hash join exceed memory limit, the hashtable is split up into multiple batches. The number of batches is doubled each time a given batch is determined not to fit in memory. Each batch file is allocated with a block-sized buffer for buffering tuples and parallel hash join has additional sharedtuplestore accessor buffers. In some pathological cases requiring a lot of batches, often with skewed data, bad stats, or very large datasets, users can run out-of-memory solely from the memory overhead of all the batch files' buffers. Batch files were allocated in the ExecutorState memory context, making it very hard to identify when this batch explosion was the source of an OOM. This commit allocates the batch files in a dedicated memory context, making it easier to identify the cause of an OOM and work to avoid it. Based on initial draft by Tomas Vondra, with significant reworks and improvements by Jehan-Guillaume de Rorthais. Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Author: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20190421114618.z3mpgmimc3rmubi4@development Discussion: https://postgr.es/m/20230504193006.1b5b9622%40karst#273020ff4061fc7a2fbb1ba96b281f17
* Describe hash join implementationTomas Vondra2023-05-19
| | | | | | | | | | Add a high level description of our implementation of the hybrid hash join algorithm to the block comment in nodeHashjoin.c. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Discussion: https://postgr.es/m/20230516160051.4267a800%40karst
* Remove stray mid-sentence tabs in commentsPeter Eisentraut2023-05-19
|
* Add back SQLValueFunction for SQL keywordsMichael Paquier2023-05-17
| | | | | | | | | | | | | | | | | | | | | | | | This is equivalent to a revert of f193883 and fb32748, with the addition that the declaration of the SQLValueFunction node needs to gain a couple of node_attr for query jumbling. The performance impact of removing the function call inlining is proving to be too huge for some workloads where these are used. A worst-case test case of involving only simple SELECT queries with a SQL keyword is proving to lead to a reduction of 10% in TPS via pgbench and prepared queries on a high-end machine. None of the tests I ran back for this set of changes saw such a huge gap, but Alexander Lakhin and Andres Freund have found that this can be noticeable. Keeping the older performance would mean to do more inlining in the executor when using COERCE_SQL_SYNTAX for a function expression, similarly to what SQLValueFunction does. This requires more redesign work and there is little time until 16beta1 is released, so for now reverting the change is the best way forward, bringing back the previous performance. Bump catalog version. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/b32bed1b-0746-9b20-1472-4bdc9ca66d52@gmail.com
* Mark internal messages as no longer translatableAlvaro Herrera2023-05-16
| | | | | | | | | | | The problem that these messages protect against can only occur because a corrupted hash spill file was written, i.e., a Postgres bug. There's no reason to have them as translatable. Backpatch to 15, where these messages were changed by commit c4649cce39a4. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20230510175407.dwa5v477pw62ikyx@alvherre.pgsql
* Fix ExecCheckPermissions call in RI_Initial_CheckAlvaro Herrera2023-05-04
| | | | | | | | | | | | | | | | | | RI_Initial_Check was setting up a list of RTEPermissionInfo for ExecCheckPermissions() wrong, and the problem is subtle enough that it doesn't have any immediate effect in core code. However, if an extension is using the ExecutorCheckPerms_hook, then it would get the wrong parameters and perhaps arrive at a wrong conclusion, or outright malfunction. Fix by constructing that list and the RTE list more honestly. We also add an assertion check to verify that these lists match. This new assertion would have caught this bug. Co-authored-by: Олег Целебровский (Oleg Tselebrovskii) <o.tselebrovskiy@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/3722b7a2cbe27a1796ee40824bd86dd1@postgrespro.ru
* Revert "Move PartitionPruneInfo out of plan nodes into PlannedStmt"Alvaro Herrera2023-05-04
| | | | | | | | | | | | | This reverts commit ec386948948c and its fixup 589bb816499e. This change was intended to support query planning avoiding acquisition of locks on partitions that were going to be pruned; however, the overall project took a different direction at [1] and this bit is no longer needed. Put things back the way they were as agreed in [2], to avoid unnecessary complexity. Discussion: [1] https://postgr.es/m/4191508.1674157166@sss.pgh.pa.us Discussion: [2] https://postgr.es/m/20230502175409.kcoirxczpdha26wt@alvherre.pgsql
* Fix typos in commentsMichael Paquier2023-05-02
| | | | | | | | | The changes done in this commit impact comments with no direct user-visible changes, with fixes for incorrect function, variable or structure names. Author: Alexander Lakhin Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
* Fix buffer refcount leak with FDW bulk insertsMichael Paquier2023-04-25
| | | | | | | | | | | | | | | | | | | | | The leak would show up when using batch inserts with foreign tables included in a partition tree, as the slots used in the batch were not reset once processed. In order to fix this problem, some ExecClearTuple() are added to clean up the slots used once a batch is filled and processed, mapping with the number of slots currently in use as tracked by the counter ri_NumSlots. This buffer refcount leak has been introduced in b676ac4 with the addition of the executor facility to improve bulk inserts for FDWs, so backpatch down to 14. Alexander has provided the patch (slightly modified by me). The test for postgres_fdw comes from me, based on the test case that the author has sent in the report. Author: Alexander Pyhalov Discussion: https://postgr.es/m/b035780a740efd38dc30790c76927255@postgrespro.ru Backpatch-through: 14
* Rename ExecAggTransReparent, and improve its documentation.Tom Lane2023-04-24
| | | | | | | | | | | | | The name of this function suggests that it ought to reparent R/W expanded objects to be children of the persistent aggcontext, instead of copying them. In fact it does no such thing, and if you try to make it do so you will see multiple regression failures. Rename it to the less-misleading ExecAggCopyTransValue, and add commentary about why that attractive-sounding optimization won't work. Also adjust comments at call sites, some of which were describing logic that has since been moved into ExecAggCopyTransValue. Discussion: https://postgr.es/m/3004282.1681930251@sss.pgh.pa.us
* Fix various typos and incorrect/outdated name referencesDavid Rowley2023-04-19
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
* Fix various typosDavid Rowley2023-04-18
| | | | | | | | | | | | This fixes many spelling mistakes in comments, but a few references to invalid parameter names, function names and option names too in comments and also some in string constants Also, fix an #undef that was undefining the incorrect definition Author: Alexander Lakhin Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
* Ensure result of an aggregate's finalfunc is made read-only.Tom Lane2023-04-16
| | | | | | | | | | | | | | | The finalfunc might return a read-write expanded object. If we de-duplicate multiple call sites for the aggregate, any function(s) receiving the aggregate result earlier could alter or destroy the value that reaches the ones called later. This is a brown-paper-bag bug in commit 42b746d4c, because we actually considered the need for read-only-ness but failed to realize that it applied to the case with a finalfunc as well as the case without. Per report from Justin Pryzby. New error in HEAD, no need for back-patch. Discussion: https://postgr.es/m/ZDm5TuKsh3tzoEjz@telsasoft.com
* Fix assignment to array of domain over composite, redux.Tom Lane2023-04-15
| | | | | | | | | | | | | | | | Commit 3e310d837 taught isAssignmentIndirectionExpr() to look through CoerceToDomain nodes. That's not sufficient, because since commit 04fe805a1 it's been possible for the planner to simplify CoerceToDomain to RelabelType when the domain has no constraints to enforce. So we need to look through RelabelType too. Per bug #17897 from Alexander Lakhin. Although 3e310d837 was back-patched to v11, it seems sufficient to apply this change to v12 and later, since 04fe805a1 came in in v12. Dmitry Dolgov Discussion: https://postgr.es/m/17897-4216c546c3874044@postgresql.org
* Fix PHJ match bit initialization.Thomas Munro2023-04-14
| | | | | | | | | | | | | | | | | Hash join tuples reuse the HOT status bit to indicate match status during hash join execution. Correct reuse requires clearing the bit in all tuples. Serial hash join and parallel multi-batch hash join do so upon inserting the tuple into the hashtable. Single batch parallel hash join and batch 0 of unexpected multi-batch hash joins forgot to do this. It hadn't come up before because hashtable tuple match bits are only used for right and full outer joins and parallel ROJ and FOJ were unsupported. 11c2d6fdf5 introduced support for parallel ROJ/FOJ but neglected to ensure the match bits were reset. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/flat/CAMbWs48Nde1Mv%3DBJv6_vXmRKHMuHZm2Q_g4F6Z3_pn%2B3EV6BGQ%40mail.gmail.com
* Remove overzealous assertion from PHJ.Thomas Munro2023-04-13
| | | | | | | | | | | | | | We can't assert that we're the only process attached to a barrier after BarrierArriveAndDetachExceptLast(). Although that'll be true almost always, a late-starting parallel worker can attach very briefly (that is, immediately detach after checking the phase) right at that moment. BarrierArriveAndDetachExceptLast() already contains an assertion like that, but it holds a spinlock preventing the race. This thinko caused a one-off failure on build farm animal chimaera. Diagnosed-by: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3590249.1680971629@sss.pgh.pa.us
* Fix row tracking in pg_stat_statements with extended query protocolMichael Paquier2023-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pg_stat_statements relies on EState->es_processed to count the number of rows processed by ExecutorRun(). This proves to be a problem under the extended query protocol when the result of a query is fetched through more than one call of ExecutorRun(), as es_processed is reset each time ExecutorRun() is called. This causes pg_stat_statements to report the number of rows calculated in the last execute fetch, rather than the global sum of all the rows processed. As pquery.c tells, this is a problem when a portal does not use holdStore. For example, DMLs with RETURNING would report a correct tuple count as these do one execution cycle when the query is first executed to fill in the portal's store with one ExecutorRun(), feeding on the portal's store for each follow-up execute fetch depending on the fetch size requested by the client. The fix proposed for this issue is simple with the addition of an extra counter in EState that's preserved across multiple ExecutorRun() calls, incremented with the value calculated in es_processed. This approach is not back-patchable, unfortunately. Note that libpq does not currently give any way to control the fetch size when using the extended v3 protocol, meaning that in-core testing is not possible yet. This issue can be easily verified with the JDBC driver, though, with *autocommit disabled*. Hence, having in-core tests requires more features, left for future discussion: - At least two new libpq routines splitting PQsendQueryGuts(), one for the bind/describe and a second for a series of execute fetches with a custom fetch size, likely in a fashion similar to what JDBC does. - A psql meta-command for the execute phase. This part is not strictly mandatory, still it could be handy. Reported-by: Andrew Dunstan (original discovery by Simon Siggs) Author: Sami Imseih Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/EBE6C507-9EB6-4142-9E4D-38B1673363A7@amazon.com Discussion: https://postgr.es/m/c90890e7-9c89-c34f-d3c5-d5c763a34bd8@dunslane.net
* Support "Right Anti Join" plan shapes.Tom Lane2023-04-05
| | | | | | | | | | | | | Merge and hash joins can support antijoin with the non-nullable input on the right, using very simple combinations of their existing logic for right join and anti join. This gives the planner more freedom about how to order the join. It's particularly useful for hash join, since we may now have the option to hash the smaller table instead of the larger. Richard Guo, reviewed by Ronan Dunklau and myself Discussion: https://postgr.es/m/CAMbWs48xh9hMzXzSy3VaPzGAz+fkxXXTUbCLohX1_L8THFRm2Q@mail.gmail.com
* Revert 764da7710bAlexander Korotkov2023-04-03
| | | | Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
* Revert 11470f544eAlexander Korotkov2023-04-03
| | | | Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
* SQL/JSON: support the IS JSON predicateAlvaro Herrera2023-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the SQL standard IS JSON predicate. It operates on text and bytea values representing JSON, as well as on the json and jsonb types. Each test has IS and IS NOT variants and supports a WITH UNIQUE KEYS flag. The tests are: IS JSON [VALUE] IS JSON ARRAY IS JSON OBJECT IS JSON SCALAR These should be self-explanatory. The WITH UNIQUE KEYS flag makes these return false when duplicate keys exist in any object within the value, not necessarily directly contained in the outermost object. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
* Move ExecEvalJsonConstructor new function to a more natural placeAlvaro Herrera2023-03-31
| | | | | Commit 7081ac46ace8 put it at the end of the file, but that doesn't look very nice.
* Parallel Hash Full Join.Thomas Munro2023-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | Full and right outer joins were not supported in the initial implementation of Parallel Hash Join because of deadlock hazards (see discussion). Therefore FULL JOIN inhibited parallelism, as the other join strategies can't do that in parallel either. Add a new PHJ phase PHJ_BATCH_SCAN that scans for unmatched tuples on the inner side of one batch's hash table. For now, sidestep the deadlock problem by terminating parallelism there. The last process to arrive at that phase emits the unmatched tuples, while others detach and are free to go and work on other batches, if there are any, but otherwise they finish the join early. That unfairness is considered acceptable for now, because it's better than no parallelism at all. The build and probe phases are run in parallel, and the new scan-for-unmatched phase, while serial, is usually applied to the smaller of the two relations and is either limited by some multiple of work_mem, or it's too big and is partitioned into batches and then the situation is improved by batch-level parallelism. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BA6ftXPz4oe92%2Bx8Er%2BxpGZqto70-Q_ERwRaSyA%3DafNg%40mail.gmail.com
* Fix inconsistencies and style issues in new SQL/JSON codeAlvaro Herrera2023-03-30
| | | | | | Reported by Alexander Lakhin. Discussion: https://postgr.es/m/60483139-5c34-851d-baee-6c0d014e1710@gmail.com
* Fix outdated comments regarding TupleTableSlotsDavid Rowley2023-03-30
| | | | | | | | | | | | The tts_flag is named TTS_FLAG_SHOULDFREE, so use that instead of TTS_SHOULDFREE, which is the name of the macro that checks for that flag. Additionally, 4da597edf got rid of the TupleTableSlot.tts_tuple field but forgot to update a comment which referenced that field. Fix that. Reported-by: Zhen Mingyang <zhenmingyang@yeah.net> Reported-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1a96696c.9d3.187193989c3.Coremail.zhenmingyang@yeah.net
* SQL/JSON: add standard JSON constructor functionsAlvaro Herrera2023-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces the SQL/JSON standard-conforming constructors for JSON types: JSON_ARRAY() JSON_ARRAYAGG() JSON_OBJECT() JSON_OBJECTAGG() Most of the functionality was already present in PostgreSQL-specific functions, but these include some new functionality such as the ability to skip or include NULL values, and to allow duplicate keys or throw error when they are found, as well as the standard specified syntax to specify output type and format. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org
* Simplify useless 0L constantsPeter Eisentraut2023-03-29
| | | | | | | In ancient times, these belonged to arguments or fields that were actually of type long, but now they are not anymore, so this "L" decoration is just confusing. (Some other 0L and other "L" constants remain, where they are actually associated with a long type.)
* Fix oversights in array manipulation.Tom Lane2023-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nested-arrays code path in ExecEvalArrayExpr() used palloc to allocate the result array, whereas every other array-creating function has used palloc0 since 18c0b4ecc. This mostly works, but unused bits past the end of the nulls bitmap may end up undefined. That causes valgrind complaints with -DWRITE_READ_PARSE_PLAN_TREES, and could cause planner misbehavior as cited in 18c0b4ecc. There seems no very good reason why we should strive to avoid palloc0 in just this one case, so fix it the easy way with s/palloc/palloc0/. While looking at that I noted that we also failed to check for overflow of "nbytes" and "nitems" while summing the sizes of the sub-arrays, potentially allowing a crash due to undersized output allocation. For "nbytes", follow the policy used by other array-munging code of checking for overflow after each addition. (As elsewhere, the last addition of the array's overhead space doesn't need an extra check, since palloc itself will catch a value between 1Gb and 2Gb.) For "nitems", there's no very good reason to sum the inputs at all, since we can perfectly well use ArrayGetNItems' result instead of ignoring it. Per discussion of this bug, also remove redundant zeroing of the nulls bitmap in array_set_element and array_set_slice. Patch by Alexander Lakhin and myself, per bug #17858 from Alexander Lakhin; thanks also to Richard Guo. These bugs are a dozen years old, so back-patch to all supported branches. Discussion: https://postgr.es/m/17858-8fd287fd3663d051@postgresql.org
* Add SysCacheGetAttrNotNull for guaranteed not-null attrsDaniel Gustafsson2023-03-25
| | | | | | | | | | | | | When extracting an attr from a cached tuple in the syscache with SysCacheGetAttr the isnull parameter must be checked in case the attr cannot be NULL. For cases when this is known beforehand, a wrapper is introduced which perform the errorhandling internally on behalf of the caller, invoking an elog in case of a NULL attr. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
* Invent GENERIC_PLAN option for EXPLAIN.Tom Lane2023-03-24
| | | | | | | | | | | | | | | | | This provides a very simple way to see the generic plan for a parameterized query. Without this, it's necessary to define a prepared statement and temporarily change plan_cache_mode, which is a bit tedious. One thing that's a bit of a hack perhaps is that we disable execution-time partition pruning when the GENERIC_PLAN option is given. That's because the pruning code may attempt to fetch the value of one of the parameters, which would fail. Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg, Michel Pelletier, Jim Jones, and myself Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
* Ignore generated columns during apply of update/delete.Amit Kapila2023-03-23
| | | | | | | | | | | | We fail to apply updates and deletes when the REPLICA IDENTITY FULL is used for the table having generated columns. We didn't use to ignore generated columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes. Author: Onder Kalaci Reviewed-by: Shi yu, Amit Kapila Backpatch-through: 12 Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
* Improve the naming of Parallel Hash Join phases.Thomas Munro2023-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | * Commit 3048898e dropped -ING from PHJ wait event names. Update the corresponding barrier phases names to match. * Rename the "DONE" phases to "FREE". That's symmetrical with "ALLOCATE", and names the activity that actually happens in that phase (as we do for the other phases) rather than a state. The bug fixed by commit 8d578b9b might have been more obvious with this name. * Rename the batch/bucket growth barriers' "ALLOCATE" phases to "REALLOCATE", a better description of what they do. * Update the high level comments about phases to highlight phases are executed by a single process with an asterisk (mostly memory management phases). No behavior change, as this is just improving internal identifiers. The only user-visible sign of this is that a couple of wait events' display names change from "...Allocate" to "...Reallocate" in pg_stat_activity, to stay in sync with the internal names. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BMDpwF2Eo2LAvzd%3DpOh81wUTsrwU1uAwR-v6OGBB6%2B7g%40mail.gmail.com
* Allow locking updated tuples in tuple_update() and tuple_delete()Alexander Korotkov2023-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, in read committed transaction isolation mode (default), we have the following sequence of actions when tuple_update()/tuple_delete() finds the tuple updated by concurrent transaction. 1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which returns TM_Updated. 2. Lock tuple with tuple_lock(). 3. Re-evaluate plan qual (recheck if we still need to update/delete and calculate the new tuple for update). 4. Second attempt to update/delete tuple with tuple_update()/tuple_delete(). This attempt should be successful, since the tuple was previously locked. This patch eliminates step 2 by taking the lock during first tuple_update()/tuple_delete() call. Heap table access method saves some efforts by checking the updated tuple once instead of twice. Future undo-based table access methods, which will start from the latest row version, can immediately place a lock there. The code in nodeModifyTable.c is simplified by removing the nested switch/case. Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp Reviewed-by: Andres Freund, Chris Travers
* Evade extra table_tuple_fetch_row_version() in ExecUpdate()/ExecDelete()Alexander Korotkov2023-03-23
| | | | | | | | | | | When we lock tuple using table_tuple_lock() then we at the same time fetch the locked tuple to the slot. In this case we can skip extra table_tuple_fetch_row_version() thank to we've already fetched the 'old' tuple and nobody can change it concurrently since it's locked. Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp Reviewed-by: Andres Freund, Chris Travers
* Ignore dropped columns during apply of update/delete.Amit Kapila2023-03-21
| | | | | | | | | | | We fail to apply updates and deletes when the REPLICA IDENTITY FULL is used for the table having dropped columns. We didn't use to ignore dropped columns while doing tuple comparison among the tuples from the publisher and subscriber during apply of updates and deletes. Author: Onder Kalaci, Shi yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CACawEhVQC9WoofunvXg12aXtbqKnEgWxoRx3+v8q32AWYsdpGg@mail.gmail.com
* Fix race in parallel hash join batch cleanup, take II.Thomas Munro2023-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With unlucky timing and parallel_leader_participation=off (not the default), PHJ could attempt to access per-batch shared state just as it was being freed. There was code intended to prevent that by checking for a cleared pointer, but it was racy. Fix, by introducing an extra barrier phase. The new phase PHJ_BUILD_RUNNING means that it's safe to access the per-batch state to find a batch to help with, and PHJ_BUILD_DONE means that it is too late. The last to detach will free the array of per-batch state as before, but now it will also atomically advance the phase, so that late attachers can avoid the hazard. This mirrors the way per-batch hash tables are freed (see phases PHJ_BATCH_PROBING and PHJ_BATCH_DONE). An earlier attempt to fix this (commit 3b8981b6, later reverted) missed one special case. When the inner side is empty (the "empty inner optimization), the build barrier would only make it to PHJ_BUILD_HASHING_INNER phase before workers attempted to detach from the hashtable. In that case, fast-forward the build barrier to PHJ_BUILD_RUNNING before proceeding, so that our later assertions hold and we can still negotiate who is cleaning up. Revealed by build farm failures, where BarrierAttach() failed a sanity check assertion, because the memory had been clobbered by dsa_free(). In non-assert builds, the result could be a segmentation fault. Back-patch to all supported releases. Author: Thomas Munro <thomas.munro@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: David Geier <geidav.pg@gmail.com> Tested-by: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/20200929061142.GA29096%40paquier.xyz
* Ignore BRIN indexes when checking for HOT updatesTomas Vondra2023-03-20
| | | | | | | | | | | | | | | | | | | | | | | | When determining whether an index update may be skipped by using HOT, we can ignore attributes indexed by block summarizing indexes without references to individual tuples that need to be cleaned up. A new type TU_UpdateIndexes provides a signal to the executor to determine which indexes to update - no indexes, all indexes, or only the summarizing indexes. This also removes rd_indexattr list, and replaces it with rd_attrsvalid flag. The list was not used anywhere, and a simple flag is sufficient. This was originally committed as 5753d4ee32, but then got reverted by e3fcca0d0d because of correctness issues. Original patch by Josef Simanek, various fixes and improvements by Tomas Vondra and me. Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra Reviewed-by: Tomas Vondra, Alvaro Herrera Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com