Commit log: root/src/backend/executor
* Don't store intermediate hash values in ExprState->resvalue (David Rowley, 2024-10-17)

  adf97c156 made it so ExprStates could support hashing and changed Hash
  Join to use that instead of manually extracting Datums from tuples and
  hashing them one column at a time.

  When hashing multiple columns or expressions, the code added in that
  commit stored the intermediate hash value in the ExprState's resvalue
  field. That was a mistake, as steps may be injected into the ExprState
  between each hashing step that look at or overwrite the stored
  intermediate hash value. EEOP_PARAM_SET is an example of such a step.

  Here we fix this by adding a new dedicated field for storing
  intermediate hash values and adjusting the code so that all steps apart
  from the final hashing step store their result in the intermediate
  field.

  In passing, rename a variable so that it's more aligned with the
  surrounding code and also so a few lines stay within the 80-char margin.

  Reported-by: Andres Freund
  Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
  Discussion: https://postgr.es/m/CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com
* Further refine _SPI_execute_plan's rule for atomic execution. (Tom Lane, 2024-10-16)

  Commit 2dc1deaea turns out to have been still a brick shy of a load,
  because CALL statements executing within a plpgsql exception block could
  still pass the wrong snapshot to stable functions within the CALL's
  argument list. That happened because standard_ProcessUtility forces
  isAtomicContext to true if IsTransactionBlock is true, which it always
  will be inside a subtransaction. Then ExecuteCallStmt would think it
  does not need to push a new snapshot, but _SPI_execute_plan didn't do so
  either, since it thought it was in nonatomic mode.

  The best fix for this seems to be for _SPI_execute_plan to operate in
  atomic execution mode if IsSubTransaction() is true, even when the SPI
  context as a whole is non-atomic. This makes _SPI_execute_plan have the
  same rules about when non-atomic execution is allowed as
  _SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed,
  which seems appropriately symmetric. (If anyone ever tries to allow
  COMMIT/ROLLBACK inside a subtransaction, this would all need to be
  rethought ... but I'm unconvinced that such a thing could be logically
  consistent at all.)

  For further consistency, also check IsSubTransaction() in
  SPI_inside_nonatomic_context. That does not matter for its one
  present-day caller StartTransaction, which can't be reached inside a
  subtransaction. But if any other callers ever arise, they'd presumably
  want this definition.

  Per bug #18656 from Alexander Alehin. Back-patch to all supported
  branches, like previous fixes in this area.

  Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org
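As an illustration (not part of the commit), a minimal sketch of the affected pattern, with hypothetical object names: a CALL inside a plpgsql exception block (which runs in a subtransaction) whose argument evaluates a stable function.

    CREATE TABLE t (v int);
    INSERT INTO t VALUES (1);

    CREATE FUNCTION f_stable() RETURNS int
    LANGUAGE sql STABLE
    AS $$ SELECT max(v) FROM t $$;

    CREATE PROCEDURE p(n int)
    LANGUAGE plpgsql
    AS $$ BEGIN RAISE NOTICE 'saw %', n; END $$;

    DO $$
    BEGIN
      UPDATE t SET v = 2;
      BEGIN  -- exception block => subtransaction
        CALL p(f_stable());  -- must see v = 2, not a stale snapshot
      EXCEPTION WHEN OTHERS THEN
        RAISE;
      END;
    END $$;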
* Track scan reversals in MergeJoin (Peter Eisentraut, 2024-10-14)

  The MergeJoin struct was tracking "mergeStrategies", which were an array
  of btree strategy numbers, purely for the purpose of comparing it later
  against btree strategies to determine if the scan direction was forward
  or reverse. Change that. Instead, track "mergeReversals", an array of
  bool, to indicate the same without an unfortunate assumption that a
  strategy number refers specifically to a btree strategy.

  Author: Mark Dilger <mark.dilger@enterprisedb.com>
  Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
* Introduce two fields in EState to track parallel worker activity (Michael Paquier, 2024-10-09)

  These fields can be set by executor nodes to record how many parallel
  workers were planned to be launched and how many of them have actually
  been launched within the number initially planned. This data can give an
  approximation of the parallel worker "draught" (shortfall) a system is
  facing, making it easier to tune related configuration parameters.

  These fields will be used by some follow-up patches to populate other
  parts of the system with their data.

  Author: Guillaume Lelarge, Benoit Lobréau
  Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
  Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com
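As an illustration (not part of the commit), EXPLAIN ANALYZE already reports these numbers per Gather node; a sketch for provoking a shortfall, assuming a hypothetical big_table large enough for a parallel plan:

    SET max_parallel_workers_per_gather = 4;
    SET max_parallel_workers = 2;   -- allow fewer launches than planned
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT count(*) FROM big_table;
    -- the Gather node then shows, e.g.:
    --   Workers Planned: 4
    --   Workers Launched: 2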
* Improve style of two code paths (Michael Paquier, 2024-10-08)

  In execGrouping.c, execTuplesMatchPrepare() was doing a memory
  allocation that was not necessary when the number of columns was 0. In
  foreign.c, pg_options_to_table() was assigning the same value to a
  variable twice.

  Author: Ranier Vilela
  Discussion: https://postgr.es/m/CAEudQAqup0agbSzMjSLSTn=OANyCzxENF1+HrSYnr3WyZib7=Q@mail.gmail.com
* Remove assertion checking query ID in execMain.c (Michael Paquier, 2024-10-04)

  This assertion was added by 24f520594809, but Alexander Lakhin has
  proved that the ExecutorRun() one can be broken by using a PL function
  that manipulates compute_query_id and track_activities, while the ones
  in ExecutorFinish() and ExecutorEnd() could be triggered when cleaning
  up portals at the beginning of a new query execution.

  Discussion: https://postgr.es/m/b37d8e6c-e83d-e157-8865-1b2460a6aef2@gmail.com
* Expand assertion check for query ID reporting in executor (Michael Paquier, 2024-10-01)

  As formulated, the assertion added in the executor by 24f520594809 to
  check that a query ID is set had two problems:
  - track_activities may be disabled while compute_query_id is enabled,
    causing the query ID not to be reported to pg_stat_activity.
  - debug_query_string may not be set in some contexts. The only path
    where this would matter is visibly autovacuum, should parallel workers
    be enabled there at some point. This is not the case currently.

  There was no test showing the interactions between the query ID and
  track_activities, so let's add one based on a scan of pg_stat_activity.

  This assertion is still an experiment at this stage, but let's see if
  this shows more paths where query IDs are not properly set while they
  should be.

  Discussion: https://postgr.es/m/Zvn5616oYXmpXyHI@paquier.xyz
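As an illustration (not part of the commit), a minimal check of the interaction the new test is based on:

    SET compute_query_id = on;
    SET track_activities = on;
    SELECT query_id IS NOT NULL AS has_query_id
      FROM pg_stat_activity
     WHERE pid = pg_backend_pid();
    -- with track_activities = off, query_id reads as NULL here
    -- even though compute_query_id is enabled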
* For inplace update durability, make heap_update() callers wait. (Noah Misch, 2024-09-24)

  The previous commit fixed some ways of losing an inplace update. It
  remained possible to lose one when a backend working toward a
  heap_update() copied a tuple into memory just before inplace update of
  that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE
  to govern admission to the steps of copying an old tuple, modifying it,
  and issuing heap_update(). This includes MERGE commands. To avoid
  changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when
  holding a relation lock sufficient to exclude inplace updaters.

  Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE
  pg_class" or "UPDATE pg_database" can still lose an inplace update. The
  v14+ UPDATE fix needs commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35,
  and it wasn't worth reimplementing that fix without such infrastructure.

  Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.

  Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
* Optimize tuplestore usage for WITH RECURSIVE CTEs (David Rowley, 2024-09-19)

  nodeRecursiveunion.c makes use of two tuplestores and, until now, would
  delete and recreate one of these tuplestores after every recursive
  iteration. Here we adjust that behavior and instead reuse one of the
  existing tuplestores and just empty it of all tuples using
  tuplestore_clear().

  This saves some free/malloc roundtrips and has shown a 25-30%
  performance improvement for queries that perform very little work
  between recursive iterations.

  This also paves the way to add some EXPLAIN ANALYZE telemetry output for
  recursive common table expressions, similar to what was done in
  1eff8279d and 95d6e9af0. Previously calling tuplestore_end() would have
  caused the maximum storage space used to be lost.

  Reviewed-by: Tatsuo Ishii
  Discussion: https://postgr.es/m/CAApHDvr9yW0YRiK8A2J7nvyT8g17YzbSfOviEWrghazKZbHbig@mail.gmail.com
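As an illustration (not part of the commit), a query of the shape that benefits: each iteration produces one tuple and does almost no other work, so the per-iteration tuplestore churn dominates.

    WITH RECURSIVE t(n) AS (
        VALUES (1)
      UNION ALL
        SELECT n + 1 FROM t WHERE n < 1000000
    )
    SELECT count(*) FROM t;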
* Add some sanity checks in executor for query ID reporting (Michael Paquier, 2024-09-18)

  This commit adds three sanity checks in code paths of the executor where
  it is possible to use hooks, checking that a query ID is reported in
  pg_stat_activity if compute_query_id is enabled:
  - ExecutorRun()
  - ExecutorFinish()
  - ExecutorEnd()

  This causes the test in pg_stat_statements added in 933848d16dc9 to
  complain immediately in ExecutorRun(). The idea behind this commit is to
  help extensions to detect if they are missing query ID reports when a
  query goes through the executor. Perhaps this will prove to be a bad
  idea, but let's see where this experience goes in v18 and newer
  versions.

  Reviewed-by: Sami Imseih
  Discussion: https://postgr.es/m/ZuJb5xCKHH0A9tMN@paquier.xyz
* Add missing query ID reporting in extended query protocol (Michael Paquier, 2024-09-18)

  This commit adds query ID reports for two code paths when processing
  extended query protocol messages:
  - When receiving a bind message, setting it to the first Query retrieved
    from the plan cache.
  - When receiving an execute message, setting it to the first PlannedStmt
    stored in a portal.

  An advantage of this method is that it is able to cover all the types of
  portals handled in the extended query protocol, particularly these two
  where the report done in ExecutorStart() is not enough (neither is an
  addition in ExecutorRun(), actually, for the second point):
  - Multiple execute messages, with multiple ExecutorRun().
  - Portal with execute/fetch messages, like a query with a RETURNING
    clause and a fetch size that stores the tuples in a first execute
    message going through ExecutorStart() and ExecutorRun(), followed by
    one or more execute messages doing only fetches from the tuplestore
    created in the first message. This corresponds to the case where
    execute_is_fetch is set, for example.

  Note that the query ID reporting done in ExecutorStart() is still
  necessary, as an EXECUTE requires it. Query ID reporting is optimistic
  and more calls to pgstat_report_query_id() don't matter as the first
  report takes priority except if the report is forced. The comment in
  ExecutorStart() is adjusted to better reflect the reality with the
  extended query protocol.

  The test added in pg_stat_statements is a courtesy of Robert Haas. This
  uses psql's \bind metacommand, hence this part is backpatched down to
  v16.

  Reported-by: Kaido Vaikla, Erik Wienhold
  Author: Sami Imseih
  Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
  Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
  Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
  Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
  Backpatch-through: 14
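As an illustration (not part of the commit), psql's \bind metacommand drives the bind/execute messages involved here:

    -- executed via the extended query protocol rather than simple query
    SELECT $1::int + $2::int AS sum \bind 1 2 \g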
* Add temporal PRIMARY KEY and UNIQUE constraints (Peter Eisentraut, 2024-09-17)

  Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints. These
  are backed by GiST indexes instead of B-tree indexes, since they are
  essentially exclusion constraints with = for the scalar parts of the key
  and && for the temporal part.

  (Previously committed as 46a0cd4cefb and later reverted; the new part is
  this:)

  Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
  allowed duplicates, which is confusing to users and breaks internal
  expectations. For instance, when GROUP BY checks functional dependencies
  on the PK, it allows selecting other columns from the table, but in the
  presence of duplicate keys you could get the value from any of their
  rows. So we need to forbid empties.

  This all means that at the moment we can only support ranges and
  multiranges for temporal PK/UQs, unlike the original patch (above).
  Documentation and tests for this are added. But this could conceivably
  be extended by introducing some more general support for the notion of
  "empty" for other types.

  Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
  Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
  Reviewed-by: jian he <jian.universality@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
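As an illustration (not part of the commit), a minimal temporal key; btree_gist supplies the GiST opclass for the scalar part:

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE booking (
        room_id int,
        during  daterange,
        PRIMARY KEY (room_id, during WITHOUT OVERLAPS)
    );

    -- empty ranges are now rejected, since 'empty' && 'empty' is false
    INSERT INTO booking VALUES (1, 'empty');  -- fails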
* SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps (Amit Langote, 2024-09-09)

  When the ON ERROR / ON EMPTY behavior is to return NULL, returning NULL
  directly from ExecEvalJsonExprPath() suffices. Therefore, there's no
  need to create separate steps to check the error/empty flag or those to
  evaluate the constant NULL expression. This speeds up common cases
  because the default ON ERROR / ON EMPTY behavior for JSON_QUERY() and
  JSON_VALUE() is to return NULL. However, these steps are necessary if
  the RETURNING type is a domain, as constraints on the domain may need to
  be checked.

  Reported-by: Jian He <jian.universality@gmail.com>
  Author: Jian He <jian.universality@gmail.com>
  Author: Amit Langote <amitlangote09@gmail.com>
  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
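As an illustration (not part of the commit), the fast path versus the domain case:

    -- default ON EMPTY behavior returns NULL: no extra steps are needed
    SELECT JSON_VALUE('{"a": 1}', '$.b');

    -- with a domain RETURNING type, coercion/check steps are still set up
    CREATE DOMAIN positive_int AS int CHECK (VALUE > 0);
    SELECT JSON_VALUE('{"a": 1}', '$.a' RETURNING positive_int);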
* Revert recent SQL/JSON related commits (Amit Langote, 2024-09-06)

  Reverts 68222851d5a8, 565caaa79af, and 3a97460970f, because a few BF
  animals didn't like one or all of them.
* SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps (Amit Langote, 2024-09-06)

  When the ON ERROR / ON EMPTY behavior is to return NULL, returning NULL
  directly from ExecEvalJsonExprPath() suffices. Therefore, there's no
  need to create separate steps to check the error/empty flag or those to
  evaluate the constant NULL expression. This speeds up common cases
  because the default ON ERROR / ON EMPTY behavior for JSON_QUERY() and
  JSON_VALUE() is to return NULL. However, these steps are necessary if
  the RETURNING type is a domain, as constraints on the domain may need to
  be checked.

  Reported-by: Jian He <jian.universality@gmail.com>
  Author: Jian He <jian.universality@gmail.com>
  Author: Amit Langote <amitlangote09@gmail.com>
  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
* Fix misleading error message context (Peter Eisentraut, 2024-09-05)

  Author: Pavel Stehule <pavel.stehule@gmail.com>
  Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRAw+OkVW=FgMKHKyvY3CgtWy3cWdY7XT+S5TJaTttu=oA@mail.gmail.com
* Optimize WindowAgg's use of tuplestores (David Rowley, 2024-09-05)

  When WindowAgg finished one partition of a PARTITION BY, it previously
  would call tuplestore_end() to purge all the stored tuples before again
  calling tuplestore_begin_heap() and carefully setting up all of the
  tuplestore read pointers exactly as required for the given frameOptions.
  Since the frameOptions don't change between partitions, this part does
  not make much sense. For queries that had very few rows per partition,
  the overhead of this was very large.

  It seems much better to create the tuplestore and the read pointers once
  and simply call tuplestore_clear() at the end of each partition.
  tuplestore_clear() moves all of the read pointers back to the start
  position and deletes all the previously stored tuples. A simple test
  query with 1 million partitions and 1 tuple per partition has been shown
  to run around 40% faster than without this change. The additional effort
  seems to have mostly been spent in malloc/free.

  Making this work required adding a new bool field to WindowAggState,
  which had the unfortunate effect of being the 9th bool field in a group,
  resulting in the struct being enlarged. Here we shuffle the fields
  around a little so that the two bool fields relating to the runcondition
  fit into existing padding. Also, move the "runcondition" field to be
  near those. This frees up enough space with the other bool fields so
  that the newly added one fits into the padding bytes. This was done to
  address a very small but apparent performance regression with queries
  containing a large number of rows per partition.

  Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
  Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
  Discussion: https://postgr.es/m/CAHoyFK9n-QCXKTUWT_xxtXninSMEv%2BgbJN66-y6prM3f4WkEHw%40mail.gmail.com
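As an illustration (not part of the commit), a test query of the shape described, with many tiny partitions:

    -- 1 million partitions, one row each
    SELECT count(*) OVER (PARTITION BY g)
    FROM generate_series(1, 1000000) g;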
* Speedup WindowAgg code by moving uncommon code out-of-line (David Rowley, 2024-09-05)

  The code to calculate the frame offsets is only performed once per scan.
  Moving this code out of line gives a small (around 4-5%) speedup when
  testing with some CPUs. Other tested CPUs are indifferent to the change.

  Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
  Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
  Discussion: https://postgr.es/m/CAApHDvqPgFtwme2Zyf75BpMLwYr2mnUstDyPiP%3DEpudYuQTPPQ%40mail.gmail.com
* Fix typos and grammar in code comments and docs (Michael Paquier, 2024-09-03)

  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
* Log the conflicts while applying changes in logical replication. (Amit Kapila, 2024-08-20)

  This patch provides additional logging information in the following
  conflict scenarios while applying changes:

  insert_exists:  Inserting a row that violates a NOT DEFERRABLE unique
                  constraint.
  update_differ:  Updating a row that was previously modified by another
                  origin.
  update_exists:  The updated row value violates a NOT DEFERRABLE unique
                  constraint.
  update_missing: The tuple to be updated is missing.
  delete_differ:  Deleting a row that was previously modified by another
                  origin.
  delete_missing: The tuple to be deleted is missing.

  For insert_exists and update_exists conflicts, the log can include the
  origin and commit timestamp details of the conflicting key with
  track_commit_timestamp enabled. update_differ and delete_differ
  conflicts can only be detected when track_commit_timestamp is enabled on
  the subscriber.

  We do not offer additional logging for exclusion constraint violations
  because these constraints can specify rules that are more complex than
  simple equality checks. Resolving such conflicts won't be
  straightforward. This area can be further enhanced if required.

  Author: Hou Zhijie
  Reviewed-by: Shveta Malik, Amit Kapila, Nisha Moond, Hayato Kuroda, Dilip Kumar
  Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Speed up Hash Join by making ExprStates support hashing (David Rowley, 2024-08-20)

  Here we add ExprState support for obtaining a 32-bit hash value from a
  list of expressions. This allows both faster hashing and also JIT
  compilation of these expressions. This is especially useful when hash
  joins have multiple join keys, as the previous code called ExecEvalExpr
  on each hash join key individually, and that was inefficient as tuple
  deformation would have only taken into account one key at a time, which
  could lead to walking the tuple once for each join key. With the new
  code, we'll determine the maximum attribute required and deform the
  tuple to that point only once.

  Some performance tests done with this change have shown up to a 20%
  performance increase of a query containing a Hash Join without JIT
  compilation and up to a 26% performance increase when JIT is enabled and
  optimization and inlining were performed by the JIT compiler. The
  performance increase was smaller with a single join column, at around
  14% with and without JIT. This test was done using a fairly small hash
  table and a large number of hash probes. The increase will likely be
  less with large tables, especially ones larger than L3 cache, as memory
  pressure is more likely to be the limiting factor there.

  This commit only addresses Hash Joins, but lays expression evaluation
  and JIT compilation infrastructure for other hashing needs such as Hash
  Aggregate.

  Author: David Rowley
  Reviewed-by: Alexey Dvoichenkov <alexey@hyperplane.net>
  Reviewed-by: Tels <nospam-pg-abuse@bloodgate.com>
  Discussion: https://postgr.es/m/CAApHDvoexAxgQFNQD_GRkr2O_eJUD1-wUGm%3Dm0L%2BGc%3DT%3DkEa4g%40mail.gmail.com
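As an illustration (not part of the commit), a multi-key hash join of the kind that benefits most; t1 and t2 are hypothetical:

    EXPLAIN (COSTS OFF)
    SELECT *
    FROM t1
    JOIN t2 ON t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c;
    -- all three keys are now hashed by one ExprState, deforming each
    -- tuple just once up to the highest attribute needed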
* Fix a series of typos and outdated references (David Rowley, 2024-08-12)

  Author: Alexander Lakhin <exclusion@gmail.com>
  Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
* Fix edge case in plpgsql's make_callstmt_target(). (Tom Lane, 2024-08-07)

  If the plancache entry for the CALL statement is already stale, it's
  possible for us to fetch an old procedure OID out of it, and then fail
  with "cache lookup failed for function NNN". In ordinary usage this
  never happens because make_callstmt_target is called just once
  immediately after building the plancache entry. It can be forced however
  by setting up an erroneous CALL (that causes make_callstmt_target itself
  to report an error), then dropping/recreating the target procedure, then
  repeating the erroneous CALL.

  To fix, use SPI_plan_get_cached_plan() to fetch the plancache's plan,
  rather than assuming we can use SPI_plan_get_plan_sources(). This
  shouldn't add any noticeable overhead in the normal case, and in the
  stale-plan case we'd have had to replan anyway a little further down.

  The other callers of SPI_plan_get_plan_sources() seem OK, because either
  they don't need up-to-date plans or they know that the query was just
  (re)planned. But add some commentary in hopes of not falling into this
  trap again.

  Per bug #18574 from Song Hongyu. Back-patch to v14 where this coding was
  introduced. (Older branches have comparable code, but it's run after any
  required replanning, so there's no issue.)

  Discussion: https://postgr.es/m/18574-2ce7ba3249221389@postgresql.org
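As an illustration (not part of the commit), a hedged reconstruction of the described repro; the particular error used (a non-writable INOUT argument) is an assumption, not taken from the bug report:

    CREATE PROCEDURE p(INOUT x int)
    LANGUAGE plpgsql AS $$ BEGIN x := x + 1; END $$;

    CREATE FUNCTION bad_call() RETURNS void
    LANGUAGE plpgsql AS $$
    BEGIN
      CALL p(1);  -- erroneous: INOUT argument is not writable
    END $$;

    SELECT bad_call();   -- reports the expected plpgsql error
    DROP PROCEDURE p(int);
    CREATE PROCEDURE p(INOUT x int)
    LANGUAGE plpgsql AS $$ BEGIN x := x + 1; END $$;
    SELECT bad_call();   -- formerly: "cache lookup failed for function NNN"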
* Refactor/reword some error messages to avoid duplicates (Alvaro Herrera, 2024-08-07)

  Also, remove brackets around "EMPTY [ ARRAY ]". An error message is not
  the place to state that a keyword is optional.

  Backpatch to 17.
* Evaluate arguments of correlated SubPlans in the referencing ExprState (Andres Freund, 2024-07-31)

  Until now we generated an ExprState for each parameter to a SubPlan and
  evaluated them one-by-one in ExecScanSubPlan. That's sub-optimal as
  creating lots of small ExprStates
  a) makes JIT compilation more expensive
  b) wastes memory
  c) is a bit slower to execute

  This commit arranges to evaluate parameters to a SubPlan as part of the
  ExprState referencing a SubPlan, using the new EEOP_PARAM_SET expression
  step. We emit one EEOP_PARAM_SET for each argument to a subplan, just
  before the EEOP_SUBPLAN step.

  It likely is worth using EEOP_PARAM_SET in other places as well, e.g.
  for SubPlan outputs, nestloop parameters and, more ambitiously, to get
  rid of ExprContext->domainValue/caseValue/ecxt_agg*. But that's for
  later.

  Author: Andres Freund <andres@anarazel.de>
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
  Discussion: https://postgr.es/m/20230225214401.346ancgjqc3zmvek@awork3.anarazel.de
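As an illustration (not part of the commit), a correlated SubPlan whose per-row argument is now set via an EEOP_PARAM_SET step in the referencing ExprState; t1 and t2 are hypothetical:

    SELECT t1.a,
           (SELECT max(t2.b) FROM t2 WHERE t2.a = t1.a) AS max_b
    FROM t1;
    -- t1.a is passed to the subplan as a parameter for each outer row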
* SQL/JSON: Fix casting for integer EXISTS columns in JSON_TABLE (Amit Langote, 2024-07-30)

  The current method of coercing the boolean result value of
  JsonPathExists() to the target type specified for an EXISTS column,
  which is to call the type's input function via json_populate_type(),
  leads to an error when the target type is integer, because the integer
  input function doesn't recognize boolean literal values as valid.

  Instead use the boolean-to-integer cast function for coercion in that
  case so that using integer or domains thereof as the type for EXISTS
  columns works. Note that coercion for ON ERROR values TRUE and FALSE
  already works like that because the parser creates a cast expression
  including the cast function, but the coercion of the actual result value
  is not handled by the parser.

  Tests by Jian He.

  Reported-by: Jian He <jian.universality@gmail.com>
  Author: Jian He <jian.universality@gmail.com>
  Author: Amit Langote <amitlangote09@gmail.com>
  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
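As an illustration (not part of the commit), an integer EXISTS column that previously failed:

    SELECT *
    FROM JSON_TABLE(
           '{"a": 1}', '$'
           COLUMNS (a_exists int EXISTS PATH '$.a')
         ) jt;
    -- the boolean result is now cast to integer, yielding 1 (or 0)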
* SQL/JSON: Remove useless code in ExecInitJsonExpr() (Amit Langote, 2024-07-26)

  The code was for adding an unconditional JUMP to the next step, which is
  unnecessary processing.

  Reported-by: Jian He <jian.universality@gmail.com>
  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
* SQL/JSON: Improve error-handling of JsonBehavior expressions (Amit Langote, 2024-07-26)

  Instead of returning NULL when the JsonBehavior expression value could
  not be coerced to the RETURNING type, throw an error informing the user
  that it is the JsonBehavior expression that caused the error, with the
  actual coercion error message shown in its DETAIL line.

  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
* SQL/JSON: Fix error-handling of some JsonBehavior expressions (Amit Langote, 2024-07-26)

  To ensure that the errors of executing a JsonBehavior expression that is
  coerced in the parser are caught instead of being thrown directly, pass
  ErrorSaveContext to ExecInitExprRec() when initializing it. Also, add an
  EEOP_JSONEXPR_COERCION_FINISH step to handle the errors that are caught
  that way.

  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17
* Propagate query IDs of utility statements in functions (Michael Paquier, 2024-07-19)

  For utility statements defined within a function, the query tree is
  copied to a PlannedStmt as utility commands do not require planning.
  However, the query ID was missing from the information passed down. This
  prevented plugins that rely on the query ID, like pg_stat_statements,
  from tracking utility statements within function calls. Tests are added
  to check this behavior, depending on pg_stat_statements.track.

  This is an old bug. Now, query IDs for utilities are compiled using
  their parsed trees rather than the query string since v16
  (3db72ebcbe20), leading to less bloat with utilities, so backpatch down
  only to this version.

  Author: Anthonin Bonnefoy
  Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com
  Backpatch-through: 16
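As an illustration (not part of the commit), a utility statement inside a function, visible to pg_stat_statements once pg_stat_statements.track = 'all' (extension setup assumed):

    CREATE FUNCTION set_mem() RETURNS void
    LANGUAGE plpgsql AS $$
    BEGIN
      SET LOCAL work_mem = '4MB';  -- a utility statement inside a function
    END $$;

    SELECT set_mem();
    SELECT query, calls FROM pg_stat_statements WHERE query LIKE 'SET%';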
* Fix new assertion for MERGE view_name ... DO NOTHING. (Noah Misch, 2024-07-13)

  Such queries don't expand automatically updatable views, and ModifyTable
  uses the wholerow attribute unconditionally. The user-visible behavior
  is fine, so change to more-specific assertions. Commit
  d5f788b41dc2cbdde6e7694c70dda54d829a5ed5 added the wrong assertion.
  Back-patch to v17, where commit
  5f2e179bd31e5f5803005101eb12a8d7bf8db8f3 introduced MERGE view_name.

  Reported by Alexander Lakhin.

  Discussion: https://postgr.es/m/e4b40a88-c134-6926-3196-bc4501cb87a2@gmail.com
* Show Parallel Bitmap Heap Scan worker stats in EXPLAIN ANALYZE (David Rowley, 2024-07-09)

  Nodes like Memoize report the cache stats for each parallel worker, so
  it makes sense to show the exact and lossy pages in Parallel Bitmap Heap
  Scan in a similar way. Likewise, Sort shows the method and memory used
  for each worker.

  There was some discussion on whether the leader stats should include the
  totals for each parallel worker or not. I did some analysis on this to
  see what other parallel node types do and it seems only Parallel Hash
  does anything like this. All the rest, per what's supported by
  ExecParallelRetrieveInstrumentation(), are consistent with each other.

  Author: David Geier <geidav.pg@gmail.com>
  Author: Heikki Linnakangas <hlinnaka@iki.fi>
  Author: Donghang Lin <donghanglin@gmail.com>
  Author: Alena Rybakina <lena.ribackina@yandex.ru>
  Author: David Rowley <dgrowleyml@gmail.com>
  Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
  Reviewed-by: Michael Christofides <michael@pgmustard.com>
  Reviewed-by: Robert Haas <robertmhaas@gmail.com>
  Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
  Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
  Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
  Reviewed-by: Donghang Lin <donghanglin@gmail.com>
  Reviewed-by: Masahiro Ikeda <Masahiro.Ikeda@nttdata.com>
  Discussion: https://postgr.es/m/b3d80961-c2e5-38cc-6a32-61886cdf766d%40gmail.com
* Fix right-anti-joins when the inner relation is proven unique (Richard Guo, 2024-07-08)

  For an inner_unique join, we always assume that the executor will stop
  scanning for matches after the first match. Therefore, for a mergejoin
  that is inner_unique and whose mergeclauses are sufficient to identify a
  match, we set the skip_mark_restore flag to true, indicating that the
  executor need not do mark/restore calls. However, merge-right-anti-join
  did not get this memo and continued scanning the inner side for matches
  after the first match. If there are duplicates in the outer scan, we may
  incorrectly skip matching some inner tuples, which can lead to wrong
  results.

  Here we fix this issue by ensuring that merge-right-anti-join also
  advances to the next outer tuple after the first match in inner_unique
  cases. This also saves cycles by avoiding unnecessary scanning of inner
  tuples after the first match.

  Although hash-right-anti-join does not suffer from this wrong-results
  issue, we apply the same change to it as well, to help save cycles for
  the same reason.

  Per bug #18522 from Antti Lampinen, and bug #18526 from Feliphe Pozzer.
  Back-patch to v16 where right-anti-join was introduced.

  Author: Richard Guo
  Discussion: https://postgr.es/m/18522-c7a8956126afdfd0@postgresql.org
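As an illustration (not part of the commit), an anti-join query shape that the planner may execute as a Merge Right Anti Join or Hash Right Anti Join; t1 and t2 are hypothetical:

    EXPLAIN (COSTS OFF)
    SELECT *
    FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.a = t1.a);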
* Support "Right Semi Join" plan shapesRichard Guo2024-07-05
| | | | | | | | | | | | | | | | | | | | | | | Hash joins can support semijoin with the LHS input on the right, using the existing logic for inner join, combined with the assurance that only the first match for each inner tuple is considered, which can be achieved by leveraging the HEAP_TUPLE_HAS_MATCH flag. This can be very useful in some cases since we may now have the option to hash the smaller table instead of the larger. Merge join could likely support "Right Semi Join" too. However, the benefit of swapping inputs tends to be small here, so we do not address that in this patch. Note that this patch also modifies a test query in join.sql to ensure it continues testing as intended. With this patch the original query would result in a right-semi-join rather than semi-join, compromising its original purpose of testing the fix for neqjoinsel's behavior for semi-joins. Author: Richard Guo Reviewed-by: wenhui qiu, Alena Rybakina, Japin Li Discussion: https://postgr.es/m/CAMbWs4_X1mN=ic+SxcyymUqFx9bB8pqSLTGJ-F=MHy4PW3eRXw@mail.gmail.com
* Use TupleDescAttr macro consistently (David Rowley, 2024-07-02)

  A few places were directly accessing the attrs[] array. This goes
  against the standards set by 2cd708452. Fix that.

  Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
* SQL/JSON: Always coerce JsonExpr result at runtime (Amit Langote, 2024-06-28)

  Instead of looking up casts at parse time for converting the result of
  JsonPath* query functions to the specified or the default RETURNING
  type, always perform the conversion at runtime using either the target
  type's input function or the function json_populate_type().

  There are two motivations for this change:

  1. json_populate_type() coerces to types with typmod such that any
     string values that exceed the length limit cause an error instead of
     silent truncation, which is necessary to be standard-conforming.

  2. It was possible to end up with a cast expression that doesn't support
     soft handling of errors, causing bugs in the handling of the ON ERROR
     clause.

  JsonExpr.coercion_expr, which would store the cast expression, is no
  longer necessary, so remove it.

  Bump catversion because stored rules change due to the above removal.

  Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
  Reviewed-by: Jian He <jian.universality@gmail.com>
  Discussion: https://postgr.es/m/202405271326.5a5rprki64aw%40alvherre.pgsql
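As an illustration (not part of the commit), the typmod point in motivation 1; the exact error behavior here is an assumption:

    -- the result no longer gets silently truncated to fit varchar(2);
    -- with ERROR ON ERROR the length violation is raised
    SELECT JSON_VALUE('"abc"', '$' RETURNING varchar(2) ERROR ON ERROR);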
* Expand comments and add an assertion in nodeModifyTable.c. (Noah Misch, 2024-06-27)

  Most comments concern RELKIND_VIEW. One addresses the ExecUpdate()
  "tupleid" parameter. A later commit will rely on these facts, but they
  hold already. Back-patch to v12 (all supported versions), the plan for
  that commit.

  Reviewed (in an earlier version) by Robert Haas.

  Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
* Fix thinkos in comments (Alvaro Herrera, 2024-06-27)

  The first one was noticed by Tender Wang and introduced with
  8aba9322511f; the other one was newly introduced with dbca3469ebf8.
* Fix partition pruning setup during DETACH CONCURRENTLY (Alvaro Herrera, 2024-06-26)

  When detaching a partition in concurrent mode, it's possible for
  partition descriptors to not match the set that was recently seen when
  the plan was made, causing an assertion failure or (in production
  builds) failure to construct a working plan. The case that was reported
  involves prepared statements, but I think it may be possible to hit this
  bug without that too.

  The problem is that CreatePartitionPruneState is constructing a
  PartitionPruneState under the assumption that new partitions can be
  added, but never removed, but it turns out that this isn't true: a
  prepared statement gets replanned when the DETACH CONCURRENTLY session
  sends out its invalidation message, but if the invalidation message
  arrives after ExecInitAppend started, we would build a partition
  descriptor without the partition, and then CreatePartitionPruneState
  would refuse to work with it.

  CreatePartitionPruneState already contains code to deal with the new
  descriptor having more partitions than before (and behaving for the
  extra partitions as if they had been pruned), but doesn't have code to
  deal with fewer partitions than before, and it is naïve about the case
  where the number of partitions is the same. We could simply add a new
  stanza for fewer partitions than before, and in simple testing it works
  to do that; but it's possible to press the test scripts even further and
  hit the case where one partition is added and a partition is removed
  quickly enough that we see the same number of partitions, but they don't
  actually match, causing hangs during execution.

  To cope with both these problems, we now memcmp() the arrays of
  partition OIDs, and do a more elaborate mapping (relying on the fact
  that both OID arrays are in partition-bounds order) if they're not
  identical.

  This fix was already pushed in backbranches earlier.

  Reported-by: yajun Hu <1026592243@qq.com>
  Reviewed-by: Tender Wang <tndrwang@gmail.com>
  Discussion: https://postgr.es/m/18377-e0324601cfebdfe5@postgresql.org
* Revert "Fix partition pruning setup during DETACH CONCURRENTLY"Alvaro Herrera2024-06-24
| | | | | | | This reverts commit 27162a64b386; this branch is in code freeze due to a nearing release. We can commit again after the release is out. Discussion: https://postgr.es/m/1158256.1719239648@sss.pgh.pa.us
* Fix partition pruning setup during DETACH CONCURRENTLY (Alvaro Herrera, 2024-06-24)

  When detaching a partition in concurrent mode, it's possible for
  partition descriptors to not match the set that was recently seen when
  the plan was made, causing an assertion failure or (in production
  builds) failure to construct a working plan. The case that was reported
  involves prepared statements, but I think it may be possible to hit this
  bug without that too.

  The problem is that CreatePartitionPruneState is constructing a
  PartitionPruneState under the assumption that new partitions can be
  added, but never removed, but it turns out that this isn't true: a
  prepared statement gets replanned when the DETACH CONCURRENTLY session
  sends out its invalidation message, but if the invalidation message
  arrives after ExecInitAppend started, we would build a partition
  descriptor without the partition, and then CreatePartitionPruneState
  would refuse to work with it.

  CreatePartitionPruneState already contains code to deal with the new
  descriptor having more partitions than before (and behaving for the
  extra partitions as if they had been pruned), but doesn't have code to
  deal with fewer partitions than before, and it is naïve about the case
  where the number of partitions is the same. We could simply add a new
  stanza for fewer partitions than before, and in simple testing it works
  to do that; but it's possible to press the test scripts even further and
  hit the case where one partition is added and a partition is removed
  quickly enough that we see the same number of partitions, but they don't
  actually match, causing hangs during execution.

  To cope with both these problems, we now memcmp() the arrays of
  partition OIDs, and do a more elaborate mapping (relying on the fact
  that both OID arrays are in partition-bounds order) if they're not
  identical.

  Backpatch to 14, where DETACH CONCURRENTLY appeared.

  Reported-by: yajun Hu <1026592243@qq.com>
  Reviewed-by: Tender Wang <tndrwang@gmail.com>
  Discussion: https://postgr.es/m/18377-e0324601cfebdfe5@postgresql.org
* SQL/JSON: Correct jsonpath variable name matching (Amit Langote, 2024-06-19)

  Previously, GetJsonPathVar() allowed a jsonpath expression to reference
  any prefix of a PASSING variable's name. For example, the following
  query would incorrectly work:

    SELECT JSON_QUERY(context_item, jsonpath '$xy' PASSING val AS xyz);

  The fix ensures that the length of the variable name mentioned in a
  jsonpath expression exactly matches the length of the PASSING variable's
  name before comparing the strings using strncmp().

  Reported-by: Alvaro Herrera (off-list)
  Discussion: https://postgr.es/m/CA+HiwqFGkLWMvELBH6E4SQ45qUHthgcRH6gCJL20OsYDRtFx_w@mail.gmail.com
* Fix behavior of stable functions called from a CALL's argument list. (Tom Lane, 2024-06-07)

  If the CALL is within an atomic context (e.g. there's an outer
  transaction block), _SPI_execute_plan should acquire a fresh snapshot to
  execute any such functions with. We failed to do that and instead passed
  them the Portal snapshot, which had been acquired at the start of the
  current SQL command. This'd lead to seeing stale values of rows modified
  since the start of the command.

  This is arguably a bug in 84f5c2908: I failed to see that "are we in
  non-atomic mode" needs to be defined the same way as it is further down
  in _SPI_execute_plan, i.e. check !_SPI_current->atomic not just
  options->allow_nonatomic.

  Alternatively the blame could be laid on plpgsql, which is
  unconditionally passing allow_nonatomic = true for CALL/DO even when it
  knows it's in an atomic context. However, fixing it in spi.c seems like
  a better idea since that will also fix the problem for any extensions
  that may have copied plpgsql's coding pattern.

  While here, update an obsolete comment about _SPI_execute_plan's
  snapshot management.

  Per report from Victor Yegorov. Back-patch to all supported versions.

  Discussion: https://postgr.es/m/CAGnEboiRe+fG2QxuBO2390F7P8e2MQ6UyBjZSL_w1Cej+E4=Vw@mail.gmail.com
* Revert temporal primary keys and foreign keys (Peter Eisentraut, 2024-05-16)

  This feature set did not handle empty ranges correctly, and it's now too
  late for PostgreSQL 17 to fix it.

  The following commits are reverted:

    6db4598fcb8 Add stratnum GiST support function
    46a0cd4cefb Add temporal PRIMARY KEY and UNIQUE constraints
    86232a49a43 Fix comment on gist_stratnum_btree
    030e10ff1a3 Rename pg_constraint.conwithoutoverlaps to conperiod
    a88c800deb6 Use daterange and YMD in without_overlaps tests instead of tsrange.
    5577a71fb0c Use half-open interval notation in without_overlaps tests
    34768ee3616 Add temporal FOREIGN KEY contraints
    482e108cd38 Add test for REPLICA IDENTITY with a temporal key
    c3db1f30cba doc: clarify PERIOD and WITHOUT OVERLAPS in CREATE TABLE
    144c2ce0cc7 Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes

  Discussion: https://www.postgresql.org/message-id/d0b64a7a-dfe4-4b84-a906-c7dedfa40a3e@eisentraut.org
* Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes (Peter Eisentraut, 2024-05-10)

  A PRIMARY KEY or UNIQUE constraint with WITHOUT OVERLAPS will be a GiST
  index, not a B-Tree, but it will still have indisunique set. The code
  for ON CONFLICT fails if it sees a non-btree index that has indisunique.
  This commit fixes that and adds some tests. But now that we can't just
  test indisunique, we also need some extra checks to prevent DO UPDATE
  from running against a WITHOUT OVERLAPS constraint (because the conflict
  could happen against more than one row, and we'd only update one).

  Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
  Discussion: https://www.postgresql.org/message-id/1426589a-83cb-4a89-bf40-713970c07e63@illuminatedcomputing.com
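As an illustration (not part of the commit), against a temporal key like the booking table sketched earlier; the exact rejection behavior for DO UPDATE is an assumption:

    INSERT INTO booking VALUES (1, '[2024-01-01,2024-01-10)')
    ON CONFLICT DO NOTHING;   -- now works on the GiST-backed key

    INSERT INTO booking VALUES (1, '[2024-01-05,2024-01-15)')
    ON CONFLICT (room_id, during) DO UPDATE SET during = EXCLUDED.during;
    -- rejected: DO UPDATE cannot target a WITHOUT OVERLAPS constraint,
    -- since the conflict could involve more than one existing row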
* Fix an assortment of typos (David Rowley, 2024-05-04)

  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
* Ensure we allocate NAMEDATALEN bytes for names in Index Only Scans (David Rowley, 2024-05-01)

  As an optimization, we store "name" columns as cstrings in btree
  indexes. Here we modify it so that Index Only Scans convert these
  cstrings back to names with NAMEDATALEN bytes rather than storing the
  cstring in the tuple slot, as was happening previously.

  Bug: #17855
  Reported-by: Alexander Lakhin
  Reviewed-by: Alexander Lakhin, Tom Lane
  Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org
  Backpatch-through: 12, all supported versions
* Remove some unnecessary fields from executor nodes. (Tom Lane, 2024-04-23)

  JsonExprState.input_finfo is only assigned to, never read, and it's
  really fairly useless since the value can be gotten out of the adjacent
  input_fcinfo field. Let's remove it before someone starts to depend on
  it.

  While here, also remove TidScanState.tss_htup and AggState.combinedproj,
  which are referenced nowhere. Those should have been removed by the
  commits that caused them to become disused, but were not.

  I don't think a catversion bump is necessary here, since plan trees are
  never stored on disk.

  Matthias van de Meent

  Discussion: https://postgr.es/m/CAEze2WjsY4d0TBymLNGK4zpttUcg_YZaTjyWz2VfDUV6YH8wXQ@mail.gmail.com
* Fix typos and duplicate words (Daniel Gustafsson, 2024-04-18)

  This fixes various typos, duplicated words, and tiny bits of whitespace
  mainly in code comments but also in docs.

  Author: Daniel Gustafsson <daniel@yesql.se>
  Author: Heikki Linnakangas <hlinnaka@iki.fi>
  Author: Alexander Lakhin <exclusion@gmail.com>
  Author: David Rowley <dgrowleyml@gmail.com>
  Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
  Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
* SQL/JSON: Improve some error messages (Amit Langote, 2024-04-18)

  This improves some error messages emitted by SQL/JSON query functions by
  mentioning the column name when available, such as when they are invoked
  as part of evaluating JSON_TABLE() columns. To do so, a new field
  column_name is added to both JsonFuncExpr and JsonExpr that is only
  populated when creating those nodes for transformed JSON_TABLE()
  columns.

  While at it, relevant error messages are reworded for clarity.

  Reported-by: Jian He <jian.universality@gmail.com>
  Suggested-by: Jian He <jian.universality@gmail.com>
  Discussion: https://postgr.es/m/CACJufxG_e0QLCgaELrr2ZNz7AxPeGCNKAORe3fHtFCQLsH4J4Q@mail.gmail.com