aboutsummaryrefslogtreecommitdiff
path: root/src/include/parser/parse_node.h
Commit message (Collapse)AuthorAge
* Revert support for improved tracking of nested queriesMichael Paquier2025-06-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit reverts the two following commits: - 499edb09741b, track more precisely query locations for nested statements. - 06450c7b8c70, a follow-up fix of 499edb09741b with query locations. The test introduced in this commit is not reverted. This is proving useful to track a problem that only pgaudit was able to detect. These prove to have issues with the tracking of SELECT statements, when these use multiple parenthesis which is something supported by the grammar. Incorrect location and lengths are causing pg_stat_statements to become confused, failing its job in query normalization with potential out-of-bound writes because the location and the length may not match with what can be handled. A lot of the query patterns discussed when this issue was reported have no test coverage in the main regression test suite, or the recovery test 027_stream_regress.pl would have caught the problems as pg_stat_statements is loaded by the node running the regression tests. A first step would be to improve the test coverage to stress more the query normalization logic. A different portion of this work was done in 45e0ba30fc40, with the addition of tests for nested queries. These can be left in the tree. They are useful to track the way inner queries are currently tracked by PGSS with non-top-level entries, and will be useful when reconsidering in the future the work reverted here. Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru> Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
* Add OLD/NEW support to RETURNING in DML queries.Dean Rasheed2025-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries to explicitly return old and new values by using the special aliases "old" and "new", which are automatically added to the query (if not already defined) while parsing its RETURNING list, allowing things like: RETURNING old.colname, new.colname, ... RETURNING old.*, new.* Additionally, a new syntax is supported, allowing the names "old" and "new" to be changed to user-supplied alias names, e.g.: RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ... This is useful when the names "old" and "new" are already defined, such as inside trigger functions, allowing backwards compatibility to be maintained -- the interpretation of any existing queries that happen to already refer to relations called "old" or "new", or use those as aliases for other relations, is not changed. For an INSERT, old values will generally be NULL, and for a DELETE, new values will generally be NULL, but that may change for an INSERT with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule changes the command type. Therefore, we put no restrictions on the use of old and new in any DML queries. Dean Rasheed, reviewed by Jian He and Jeff Davis. Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Track more precisely query locations for nested statementsMichael Paquier2024-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, a Query generated through the transform phase would have unset stmt_location, tracking the starting point of a query string. Extensions relying on the statement location to extract its relevant parts in the source text string would fallback to use the whole statement instead, leading to confusing results like in pg_stat_statements for queries relying on nested queries, like: - EXPLAIN, with top-level and nested query using the same query string, and a query ID coming from the nested query when the non-top-level entry. - Multi-statements, with only partial portions of queries being normalized. - COPY TO with a query, SELECT or DMLs. This patch improves things by keeping track of the statement locations and propagate it to Query during transform, allowing PGSS to only show the relevant part of the query for nested query. This leads to less bloat in entries for non-top-level entries, as queries can now be grouped within the same (toplevel, queryid) duos in pg_stat_statements. The result gives a stricter one-one mapping between query IDs and its query strings. The regression tests introduced in 45e0ba30fc40 produce differences reflecting the new logic. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
* Introduce an RTE for the grouping stepRichard Guo2024-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are subqueries in the grouping expressions, each of these subqueries in the targetlist and HAVING clause is expanded into distinct SubPlan nodes. As a result, only one of these SubPlan nodes would be converted to reference to the grouping key column output by the Agg node; others would have to get evaluated afresh. This is not efficient, and with grouping sets this can cause wrong results issues in cases where they should go to NULL because they are from the wrong grouping set. Furthermore, during re-evaluation, these SubPlan nodes might use nulled column values from grouping sets, which is not correct. This issue is not limited to subqueries. For other types of expressions that are part of grouping items, if they are transformed into another form during preprocessing, they may fail to match lower target items. This can also lead to wrong results with grouping sets. To fix this issue, we introduce a new kind of RTE representing the output of the grouping step, with columns that are the Vars or expressions being grouped on. In the parser, we replace the grouping expressions in the targetlist and HAVING clause with Vars referencing this new RTE, so that the output of the parser directly expresses the semantic requirement that the grouping expressions be gotten from the grouping output rather than computed some other way. In the planner, we first preprocess all the columns of this new RTE and then replace any Vars in the targetlist and HAVING clause that reference this new RTE with the underlying grouping expressions, so that we will have only one instance of a SubPlan node for each subquery contained in the grouping expressions. Bump catversion because this changes the querytree produced by the parser. Thanks to Tom Lane for the idea to invent a new kind of RTE. Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from various threads. Author: Richard Guo Reviewed-by: Ashutosh Bapat, Sutou Kouhei Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
* Add RETURNING support to MERGE.Dean Rasheed2024-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | This allows a RETURNING clause to be appended to a MERGE query, to return values based on each row inserted, updated, or deleted. As with plain INSERT, UPDATE, and DELETE commands, the returned values are based on the new contents of the target table for INSERT and UPDATE actions, and on its old contents for DELETE actions. Values from the source relation may also be returned. As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be used as the source relation for other operations such as WITH queries and COPY commands. Additionally, a special function merge_action() is provided, which returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action executed for each row. The merge_action() function can be used anywhere in the RETURNING list, including in arbitrary expressions and subqueries, but it is an error to use it anywhere outside of a MERGE query's RETURNING list. Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera, Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut, and Wolfgang Walther. Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Make Vars be outer-join-aware.Tom Lane2023-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They *must* follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Rework query relation permission checkingAlvaro Herrera2022-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
* Make subquery aliases optional in the FROM clause.Dean Rasheed2022-07-20
| | | | | | | | | | | | | | This allows aliases for sub-SELECTs and VALUES clauses in the FROM clause to be omitted. This is an extension of the SQL standard, supported by some other database systems, and so eases the transition from such systems, as well as removing the minor inconvenience caused by requiring these aliases. Patch by me, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUCGCf82=hxd9N5n6xGHPyYpQnxW8HneeH+uP7yNALkWA@mail.gmail.com
* Add support for MERGE SQL commandAlvaro Herrera2022-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Remove Value node structPeter Eisentraut2021-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Value node struct is a weird construct. It is its own node type, but most of the time, it actually has a node type of Integer, Float, String, or BitString. As a consequence, the struct name and the node type don't match most of the time, and so it has to be treated specially a lot. There doesn't seem to be any value in the special construct. There is very little code that wants to accept all Value variants but nothing else (and even if it did, this doesn't provide any convenient way to check it), and most code wants either just one particular node type (usually String), or it accepts a broader set of node types besides just Value. This change removes the Value struct and node type and replaces them by separate Integer, Float, String, and BitString node types that are proper node types and structs of their own and behave mostly like normal node types. Also, this removes the T_Null node tag, which was previously also a possible variant of Value but wasn't actually used outside of the Value contained in A_Const. Replace that by an isnull field in A_Const. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
* Allow an alias to be attached to a JOIN ... USINGPeter Eisentraut2021-03-31
| | | | | | | | | | | | | | | This allows something like SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x where x has the columns a, b, c and unlike a regular alias it does not hide the range variables of the tables being joined t1 and t2. Per SQL:2016 feature F404 "Range variable for common column names". Reviewed-by: Vik Fearing <vik.fearing@2ndquadrant.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-2564428062af@2ndquadrant.com
* Add p_names field to ParseNamespaceItemPeter Eisentraut2021-03-31
| | | | | | | | | | | | ParseNamespaceItem had a wired-in assumption that p_rte->eref describes the table and column aliases exposed by the nsitem. This relaxes this by creating a separate p_names field in an nsitem. This is mainly preparation for a patch for JOIN USING aliases, but it saves one indirection in common code paths, so it's possibly a win on its own. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/785329.1616455091@sss.pgh.pa.us
* Extended statistics on expressionsTomas Vondra2021-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
* SEARCH and CYCLE clausesPeter Eisentraut2021-02-01
| | | | | | | | | | | | This adds the SQL standard feature that adds the SEARCH and CYCLE clauses to recursive queries to be able to do produce breadth- or depth-first search orders and detect cycles. These clauses can be rewritten into queries using existing syntax, and that is what this patch does in the rewriter. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Support subscripting of arbitrary types, not only arrays.Tom Lane2020-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
* Go back to returning int from ereport auxiliary functions.Tom Lane2020-03-25
| | | | | | | | | | | | | | | | | | | This reverts the parts of commit 17a28b03645e27d73bf69a95d7569b61e58f06eb that changed ereport's auxiliary functions from returning dummy integer values to returning void. It turns out that a minority of compilers complain (not entirely unreasonably) about constructs such as (condition) ? errdetail(...) : 0 if errdetail() returns void rather than int. We could update those call sites to say "(void) 0" perhaps, but the expectation for this patch set was that ereport callers would not have to change anything. And this aspect of the patch set was already the most invasive and least compelling part of it, so let's just drop it. Per buildfarm. Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
* Improve the internal implementation of ereport().Tom Lane2020-03-24
| | | | | | | | | | | | | | | | | | | | | | Change all the auxiliary error-reporting routines to return void, now that we no longer need to pretend they are passing something useful to errfinish(). While this probably doesn't save anything significant at the machine-code level, it allows detection of some additional types of mistakes. Pass the error location details (__FILE__, __LINE__, PG_FUNCNAME_MACRO) to errfinish not errstart. This shaves a few cycles off the case where errstart decides we're not going to emit anything. Re-implement elog() as a trivial wrapper around ereport(), removing the separate support infrastructure it used to have. Aside from getting rid of some now-surplus code, this means that elog() now really does have exactly the same semantics as ereport(), in particular that it can skip evaluation work if the message is not to be emitted. Andres Freund and Tom Lane Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
* Make parser rely more heavily on the ParseNamespaceItem data structure.Tom Lane2020-01-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I added the ParseNamespaceItem data structure (in commit 5ebaaa494), it wasn't very tightly integrated into the parser's APIs. In the wake of adding p_rtindex to that struct (commit b541e9acc), there is a good reason to make more use of it: by passing around ParseNamespaceItem pointers instead of bare RTE pointers, we can get rid of various messy methods for passing back or deducing the rangetable index of an RTE during parsing. Hence, refactor the addRangeTableEntryXXX functions to build and return a ParseNamespaceItem struct, not just the RTE proper; and replace addRTEtoQuery with addNSItemToQuery, which is passed a ParseNamespaceItem rather than building one internally. Also, add per-column data (a ParseNamespaceColumn array) to each ParseNamespaceItem. These arrays are built during addRangeTableEntryXXX, where we have column type data at hand so that it's nearly free to fill the data structure. Later, when we need to build Vars referencing RTEs, we can use the ParseNamespaceColumn info to avoid the rather expensive operations done in get_rte_attribute_type() or expandRTE(). get_rte_attribute_type() is indeed dead code now, so I've removed it. This makes for a useful improvement in parse analysis speed, around 20% in one moderately-complex test query. The ParseNamespaceColumn structs also include Var identity information (varno/varattno). That info isn't actually being used in this patch, except that p_varno == 0 is a handy test for a dropped column. A follow-on patch will make more use of it. Discussion: https://postgr.es/m/2461.1577764221@sss.pgh.pa.us
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Refactor parser's generation of Var nodes.Tom Lane2019-12-26
| | | | | | | | | | | | | | | | | | | | | Instead of passing around a pointer to the RangeTblEntry that provides the desired column, pass a pointer to the associated ParseNamespaceItem. The RTE is trivially reachable from the nsitem, and having the ParseNamespaceItem allows access to additional information. As proof of concept for that, add the rangetable index to ParseNamespaceItem, and use that to get rid of RTERangeTablePosn searches. (I have in mind to teach the parser to generate some different representation for Vars that are nullable by outer joins, and keeping the necessary information in ParseNamespaceItems seems like a reasonable approach to that. But whether that ever happens or not, this seems like good cleanup.) Also refactor the code around scanRTEForColumn so that the "fuzzy match" stuff does not leak out of parse_relation.c. Discussion: https://postgr.es/m/26144.1576858373@sss.pgh.pa.us
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Generated columnsPeter Eisentraut2019-03-30
| | | | | | | | | | | | | | This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
* Renaming for new subscripting mechanismAlvaro Herrera2019-02-01
| | | | | | | | | | | | Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to introduce a more generic subscription mechanism, which allows subscripting not only arrays but also other object types such as JSONB. That functionality is introduced in a largish invasive patch, out of which this internal renaming patch was extracted. Author: Dmitry Dolgov Reviewed-by: Tom Lane, Arthur Zakirov Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
* Allow generalized expression syntax for partition boundsPeter Eisentraut2019-01-25
| | | | | | | | | | | | | | Previously, only literals were allowed. This change allows general expressions, including functions calls, which are evaluated at the time the DDL command is executed. Besides offering some more functionality, it simplifies the parser structures and removes some inconsistencies in how the literals were handled. Author: Kyotaro Horiguchi, Tom Lane, Amit Langote Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/9f88b5e0-6da2-5227-20d0-0d7012beaa1c@lab.ntt.co.jp/
* Allow COPY FROM to filter data using WHERE conditionsTomas Vondra2019-01-20
| | | | | | | | | | | | | Extends the COPY FROM command with a WHERE condition, which allows doing various types of filtering while importing the data (random sampling, condition on a data column, etc.). Until now such filtering required either preprocessing of the input data, or importing all data and then filtering in the database. COPY FROM ... WHERE is an easy-to-use and low-overhead alternative for most simple cases. Author: Surafel Temesgen Reviewed-by: Tomas Vondra, Masahiko Sawada, Lim Myungkyu Discussion: https://www.postgresql.org/message-id/flat/CALAY4q_DdpWDuB5-Zyi-oTtO2uSk8pmy+dupiRe3AvAc++1imA@mail.gmail.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Revert MERGE patchSimon Riggs2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commits d204ef63776b8a00ca220adec23979091564e465, 83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter (complete list at the end) related to MERGE feature. While the feature was fully functional, with sufficient test coverage and necessary documentation, it was felt that some parts of the executor and parse-analyzer can use a different design and it wasn't possible to do that in the available time. So it was decided to revert the patch for PG11 and retry again in the future. Thanks again to all reviewers and bug reporters. List of commits reverted, in reverse chronological order: f1464c5380 Improve parse representation for MERGE ddb4158579 MERGE syntax diagram correction 530e69e59b Allow cpluspluscheck to pass by renaming variable 01b88b4df5 MERGE minor errata 3af7b2b0d4 MERGE fix variable warning in non-assert builds a5d86181ec MERGE INSERT allows only one VALUES clause 4b2d44031f MERGE post-commit review 4923550c20 Tab completion for MERGE aa3faa3c7a WITH support in MERGE 83454e3c2b New files for MERGE d204ef6377 MERGE SQL Command following SQL:2016 Author: Pavan Deolasee Reviewed-by: Michael Paquier
* MERGE SQL Command following SQL:2016Simon Riggs2018-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows a task that would other require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
* Revert "Modified files for MERGE"Simon Riggs2018-04-02
| | | | This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
* Modified files for MERGESimon Riggs2018-04-02
|
* Fix oversight in CALL argument handling, and do some minor cleanup.Tom Lane2018-02-10
| | | | | | | | | | | | | | | | | | | CALL statements cannot support sub-SELECTs in the arguments of the called procedure, since they just use ExecEvalExpr to evaluate such arguments. Teach transformSubLink() to reject the case, as it already does for other contexts in which subqueries are not supported. In passing, s/EXPR_KIND_CALL/EXPR_KIND_CALL_ARGUMENT/ to make that enum symbol line up more closely with the phrasing of the error messages it is associated with. And fix someone's weak grasp of English grammar in the preceding EXPR_KIND_PARTITION_EXPRESSION addition. Also update an incorrect comment in resolve_unique_index_expr (possibly it was correct when written, but nowadays transformExpr definitely does reject SRFs here). Per report from Pavel Stehule --- but this resolves only one of the bugs he mentions. Discussion: https://postgr.es/m/CAFj8pRDxOwPPzpA8i+AQeDQFj7bhVw-dR2==rfWZ3zMGkm568Q@mail.gmail.com
* Support all SQL:2011 options for window frame clauses.Tom Lane2018-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* SQL proceduresPeter Eisentraut2017-11-30
| | | | | | | | | | | | | | | | | | | | | This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Disallow set-returning functions inside CASE or COALESCE.Tom Lane2017-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we reimplemented SRFs in commit 69f4b9c85, our initial choice was to allow the behavior to vary from historical practice in cases where a SRF call appeared within a conditional-execution construct (currently, only CASE or COALESCE). But that was controversial to begin with, and subsequent discussion has resulted in a consensus that it's better to throw an error instead of executing the query differently from before, so long as we can provide a reasonably clear error message and a way to rewrite the query. Hence, add a parser mechanism to allow detection of such cases during parse analysis. The mechanism just requires storing, in the ParseState, a pointer to the set-returning FuncExpr or OpExpr most recently emitted by parse analysis. Then the parsing functions for CASE and COALESCE can detect the presence of a SRF in their arguments by noting whether this pointer changes while analyzing their arguments. Furthermore, if it does, it provides a suitable error cursor location for the complaint. (This means that if there's more than one SRF in the arguments, the error will point at the last one to be analyzed not the first. While connoisseurs of parsing behavior might find that odd, it's unlikely the average user would ever notice.) While at it, we can also provide more specific error messages than before about some pre-existing restrictions, such as no-SRFs-within-aggregates. Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM construct would need to return a set. We've never supported that, but the restriction is depended on in more subtle ways now, so it seems wise to detect it at the start. Also, provide some documentation about how to rewrite a SRF-within-CASE query using a custom wrapper SRF. It turns out that the information_schema.user_mapping_options view contained an instance of exactly the behavior we're now forbidding; but rewriting it makes it more clear and safer too. initdb forced because of user_mapping_options change. Patch by me, with error message suggestions from Alvaro Herrera and Andres Freund, pursuant to a complaint from Regina Obe. Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Add infrastructure to support EphemeralNamedRelation references.Kevin Grittner2017-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others
* Change unknown-type literals to type text in SELECT and RETURNING lists.Tom Lane2017-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we left such literals alone if the query or subquery had no properties forcing a type decision to be made (such as an ORDER BY or DISTINCT clause using that output column). This meant that "unknown" could be an exposed output column type, which has never been a great idea because it could result in strange failures later on. For example, an outer query that tried to do any operations on an unknown-type subquery output would generally fail with some weird error like "failed to find conversion function from unknown to text" or "could not determine which collation to use for string comparison". Also, if the case occurred in a CREATE VIEW's query then the view would have an unknown-type column, causing similar failures in queries trying to use the view. To fix, at the tail end of parse analysis of a query, forcibly convert any remaining "unknown" literals in its SELECT or RETURNING list to type text. However, provide a switch to suppress that, and use it in the cases of SELECT inside a set operation or INSERT command. In those cases we already had type resolution rules that make use of context information from outside the subquery proper, and we don't want to change that behavior. Also, change creation of an unknown-type column in a relation from a warning to a hard error. The error should be unreachable now in CREATE VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown" in CREATE TABLE or CREATE (composite) TYPE. We want to forbid that because it's nothing but a foot-gun. This change creates a pg_upgrade failure case: a matview that contains an unknown-type column can't be pg_upgraded, because reparsing the matview's defining query will now decide that the column is of type text, which doesn't match the cstring-like storage that the old materialized column would actually have. Add a checking pass to detect that. While at it, we can detect tables or composite types that would fail, essentially for free. Those would fail safely anyway later on, but we might as well fail earlier. This patch is by me, but it owes something to previous investigations by Rahila Syed. Also thanks to Ashutosh Bapat and Michael Paquier for review. Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
* Fix check_srf_call_placement() to handle VALUES cases correctly.Tom Lane2017-01-16
| | | | | | | | | | | | | | | | | | | | | INSERT ... VALUES with a single VALUES row is implemented quite differently from the general VALUES case. A user-visible implication of that is that we accept SRFs in the single-row case, but not in the multi-row case. That's a historical artifact no doubt, but in view of the lack of field complaints, I'm not excited about fixing it right now. However, check_srf_call_placement() needs to know about this, first because it should throw an error in the unsupported case, and second because it should set p_hasTargetSRFs in the single-row case (because we treat that like a SELECT tlist). That's an oversight in commit a4c35ea1c. To fix, split EXPR_KIND_VALUES into two values. So far as I can see, this is the only place where we need to distinguish the two cases at present; but there might be more later. Patch by me, per report from Andres Freund. Discussion: https://postgr.es/m/20170116081548.zg63zltblwimpfgp@alap3.anarazel.de
* Get rid of ParseState.p_value_substitute; use a columnref hook instead.Tom Lane2017-01-07
| | | | | | | | I noticed that p_value_substitute, which is a single-purpose kluge I added in 2002 (commit b0422b215), could be replaced by having domainAddConstraint install a parser hook that looks for the name "value". The parser hook code only dates back to 2009, so it's not surprising that we had to kluge this in 2002, but we can do it more cleanly now.
* Improve documentation of struct ParseState.Tom Lane2017-01-07
| | | | | | | | | I got annoyed about how some fields of ParseState were documented in the struct's block comment and some weren't; not all of the latter are trivial. Fix that. Also reorder a couple of fields that seem to have been placed rather randomly, or maybe with an idea of avoiding padding space; but there are never so many ParseStates in existence at one time that we ought to value pad space over readability.