aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/ruleutils.c
Commit message (Collapse)AuthorAge
* Remove race condition in pg_get_expr().Tom Lane2024-02-09
| | | | | | | | | | | | | | | | | | | | | | | Since its introduction, pg_get_expr() has intended to silently return NULL if called with an invalid relation OID, as can happen when scanning the catalogs concurrently with relation drops. However, there is a race condition: we check validity of the OID at the start, but it could get dropped just afterward, leading to failures. This is the cause of some intermittent instability we're seeing in a proposed new test case, and presumably it's a hazard in the field as well. We can fix this by AccessShareLock-ing the target relation for the duration of pg_get_expr(). Since we don't require any permissions on the target relation, this is semantically a bit undesirable. But it turns out that the set_relation_column_names() subroutine already takes a transient AccessShareLock on that relation, and has done since commit 2ffa740be in 2012. Given the lack of complaints about that, it seems like there should be no harm in holding the lock a bit longer. Back-patch to all supported branches. Discussion: https://postgr.es/m/31ddcc01-a71b-4e8c-9948-01d1c47293ca@eisentraut.org
* Track nesting depth correctly when drilling down into RECORD Vars.Tom Lane2023-09-15
| | | | | | | | | | | | | | | | | | expandRecordVariable() failed to adjust the parse nesting structure correctly when recursing to inspect an outer-level Var. This could result in assertion failures or core dumps in corner cases. Likewise, get_name_for_var_field() failed to adjust the deparse namespace stack correctly when recursing to inspect an outer-level Var. In this case the likely result was a "bogus varno" error while deparsing a view. Per bug #18077 from Jingzhou Fu. Back-patch to all supported branches. Richard Guo, with some adjustments by me Discussion: https://postgr.es/m/18077-b9db97c6e0ab45d8@postgresql.org
* Fix get_expr_result_type() to find field names for RECORD Consts.Tom Lane2023-09-15
| | | | | | | | | | | | | | | | | This is a back-patch of commit d57534740 ("Fix EXPLAIN of SEARCH BREADTH FIRST with a constant initial value") into pre-v14 branches. At the time I'd thought it was not needed in branches that lack the SEARCH/CYCLE feature, but that was just a failure of imagination. It's possible to demonstrate "record type has not been registered" failures in older branches too, during deparsing of views that contain references to fields of composite constants. Back-patch only the code changes, as the test cases added by d57534740 all require SEARCH/CYCLE syntax. A suitable test case will be added in the upcoming fix for bug #18077. Discussion: https://postgr.es/m/17644-3bd1f3036d6d7a16@postgresql.org Discussion: https://postgr.es/m/3607145.1694803130@sss.pgh.pa.us
* Print the correct aliases for DML target tables in ruleutils.Tom Lane2023-02-17
| | | | | | | | | | | | | | | | | | | | | | | | | ruleutils.c blindly printed the user-given alias (or nothing if there hadn't been one) for the target table of INSERT/UPDATE/DELETE queries. That works a large percentage of the time, but not always: for queries appearing in WITH, it's possible that we chose a different alias to avoid conflict with outer-scope names. Since the chosen alias would be used in any Var references to the target table, this'd lead to an inconsistent printout with consequences such as dump/restore failures. The correct logic for printing (or not) a relation alias was embedded in get_from_clause_item. Factor it out to a separate function so that we don't need a jointree node to use it. (Only a limited part of that function can be reached from these new call sites, but this seems like the cleanest non-duplicative factorization.) In passing, I got rid of a redundant "\d+ rules_src" step in rules.sql. Initial report from Jonathan Katz; thanks to Vignesh C for analysis. This has been broken for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/e947fa21-24b2-f922-375a-d4f763ef3e4b@postgresql.org Discussion: https://postgr.es/m/CALDaNm1MMntjmT_NJGp-Z=xbF02qHGAyuSHfYHias3TqQbPF2w@mail.gmail.com
* Fix ruleutils issues with dropped cols in functions-returning-composite.Tom Lane2022-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
* Show 'AS "?column?"' explicitly when it's important.Tom Lane2022-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | ruleutils.c was coded to suppress the AS label for a SELECT output expression if the column name is "?column?", which is the parser's fallback if it can't think of something better. This is fine, and avoids ugly clutter, so long as (1) nothing further up in the parse tree relies on that column name or (2) the same fallback would be assigned when the rule or view definition is reloaded. Unfortunately (2) is far from certain, both because ruleutils.c might print the expression in a different form from how it was originally written and because FigureColname's rules might change in future releases. So we shouldn't rely on that. Detecting exactly whether there is any outer-level use of a SELECT column name would be rather expensive. This patch takes the simpler approach of just passing down a flag indicating whether there *could* be any outer use; for example, the output column names of a SubLink are not referenceable, and we also do not care about the names exposed by the right-hand side of a setop. This is sufficient to suppress unwanted clutter in all but one case in the regression tests. That seems like reasonable evidence that it won't be too much in users' faces, while still fixing the cases we need to fix. Per bug #17486 from Nicolas Lutic. This issue is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org
* Fix assorted missing logic for GroupingFunc nodes.Tom Lane2022-03-21
| | | | | | | | | | | | | | | | | | | | | | The planner needs to treat GroupingFunc like Aggref for many purposes, in particular with respect to processing of the argument expressions, which are not to be evaluated at runtime. A few places hadn't gotten that memo, notably including subselect.c's processing of outer-level aggregates. This resulted in assertion failures or wrong plans for cases in which a GROUPING() construct references an outer aggregation level. Also fix missing special cases for GroupingFunc in cost_qual_eval (resulting in wrong cost estimates for GROUPING(), although it's not clear that that would affect plan shapes in practice) and in ruleutils.c (resulting in excess parentheses in pretty-print mode). Per bug #17088 from Yaoguang Chen. Back-patch to all supported branches. Richard Guo, Tom Lane Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org
* Fix ruleutils.c's dumping of whole-row Vars in more contexts.Tom Lane2022-01-13
| | | | | | | | | | | | | | | | | | Commit 7745bc352 intended to ensure that whole-row Vars would be printed with "::type" decoration in all contexts where plain "var.*" notation would result in star-expansion, notably in ROW() and VALUES() constructs. However, it missed the case of INSERT with a single-row VALUES, as reported by Timur Khanjanov. Nosing around ruleutils.c, I found a second oversight: the code for RowCompareExpr generates ROW() notation without benefit of an actual RowExpr, and naturally it wasn't in sync :-(. (The code for FieldStore also does this, but we don't expect that to generate strictly parsable SQL anyway, so I left it alone.) Back-patch to all supported branches. Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
* Fix broken ruleutils support for function TRANSFORM clauses.Tom Lane2021-01-25
| | | | | | | | I chanced to notice that this dumped core due to a faulty Assert. To add insult to injury, the output has been misformatted since v11. Obviously we need some regression testing here. Discussion: https://postgr.es/m/d1cc628c-3953-4209-957b-29427acc38c8@www.fastmail.com
* Further fix dumping of views that contain just VALUES(...).Tom Lane2019-11-16
| | | | | | | | | | | | | | | | | | | | | | | It turns out that commit e9f1c01b7 missed a case: we must print a VALUES clause in long format if get_query_def is given a resultDesc that would require the query's output column name(s) to be different from what the bare VALUES clause would produce. This applies in case an ALTER ... RENAME COLUMN has been done to a view that formerly could be printed in simple format, as shown in the added regression test case. It also explains bug #16119 from Dmitry Telpt, because it turns out that (unlike CREATE VIEW) CREATE MATERIALIZED VIEW fails to apply any column aliases it's given to the stored ON SELECT rule. So to get them to be printed, we have to account for the resultDesc renaming. It might be worth changing the matview code so that it creates the ON SELECT rule with the correct aliases; but we'd still need these messy checks in get_simple_values_rte to handle the case of a subsequent column rename, so any such change would be just neatnik-ism not a bug fix. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/16119-e64823f30a45a754@postgresql.org
* Require the schema qualification in pg_temp.type_name(arg).Noah Misch2019-08-05
| | | | | | | | | | | | Commit aa27977fe21a7dfa4da4376ad66ae37cb8f0d0b5 introduced this restriction for pg_temp.function_name(arg); do likewise for types created in temporary schemas. Programs that this breaks should add "pg_temp." schema qualification or switch to arg::type_name syntax. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. Reported by Tom Lane. Security: CVE-2019-10208
* Fix incorrect printing of queries with duplicated join names.Tom Lane2019-06-12
| | | | | | | | | | | | | | | | | Given a query in which multiple JOIN nodes used the same alias (which'd necessarily be in different sub-SELECTs), ruleutils.c would assign the JOIN nodes distinct aliases for clarity ... but then it forgot to print the modified aliases when dumping the JOIN nodes themselves. This results in a dump/reload hazard for views, because the emitted query is flat-out incorrect: Vars will be printed with table names that have no referent. This has been wrong for a long time, so back-patch to all supported branches. Philip Dubé Discussion: https://postgr.es/m/CY4PR2101MB080246F2955FF58A6ED1FEAC98140@CY4PR2101MB0802.namprd21.prod.outlook.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Fix tablespace inheritance for partitioned relsAlvaro Herrera2019-04-25
| | | | | | | | | | | | | | | | | | | | | | | | Commit ca4103025dfe left a few loose ends. The most important one (broken pg_dump output) is already fixed by virtue of commit 3b23552ad8bb, but some things remained: * When ALTER TABLE rewrites tables, the indexes must remain in the tablespace they were originally in. This didn't work because index recreation during ALTER TABLE runs manufactured SQL (yuck), which runs afoul of default_tablespace in competition with the parent relation tablespace. To fix, reset default_tablespace to the empty string temporarily, and add the TABLESPACE clause as appropriate. * Setting a partitioned rel's tablespace to the database default is confusing; if it worked, it would direct the partitions to that tablespace regardless of default_tablespace. But in reality it does not work, and making it work is a larger project. Therefore, throw an error when this condition is detected, to alert the unwary. Add some docs and tests, too. Author: Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com
* Avoid fetching past the end of the indoption array.Tom Lane2019-04-07
| | | | | | | | | | | pg_get_indexdef_worker carelessly fetched indoption entries even for non-key index columns that don't have one. 99.999% of the time this would be harmless, since the code wouldn't examine the value ... but some fine day this will be a fetch off the end of memory, resulting in SIGSEGV. Detected through valgrind testing. Odd that the buildfarm's valgrind critters haven't noticed.
* Support foreign keys that reference partitioned tablesAlvaro Herrera2019-04-03
| | | | | | | | | | Previously, while primary keys could be made on partitioned tables, it was not possible to define foreign keys that reference those primary keys. Now it is possible to do that. Author: Álvaro Herrera Reviewed-by: Amit Langote, Jesper Pedersen Discussion: https://postgr.es/m/20181102234158.735b3fevta63msbj@alvherre.pgsql
* Add support for multivariate MCV listsTomas Vondra2019-03-27
| | | | | | | | | | | | | | | | Introduce a third extended statistic type, supported by the CREATE STATISTICS command - MCV lists, a generalization of the statistic already built and used for individual columns. Compared to the already supported types (n-distinct coefficients and functional dependencies), MCV lists are more complex, include column values and allow estimation of much wider range of common clauses (equality and inequality conditions, IS NULL, IS NOT NULL etc.). Similarly to the other types, a new pseudo-type (pg_mcv_list) is used. Author: Tomas Vondra Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
* Standardize some more loops that chase down parallel lists.Tom Lane2019-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | We have forboth() and forthree() macros that simplify iterating through several parallel lists, but not everyplace that could reasonably use those was doing so. Also invent forfour() and forfive() macros to do the same for four or five parallel lists, and use those where applicable. The immediate motivation for doing this is to reduce the number of ad-hoc lnext() calls, to reduce the footprint of a WIP patch. However, it seems like good cleanup and error-proofing anyway; the places that were combining forthree() with a manually iterated loop seem particularly illegible and bug-prone. There was some speculation about restructuring related parsetree representations to reduce the need for parallel list chasing of this sort. Perhaps that's a win, or perhaps not, but in any case it would be considerably more invasive than this patch; and it's not particularly related to my immediate goal of improving the List infrastructure. So I'll leave that question for another day. Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
* Allow user control of CTE materialization, and change the default behavior.Tom Lane2019-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically we've always materialized the full output of a CTE query, treating WITH as an optimization fence (so that, for example, restrictions from the outer query cannot be pushed into it). This is appropriate when the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE query is non-recursive and side-effect-free, there's no hazard of changing the query results by pushing restrictions down. Another argument for materialization is that it can avoid duplicate computation of an expensive WITH query --- but that only applies if the WITH query is called more than once in the outer query. Even then it could still be a net loss, if each call has restrictions that would allow just a small part of the WITH query to be computed. Hence, let's change the behavior for WITH queries that are non-recursive and side-effect-free. By default, we will inline them into the outer query (removing the optimization fence) if they are called just once. If they are called more than once, we will keep the old behavior by default, but the user can override this and force inlining by specifying NOT MATERIALIZED. Lastly, the user can force the old behavior by specifying MATERIALIZED; this would mainly be useful when the query had deliberately been employing WITH as an optimization fence to prevent a poor choice of plan. Andreas Karlsson, Andrew Gierth, David Fetter Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
* Create the infrastructure for planner support functions.Tom Lane2019-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | Rename/repurpose pg_proc.protransform as "prosupport". The idea is still that it names an internal function that provides knowledge to the planner about the behavior of the function it's attached to; but redesign the API specification so that it's not limited to doing just one thing, but can support an extensible set of requests. The original purpose of simplifying a function call is handled by the first request type to be invented, SupportRequestSimplify. Adjust all the existing transform functions to handle this API, and rename them fron "xxx_transform" to "xxx_support" to reflect the potential generalization of what they do. (Since we never previously provided any way for extensions to add transform functions, this change doesn't create an API break for them.) Also add DDL and pg_dump support for attaching a support function to a user-defined function. Unfortunately, DDL access has to be restricted to superusers, at least for now; but seeing that support functions will pretty much have to be written in C, that limitation is just theoretical. (This support is untested in this patch, but a follow-on patch will add cases that exercise it.) Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
* Use EXECUTE FUNCTION syntax for triggers morePeter Eisentraut2019-02-07
| | | | | | | | | | | Change pg_dump and ruleutils.c to use the FUNCTION keyword instead of PROCEDURE in trigger and event trigger definitions. This completes the pieces of the transition started in 0a63f996e018ac508c858e87fa39cc254a5db49f that were kept out of PostgreSQL 11 because of the required catversion change. Discussion: https://www.postgresql.org/message-id/381bef53-f7be-29c8-d977-948e389161d6@2ndquadrant.com
* Renaming for new subscripting mechanismAlvaro Herrera2019-02-01
| | | | | | | | | | | | Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to introduce a more generic subscription mechanism, which allows subscripting not only arrays but also other object types such as JSONB. That functionality is introduced in a largish invasive patch, out of which this internal renaming patch was extracted. Author: Dmitry Dolgov Reviewed-by: Tom Lane, Arthur Zakirov Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* In the planner, replace an empty FROM clause with a dummy RTE.Tom Lane2019-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fact that "SELECT expression" has no base relations has long been a thorn in the side of the planner. It makes it hard to flatten a sub-query that looks like that, or is a trivial VALUES() item, because the planner generally uses relid sets to identify sub-relations, and such a sub-query would have an empty relid set if we flattened it. prepjointree.c contains some baroque logic that works around this in certain special cases --- but there is a much better answer. We can replace an empty FROM clause with a dummy RTE that acts like a table of one row and no columns, and then there are no such corner cases to worry about. Instead we need some logic to get rid of useless dummy RTEs, but that's simpler and covers more cases than what was there before. For really trivial cases, where the query is just "SELECT expression" and nothing else, there's a hazard that adding the extra RTE makes for a noticeable slowdown; even though it's not much processing, there's not that much for the planner to do overall. However testing says that the penalty is very small, close to the noise level. In more complex queries, this is able to find optimizations that we could not find before. The new RTE type is called RTE_RESULT, since the "scan" plan type it gives rise to is a Result node (the same plan we produced for a "SELECT expression" query before). To avoid confusion, rename the old ResultPath path type to GroupResultPath, reflecting that it's only used in degenerate grouping cases where we know the query produces just one grouped row. (It wouldn't work to unify the two cases, because there are different rules about where the associated quals live during query_planner.) Note: although this touches readfuncs.c, I don't think a catversion bump is required, because the added case can't occur in stored rules, only plans. Patch by me, reviewed by David Rowley and Mark Dilger Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
* Remove superfluous tqual.h includes.Andres Freund2019-01-21
| | | | | | | | | | | | Most of these had been obsoleted by 568d4138c / the SnapshotNow removal. This is is preparation for moving most of tqual.[ch] into either snapmgr.h or heapam.h, which in turn is in preparation for pluggable table AMs. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Replace uses of heap_open et al with the corresponding table_* function.Andres Freund2019-01-21
| | | | | Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Replace heapam.h includes with {table, relation}.h where applicable.Andres Freund2019-01-21
| | | | | | | | | A lot of files only included heapam.h for relation_open, heap_open etc - replace the heapam.h include in those files with the narrower header. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Don't include heapam.h from others headers.Andres Freund2019-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | heapam.h previously was included in a number of widely used headers (e.g. execnodes.h, indirectly in executor.h, ...). That's problematic on its own, as heapam.h contains a lot of low-level details that don't need to be exposed that widely, but becomes more problematic with the upcoming introduction of pluggable table storage - it seems inappropriate for heapam.h to be included that widely afterwards. heapam.h was largely only included in other headers to get the HeapScanDesc typedef (which was defined in heapam.h, even though HeapScanDescData is defined in relscan.h). The better solution here seems to be to just use the underlying struct (forward declared where necessary). Similar for BulkInsertState. Another problem was that LockTupleMode was used in executor.h - parts of the file tried to cope without heapam.h, but due to the fact that it indirectly included it, several subsequent violations of that goal were not not noticed. We could just reuse the approach of declaring parameters as int, but it seems nicer to move LockTupleMode to lockoptions.h - that's not a perfect location, but also doesn't seem bad. As a number of files relied on implicitly included heapam.h, a significant number of files grew an explicit include. It's quite probably that a few external projects will need to do the same. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
* Replace the data structure used for keyword lookup.Tom Lane2019-01-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, ScanKeywordLookup was passed an array of string pointers. This had some performance deficiencies: the strings themselves might be scattered all over the place depending on the compiler (and some quick checking shows that at least with gcc-on-Linux, they indeed weren't reliably close together). That led to very cache-unfriendly behavior as the binary search touched strings in many different pages. Also, depending on the platform, the string pointers might need to be adjusted at program start, so that they couldn't be simple constant data. And the ScanKeyword struct had been designed with an eye to 32-bit machines originally; on 64-bit it requires 16 bytes per keyword, making it even more cache-unfriendly. Redesign so that the keyword strings themselves are allocated consecutively (as part of one big char-string constant), thereby eliminating the touch-lots-of-unrelated-pages syndrome. And get rid of the ScanKeyword array in favor of three separate arrays: uint16 offsets into the keyword array, uint16 token codes, and uint8 keyword categories. That reduces the overhead per keyword to 5 bytes instead of 16 (even less in programs that only need one of the token codes and categories); moreover, the binary search only touches the offsets array, further reducing its cache footprint. This also lets us put the token codes somewhere else than the keyword strings are, which avoids some unpleasant build dependencies. While we're at it, wrap the data used by ScanKeywordLookup into a struct that can be treated as an opaque type by most callers. That doesn't change things much right now, but it will make it less painful to switch to a hash-based lookup method, as is being discussed in the mailing list thread. Most of the change here is associated with adding a generator script that can build the new data structure from the same list-of-PG_KEYWORD header representation we used before. The PG_KEYWORD lists that plpgsql and ecpg used to embed in their scanner .c files have to be moved into headers, and the Makefiles have to be taught to invoke the generator script. This work is also necessary if we're to consider hash-based lookup, since the generator script is what would be responsible for constructing a hash table. Aside from saving a few kilobytes in each program that includes the keyword table, this seems to speed up raw parsing (flex+bison) by a few percent. So it's worth doing even as it stands, though we think we can gain even more with a follow-on patch to switch to hash-based lookup. John Naylor, with further hacking by me Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Remove WITH OIDS support, change oid catalog column visibility.Andres Freund2018-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
* Add missing quote_identifier calls for CREATE TRIGGER ... REFERENCING.Tom Lane2018-10-19
| | | | | | | | | | | Mixed-case names for transition tables weren't dumped correctly. Oversight in commit 8c48375e5, per bug #15440 from Karl Czajkowski. In passing, I couldn't resist a bit of code beautification. Back-patch to v10 where this was introduced. Discussion: https://postgr.es/m/15440-02d1468e94d63d76@postgresql.org
* Create an RTE field to record the query's lock mode for each relation.Tom Lane2018-09-30
| | | | | | | | | | | | | | | | | | | | | | | | Add RangeTblEntry.rellockmode, which records the appropriate lock mode for each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock, or RowExclusiveLock depending on the RTE's role in the query). This patch creates the field and makes all creators of RTE nodes fill it in reasonably, but for the moment nothing much is done with it. The plan is to replace assorted post-parser logic that re-determines the right lockmode to use with simple uses of rte->rellockmode. For now, just add Asserts in each of those places that the rellockmode matches what they are computing today. (In some cases the match isn't perfect, so the Asserts are weaker than you might expect; but this seems OK, as per discussion.) This passes check-world for me, but it seems worth pushing in this state to see if the buildfarm finds any problems in cases I failed to test. catversion bump due to change of stored rules. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
* Fix parsetree representation of XMLTABLE(XMLNAMESPACES(DEFAULT ...)).Tom Lane2018-09-17
| | | | | | | | | | | | | | | | | | | | | | | The original coding for XMLTABLE thought it could represent a default namespace by a T_String Value node with a null string pointer. That's not okay, though; in particular outfuncs.c/readfuncs.c are not on board with such a representation, meaning you'll get a null pointer crash if you try to store a view or rule containing this construct. To fix, change the parsetree representation so that we have a NULL list element, instead of a bogus Value node. This isn't really a functional limitation since default XML namespaces aren't yet implemented in the executor; you'd just get "DEFAULT namespace is not supported" anyway. But crashes are not nice, so back-patch to v10 where this syntax was added. Ordinarily we'd consider a parsetree representation change to be un-backpatchable; but since existing releases would crash on the way to storing such constructs, there can't be any existing views/rules to be incompatible with. Per report from Andrey Lepikhov. Discussion: https://postgr.es/m/3690074f-abd2-56a9-144a-aa5545d7a291@postgrespro.ru
* Remove ruleutils.c's special case for BIT [VARYING] literals.Tom Lane2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, get_const_expr() insisted on prefixing BIT and VARBIT literals with 'B'. That's not really necessary, because we always append explicit-cast syntax to identify the constant's type. Moreover, it's subtly wrong for VARBIT, because the parser will interpret B'...' as '...'::"bit"; see make_const() which explicitly assigns type BITOID for a T_BitString literal. So what had been a simple VARBIT literal is reconstructed as ('...'::"bit")::varbit, which is not the same thing, at least not before constant folding. This results in odd differences after dump/restore, as complained of by the patch submitter, and it could result in actual failures in partitioning or inheritance DDL operations (see commit 542320c2b, which repaired similar misbehaviors for some other data types). Fixing it is pretty easy: just remove the special case and let the default code path handle these types. We could have kept the special case for BIT only, but there seems little point in that. Like the previous patch, I judge that back-patching this into stable branches wouldn't be a good idea. However, it seems not quite too late for v11, so let's fix it there. Paul Guo, reviewed by Davy Machado and John Naylor, minor adjustments by me Discussion: https://postgr.es/m/CABQrizdTra=2JEqA6+Ms1D1k1Kqw+aiBBhC9TreuZRX2JzxLAA@mail.gmail.com
* Remove pg_constraint.conincludingAlvaro Herrera2018-09-03
| | | | | | | | | | | | This column was added in commit 8224de4f42cc ("Indexes with INCLUDE columns and their support in B-tree") to ease writing the ruleutils.c supporting code for that feature, but it turns out to be unnecessary -- we can do the same thing with just one more syscache lookup. Even the documentation for the new column being removed in this commit is awkward. Discussion: https://postgr.es/m/20180902165018.33otxftp3olgtu4t@alvherre.pgsql
* Further fixes for quoted-list GUC values in pg_dump and ruleutils.c.Tom Lane2018-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits 742869946 et al turn out to be a couple bricks shy of a load. We were dumping the stored values of GUC_LIST_QUOTE variables as they appear in proconfig or setconfig catalog columns. However, although that quoting rule looks a lot like SQL-identifier double quotes, there are two critical differences: empty strings ("") are legal, and depending on which variable you're considering, values longer than NAMEDATALEN might be valid too. So the current technique fails altogether on empty-string list entries (as reported by Steven Winfield in bug #15248) and it also risks truncating file pathnames during dump/reload of GUC values that are lists of pathnames. To fix, split the stored value without any downcasing or truncation, and then emit each element as a SQL string literal. This is a tad annoying, because we now have three copies of the comma-separated-string splitting logic in varlena.c as well as a fourth one in dumputils.c. (Not to mention the randomly-different-from-those splitting logic in libpq...) I looked at unifying these, but it would be rather a mess unless we're willing to tweak the API definitions of SplitIdentifierString, SplitDirectoriesString, or both. That might be worth doing in future; but it seems pretty unsafe for a back-patched bug fix, so for now accept the duplication. Back-patch to all supported branches, as the previous fix was. Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
* Document security implications of qualified names.Noah Misch2018-07-28
| | | | | | | | | | | Commit 5770172cb0c9df9e6ce27c507b449557e5b45124 documented secure schema usage, and that advice suffices for using unqualified names securely. Document, in typeconv-func primarily, the additional issues that arise with qualified names. Back-patch to 9.3 (all supported versions). Reviewed by Jonathan S. Katz. Discussion: https://postgr.es/m/20180721012446.GA1840594@rfd.leadboat.com
* Fix pg_get_indexdef()'s behavior for included index columns.Tom Lane2018-07-19
| | | | | | | | | | | | | | | | | The multi-argument form of pg_get_indexdef() failed to print anything when asked to print a single index column that is an included column rather than a key column. This seems an unintentional result of someone having tried to take a short-cut and use the attrsOnly flag for two different purposes. To fix, split said flag into two flags, attrsOnly which suppresses non-attribute info, and keysOnly which suppresses included columns. Add a test case using psql's \d command, which relies on that function. (It's mighty tempting at this point to replace pg_get_indexdef_worker's mess of boolean flag arguments with a single bitmask-of-flags argument, which would allow making the call sites much more self-documenting. But I refrained for the moment.) Discussion: https://postgr.es/m/21724.1531943735@sss.pgh.pa.us
* Clean up warnings from -Wimplicit-fallthrough.Tom Lane2018-05-01
| | | | | | | | | | | | | | | | | | | | | | | | | Recent gcc can warn about switch-case fall throughs that are not explicitly labeled as intentional. This seems like a good thing, so clean up the warnings exposed thereby by labeling all such cases with comments that gcc will recognize. In files that already had one or more suitable comments, I generally matched the existing style of those. Otherwise I went with /* FALLTHROUGH */, which is one of the spellings approved at the more-restrictive-than-default level -Wimplicit-fallthrough=4. (At the default level you can also spell it /* FALL ?THRU */, and it's not picky about case. What you can't do is include additional text in the same comment, so some existing comments containing versions of this aren't good enough.) Testing with gcc 8.0.1 (Fedora 28's current version), I found that I also had to put explicit "break"s after elog(ERROR) or ereport(ERROR); apparently, for this purpose gcc doesn't recognize that those don't return. That seems like possibly a gcc bug, but it's fine because in most places we did that anyway; so this amounts to a visit from the style police. Discussion: https://postgr.es/m/15083.1525207729@sss.pgh.pa.us
* Reorganize partitioning codeAlvaro Herrera2018-04-14
| | | | | | | | | | | | | | | | | | | | | | There's been a massive addition of partitioning code in PostgreSQL 11, with little oversight on its placement, resulting in a catalog/partition.c with poorly defined boundaries and responsibilities. This commit tries to set a couple of distinct modules to separate things a little bit. There are no code changes here, only code movement. There are three new files: src/backend/utils/cache/partcache.c src/include/partitioning/partdefs.h src/include/utils/partcache.h The previous arrangement of #including catalog/partition.h almost everywhere is no more. Authors: Amit Langote and Álvaro Herrera Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
* Cleanup covering infrastructureTeodor Sigaev2018-04-12
| | | | | | | | | | | - Explicitly forbids opclass, collation and indoptions (like DESC/ASC etc) for including columns. Throw an error if user points that. - Truncated storage arrays for such attributes to store only key atrributes, added assertion checks. - Do not check opfamily and collation for including columns in CompareIndexInfo() Discussion: https://www.postgresql.org/message-id/5ee72852-3c4e-ee35-e2ed-c1d053d45c08@sigaev.ru
* Indexes with INCLUDE columns and their support in B-treeTeodor Sigaev2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
* Fast ALTER TABLE ADD COLUMN with a non-NULL defaultAndrew Dunstan2018-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | Currently adding a column to a table with a non-NULL default results in a rewrite of the table. For large tables this can be both expensive and disruptive. This patch removes the need for the rewrite as long as the default value is not volatile. The default expression is evaluated at the time of the ALTER TABLE and the result stored in a new column (attmissingval) in pg_attribute, and a new column (atthasmissing) is set to true. Any existing row when fetched will be supplied with the attmissingval. New rows will have the supplied value or the default and so will never need the attmissingval. Any time the table is rewritten all the atthasmissing and attmissingval settings for the attributes are cleared, as they are no longer needed. The most visible code change from this is in heap_attisnull, which acquires a third TupleDesc argument, allowing it to detect a missing value if there is one. In many cases where it is known that there will not be any (e.g. catalog relations) NULL can be passed for this argument. Andrew Dunstan, heavily modified from an original patch from Serge Rielau. Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley. Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
* Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c.Tom Lane2018-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obvious knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz
* Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY.Tom Lane2018-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us
* Add prokind column, replacing proisagg and proiswindowPeter Eisentraut2018-03-02
| | | | | | | | | | The new column distinguishes normal functions, procedures, aggregates, and window functions. This replaces the existing columns proisagg and proiswindow, and replaces the convention that procedures are indicated by prorettype == 0. Also change prorettype to be VOIDOID for procedures. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz>
* Avoid using unsafe search_path settings during dump and restore.Tom Lane2018-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | Historically, pg_dump has "set search_path = foo, pg_catalog" when dumping an object in schema "foo", and has also caused that setting to be used while restoring the object. This is problematic because functions and operators in schema "foo" could capture references meant to refer to pg_catalog entries, both in the queries issued by pg_dump and those issued during the subsequent restore run. That could result in dump/restore misbehavior, or in privilege escalation if a nefarious user installs trojan-horse functions or operators. This patch changes pg_dump so that it does not change the search_path dynamically. The emitted restore script sets the search_path to what was used at dump time, and then leaves it alone thereafter. Created objects are placed in the correct schema, regardless of the active search_path, by dint of schema-qualifying their names in the CREATE commands, as well as in subsequent ALTER and ALTER-like commands. Since this change requires a change in the behavior of pg_restore when processing an archive file made according to this new convention, bump the archive file version number; old versions of pg_restore will therefore refuse to process files made with new versions of pg_dump. Security: CVE-2018-1058
* Add procedure support to pg_get_functiondefPeter Eisentraut2018-02-13
| | | | | | This also makes procedures work in psql's \ef and \sf commands. Reported-by: Pavel Stehule <pavel.stehule@gmail.com>