path: root/src/backend

* Refactor how VACUUM passes around its XID cutoffs. (Peter Geoghegan, 2022-12-22)

  Use a dedicated struct for the XID/MXID cutoffs used by VACUUM, such as
  FreezeLimit and OldestXmin. This state is initialized in vacuum.c, and then
  passed around by code from vacuumlazy.c to heapam.c freezing related
  routines. The new convention is that everybody works off of the same cutoff
  state, which is passed around via pointers to const.

  Also simplify some of the logic for dealing with frozen xmin in
  heap_prepare_freeze_tuple: add dedicated "xmin_already_frozen" state to
  clearly distinguish xmin XIDs that we're going to freeze from those that
  were already frozen from before. That way the routine's xmin handling code
  is symmetrical with the existing xmax handling code. This is preparation
  for an upcoming commit that will add page level freezing.

  Also refactor the control flow within FreezeMultiXactId(), while adding
  stricter sanity checks. We now test OldestXmin directly, instead of using
  FreezeLimit as an inexact proxy for OldestXmin. This is further preparation
  for the page level freezing work, which will make the function's caller
  cede control of page level freezing to the function where appropriate
  (where heap_prepare_freeze_tuple sees a tuple that happens to contain a
  MultiXactId in its xmax).

  Author: Peter Geoghegan <pg@bowt.ie>
  Reviewed-By: Jeff Davis <pgsql@j-davis.com>
  Discussion: https://postgr.es/m/CAH2-WznS9TxXmz2_=SY+SyJyDFbiOftKofM9=aDo68BbXNBUMA@mail.gmail.com

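  A sketch of the new convention, assuming the struct carries roughly the
  cutoffs named above (the actual definition lives in the vacuum headers;
  heap_freeze_sketch is a hypothetical callee):

      /*
       * Illustrative only: the XID/MXID cutoffs gathered into one struct
       * that vacuum.c initializes once per VACUUM operation.
       */
      struct VacuumCutoffs
      {
          TransactionId OldestXmin;      /* oldest XID considered running */
          TransactionId FreezeLimit;     /* freeze xmin XIDs older than this */
          MultiXactId   MultiXactCutoff; /* freeze MultiXactIds before this */
          /* ... plus the other cutoffs VACUUM computes ... */
      };

      /* downstream code reads the shared state through a pointer to const */
      extern void heap_freeze_sketch(Relation rel,
                                     const struct VacuumCutoffs *cutoffs);
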
* Avoid O(N^2) cost when pulling up lots of UNION ALL subqueries. (Tom Lane, 2022-12-22)

  perform_pullup_replace_vars() knows how to scan the whole parent query
  tree when we are replacing Vars during a subquery flattening operation.
  However, for the specific case of flattening a UNION ALL leaf query,
  that's mostly wasted work: the only place where relevant Vars could exist
  is in the AppendRelInfo that we just made for this leaf. Teaching
  perform_pullup_replace_vars() to just deal with that and exit is
  worthwhile because, if we have N such subqueries to pull up, we were
  spending O(N^2) work uselessly mutating the AppendRelInfos for all the
  other subqueries.

  While we're at it, avoid calling substitute_phv_relids if there are no
  PlaceHolderVars, and remove an obsolete check of parse->hasSubLinks.

  Andrey Lepikhov and Tom Lane

  Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru

* Add some recursion and looping defenses in prepjointree.c. (Tom Lane, 2022-12-22)

  Andrey Lepikhov demonstrated a case where we spend an unreasonable amount
  of time in pull_up_subqueries(). Not only is that recursing with no
  explicit check for stack overrun, but the code seems not interruptible by
  control-C. Let's stick a CHECK_FOR_INTERRUPTS there, along with sprinkling
  some stack depth checks.

  An actual fix for the excessive time consumption seems a bit risky to
  back-patch; but this isn't, so let's do so.

  Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru

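  The defenses in question are the standard pair; a minimal sketch of the
  pattern (recurse_step is a hypothetical stand-in for the recursive
  functions touched here):

      static Node *
      recurse_step(PlannerInfo *root, Node *jtnode)
      {
          check_stack_depth();     /* error out before the stack overruns */
          CHECK_FOR_INTERRUPTS();  /* let control-C cancel long planning */

          /* ... recurse into child nodes as before ... */
          return jtnode;
      }
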
* Remove dead code (Peter Eisentraut, 2022-12-22)

  The second appearance of NamespaceRelationId in this if-else chain is in
  error and can be removed.

* Fix operator typo in tablecmds.c (Michael Paquier, 2022-12-22)

  A bitwise operator was getting used on two bools in ATAddCheckConstraint()
  to track if constraints should be merged or not with the existing ones of
  a relation, though obviously this should use a boolean OR operator. This
  led to the same result, but let's be clean.

  Oversight in 074c5cf.

  Author: Ranier Vilela
  Reviewed-by: Justin Pryzby
  Discussion: https://postgr.es/m/CAEudQAp2R2fbbi0OHHhv_n4=Ch0t1VtjObR9YMqtGKHJ+faUFQ@mail.gmail.com

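  The difference in miniature, with assumed variable names (on C bools both
  forms compute the same value, which is why the behavior was unchanged):

      bool merge = false, recurse = true;

      merge = merge | recurse;     /* before: bitwise OR on bools */
      merge = merge || recurse;    /* after: boolean OR, as intended */
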
* Add palloc_aligned() to allow aligned memory allocations (David Rowley, 2022-12-22)

  This introduces palloc_aligned() and MemoryContextAllocAligned() which
  allow callers to obtain memory which is allocated to the given size and
  also aligned to the specified alignment boundary. The alignment boundaries
  may be any power-of-2 value. Currently, the alignment is capped at 2^26,
  however, we don't expect values anything like that large. The primary
  expected use case is to align allocations to perhaps CPU cache line size
  or to maybe I/O page size. Certain use cases can benefit from having
  aligned memory by either having better performance or more predictable
  performance.

  The alignment is achieved by requesting 'alignto' additional bytes from
  the underlying allocator function and then aligning the address that is
  returned to the requested alignment. This obviously does waste some
  memory, so alignments should be kept as small as required. It's also
  important to note that these alignment bytes eat into the maximum
  allocation size. So something like:

      palloc_aligned(MaxAllocSize, 64, 0);

  will not work as we cannot request MaxAllocSize + 64 bytes.

  Additionally, because we're just requesting the requested size plus the
  alignment requirements from the given MemoryContext, this cannot work if
  that context is the Slab allocator: slab can only provide chunks of the
  size that's specified when the slab context is created, so it will
  generate an error to indicate that the requested size is not supported.

  The alignment that is requested in palloc_aligned() is stored along with
  the allocated memory. This allows the alignment to remain intact through
  repalloc() calls.

  Author: Andres Freund, David Rowley
  Reviewed-by: Maxim Orlov, Andres Freund, John Naylor
  Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com

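  A usage sketch, with the 4 kB alignment chosen purely for illustration
  (say, to match an assumed I/O page size):

      /* 8 kB buffer aligned to a 4 kB boundary */
      char *buf = palloc_aligned(8192, 4096, 0);

      /* the stored alignment survives repalloc(), per the above */
      buf = repalloc(buf, 16384);
      pfree(buf);
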
* Introduce float4in_internal (Andrew Dunstan, 2022-12-21)

  This is the guts of float4in, callable as a routine to input floats, which
  will be useful in an upcoming patch for allowing soft errors in the seg
  module's input function.

  A similar operation was performed some years ago for float8in in commit
  50861cd683e.

  Reviewed by Tom Lane
  Discussion: https://postgr.es/m/cee4e426-d014-c0b7-aa22-a659f2cd9130@dunslane.net

* Fix newly introduced bug in slab.c (David Rowley, 2022-12-22)

  d21ded75f changed the way slab.c works but introduced a bug that meant we
  could end up with the slab's curBlocklistIndex pointing to the wrong list.
  The condition which was checking for this was failing to account for two
  things:

  1. The curBlocklistIndex could be 0 as we've currently got no non-full
     blocks to put chunks on. In this case, the dlist_is_empty() check
     cannot be performed as there can be any number of completely full
     blocks at that index.

  2. The curBlocklistIndex may be greater than the index we just moved the
     block onto. Since we need to ensure we fill up fuller blocks first, we
     must reset curBlocklistIndex when changing any blocklist element that's
     less than the curBlocklistIndex too.

  Reported-by: Takamichi Osumi
  Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com

* Switch some system functions to use get_call_result_type() (Michael Paquier, 2022-12-21)

  This shaves some code by replacing combinations of
  CreateTemplateTupleDesc()/TupleDescInitEntry(), which hardcoded a mapping
  of the attributes listed in pg_proc.dat, with get_call_result_type() to
  build the TupleDesc needed for the rows generated.

  get_call_result_type() is more expensive than the former style, but this
  removes some duplication with the lists of OUT parameters (pg_proc.dat
  and the attributes hardcoded in these code paths). This is applied to
  functions that are not considered critical, i.e. not the ones that could
  be called repeatedly for monitoring purposes.

  Author: Bharath Rupireddy
  Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier
  Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com

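  The replacement is the usual idiom for functions returning composite
  types; a minimal sketch:

      TupleDesc tupdesc;

      /*
       * Replaces the CreateTemplateTupleDesc()/TupleDescInitEntry()
       * boilerplate; the OUT-parameter list in pg_proc.dat becomes the
       * single source of truth for the row shape.
       */
      if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
          elog(ERROR, "return type must be a row type");
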
* Add copyright notices to meson files (Andrew Dunstan, 2022-12-20)

  Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net

* Allow batching of inserts during cross-partition updates. (Etsuro Fujita, 2022-12-20)

  Commit 927f453a9 prevented the batching added by commit b663a4136 from
  being used for the inserts performed as part of cross-partition updates
  of partitioned tables, mainly because the previous code in
  nodeModifyTable.c couldn't handle pending inserts into foreign-table
  partitions that are also UPDATE target partitions. But we don't have such
  a limitation anymore (cf. commit ffbb7e65a), so let's allow for this by
  removing from execPartition.c the restriction added by commit 927f453a9
  that batching is only allowed if the query command type is CMD_INSERT.

  In postgres_fdw, since commit 86dc90056 effectively disabled
  cross-partition updates in the case where a foreign-table partition
  chosen to insert rows into is also an UPDATE target partition, allow
  batching in the case where a foreign-table partition chosen to do so is
  *not* also an UPDATE target partition. This is enabled by the
  "batch_size" option added by commit b663a4136, which is disabled by
  default.

  This patch also adjusts the test case added by commit 927f453a9 to
  confirm that the inserts performed as part of a cross-partition update of
  a partitioned table indeed use batching.

  Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu,
  Bharath Rupireddy, Hou Zhijie, Vignesh C, and me.

  Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com

* Add enable_presorted_aggregate GUC (David Rowley, 2022-12-20)

  1349d279 added query planner support to allow more efficient execution of
  aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to
  that commit, the planner would only request that the lower planner produce
  a plan with the order required for the GROUP BY clause and it would be
  left up to nodeAgg.c to perform the final sort of records within each
  group so that the aggregate transition functions were called in the
  correct order. Now that the planner requests the lower planner produce a
  plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind,
  there is the possibility that the planner chooses a plan which could be
  less efficient than what would have been produced before 1349d279.

  While developing 1349d279, I had in mind that Incremental Sort would help
  us in cases where an index exists only on the GROUP BY column(s).
  Incremental Sort would just replace the implicit tuplesorts which are
  being performed in nodeAgg.c. However, because the planner has the
  flexibility to instead choose a plan which just performs a full sort on
  both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is
  potential for the planner to make a bad choice. The costing for
  Incremental Sort is not perfect as it assumes an even distribution of
  rows to sort within each sort group.

  Here we add an escape hatch in the form of the enable_presorted_aggregate
  GUC. This will allow users to get the pre-PG16 behavior in cases where
  they have no other means to convince the query planner to produce a plan
  which only sorts on the GROUP BY column(s).

  Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com

* Improve the performance of the slab memory allocator (David Rowley, 2022-12-20)

  Slab has traditionally been fairly slow when compared with the AllocSet or
  Generation memory allocators. Part of this slowness came from having to
  write out an entire block when we allocate a new block in order to
  populate the free list indexes within the block's memory. Additional
  slowness came from having to move a block onto another dlist each time we
  palloc or pfree a chunk from it. Here we optimize both of those cases and
  do a little bit extra to improve the performance of the slab allocator.

  Here, instead of writing out the free list indexes when allocating a new
  block, we introduce the concept of "unused" chunks. When a block is first
  allocated all chunks are unused. These chunks only make it onto the free
  list when they are pfree'd. When allocating new chunks on an existing
  block, we have the choice of consuming a chunk from the free list or an
  unused chunk. When both exist, we opt to use one from the free list, as
  these have been used already and their memory is more likely to be cached
  by the CPU.

  Here we also reduce the number of block lists from there being one for
  every possible value of free chunks on a block to just having a small
  fixed number of block lists. We keep the 0th block list for completely
  full blocks and anything else stores blocks for some range of free
  chunks, with fuller blocks appearing on lower block list array elements.
  This reduces how often we must move a block to another list when we
  allocate or free chunks, but still allows us to prefer to put new chunks
  on fuller blocks and perhaps allow blocks with fewer chunks to be free'd
  later once all their remaining chunks have been pfree'd.

  Additionally, we now store a list of "emptyblocks", which are blocks that
  no longer contain any allocated chunks. We now keep up to 10 of these
  around to avoid having to thrash malloc/free when allocation patterns
  continually cause blocks to become free of any allocated chunks only to
  allocate more chunks again. Only once we have 10 of these do we free the
  block. This does raise the high water mark for the total memory that a
  slab context can consume. It does not seem entirely unreasonable that we
  might one day want to make this a property of SlabContext rather than a
  compile-time constant. Let's wait and see if there is any evidence to
  support that this is required before doing it.

  Author: Andres Freund, David Rowley
  Tested-by: Tomas Vondra, John Naylor
  Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de

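  One way to picture the fixed block lists; the constant and arithmetic
  below are illustrative only, not the actual slab.c code:

      #define MY_BLOCKLIST_COUNT 3    /* hypothetical; slab.c picks its own */

      /* map a block's free-chunk count onto a small fixed set of lists */
      static int
      blocklist_index(int nfree, int chunks_per_block)
      {
          if (nfree == 0)
              return 0;       /* list 0 holds completely full blocks */
          /* fuller blocks (fewer free chunks) land on lower indexes */
          return 1 + (nfree - 1) * (MY_BLOCKLIST_COUNT - 1) / chunks_per_block;
      }
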
* Move variable increment to the end of the loop (John Naylor, 2022-12-20)

  This is less error prone and matches the placement of other code in the
  file.

  Justin Pryzby
  Reviewed by Tom Lane
  Discussion: https://www.postgresql.org/message-id/20221123172436.GJ11463@telsasoft.com

* Add pg_dissect_walfile_name() (Michael Paquier, 2022-12-20)

  This function takes as input a WAL segment name and returns a tuple made
  of the segment sequence number (dependent on the WAL segment size of the
  cluster) and its timeline, as a thin SQL wrapper around the existing
  XLogFromFileName(). This function has multiple usages, like being able to
  compute an LSN from a file name and an offset, or finding the timeline of
  a segment without having to do some maths based on the first eight
  characters of the segment.

  Bump catalog version.

  Author: Bharath Rupireddy
  Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael Paquier
  Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com

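  Internally the wrapper amounts to something like the following sketch
  (the segment name is made up for illustration):

      TimeLineID  tli;
      XLogSegNo   segno;

      /* decode the timeline and segment number from the 24-char name */
      XLogFromFileName("000000010000000100000042", &tli, &segno,
                       wal_segment_size);
      /* the SQL function then returns (segno, tli) as its tuple */
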
* Remove hardcoded dependency to cryptohash type in the internals of SCRAM (Michael Paquier, 2022-12-20)

  SCRAM_KEY_LEN was a variable used in the internal routines of SCRAM to
  size a set of fixed-sized arrays used in the SHA and HMAC computations
  during the SASL exchange or when building a SCRAM password. This had a
  hard dependency on SHA-256, reducing the flexibility of SCRAM when it
  comes to the addition of more hash methods. A second issue was that
  SHA-256 was assumed as the cryptohash method to use all the time.

  This commit renames SCRAM_KEY_LEN to a more generic SCRAM_KEY_MAX_LEN,
  which is used as the size of the buffers used by the internal routines of
  SCRAM. This is aimed at tracking centrally the maximum size necessary for
  all the hash methods supported by SCRAM. A global variable has the
  advantage of keeping the code in its simplest form, reducing the need of
  more alloc/free logic for all the buffers used in the hash calculations.

  A second change is that the key length (SHA digest length) and hash types
  are now tracked by the state data in the backend and the frontend, the
  common portions being extended to handle these as arguments by the
  internal routines of SCRAM.

  There are a few RFC proposals floating around to extend the SCRAM
  protocol, including some to use stronger cryptohash algorithms, so this
  lifts some of the existing restrictions in the code. The code in charge
  of parsing and building SCRAM secrets is extended to rely on the key
  length and on the cryptohash type used for the exchange, assuming
  currently that only SHA-256 is supported for the moment. Note that the
  mock authentication simply enforces SHA-256.

  Author: Michael Paquier
  Reviewed-by: Peter Eisentraut, Jonathan Katz
  Discussion: https://postgr.es/m/Y5k3Qiweo/1g9CG6@paquier.xyz

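  The resulting shape, sketched with hypothetical struct and field names
  (the real state structs live in the SCRAM frontend and backend code):

      #define SCRAM_KEY_MAX_LEN 32        /* sized for SHA-256 today */

      typedef struct scram_state_sketch   /* hypothetical */
      {
          pg_cryptohash_type hash_type;   /* SHA-256 for now */
          int                key_length;  /* digest length of hash_type */
          uint8              StoredKey[SCRAM_KEY_MAX_LEN];
          uint8              ServerKey[SCRAM_KEY_MAX_LEN];
      } scram_state_sketch;
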
* Expose some information about backend subxact status. (Robert Haas, 2022-12-19)

  A new function pg_stat_get_backend_subxact() can be used to get
  information about the number of subtransactions in the cache of a
  particular backend and whether that cache has overflowed. This can be
  useful for tracking down performance problems that can result from
  overflowed snapshots.

  Dilip Kumar, reviewed by Zhihong Yu, Nikolay Samokhvalov, Justin Pryzby,
  Nathan Bossart, Ashutosh Sharma, Julien Rouhaud. Additional design
  comments from Andres Freund, Tom Lane, Bruce Momjian, and David G.
  Johnston.

  Discussion: http://postgr.es/m/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN=Q@mail.gmail.com

* Fix inability to reference CYCLE column from inside its CTE. (Tom Lane, 2022-12-16)

  Such references failed with "cache lookup failed for type 0" because we
  didn't resolve the type of the CYCLE column until after analyzing the
  CTE's query. We can just move that processing to before the recursive
  parse_sub_analyze call, though.

  While here, invent a couple of local variables to make this code less
  egregiously wider-than-80-columns.

  Per bug #17723 from Vik Fearing. Back-patch to v14 where the CYCLE
  feature was added.

  Discussion: https://postgr.es/m/17723-2c4985ff111e7bba@postgresql.org

* Clean up dubious error handling in wellformed_xml(). (Tom Lane, 2022-12-16)

  This ancient bit of code was summarily trapping any ereport longjmp
  whatsoever and assuming that it must represent an invalid-XML report.
  It's not really appropriate to handle OOM-like situations that way: maybe
  the input is valid or maybe not, but we couldn't find out. And it'd be a
  seriously bad idea to ignore, say, a query cancel error that way.
  (Perhaps that can't happen because there is no CHECK_FOR_INTERRUPTS
  anywhere within xml_parse, but even if that's true today it's obviously a
  very fragile assumption.)

  But in the wake of the previous commit, we can drop the PG_TRY here
  altogether, and use the soft error mechanism to catch only the kinds of
  errors that are legitimate to treat as invalid-XML. (This is our first
  use of the soft error mechanism for something not directly related to a
  datatype input function. It won't be the last.)

  xml_is_document can be converted in the same way. That one is not
  actively broken, because it was checking specifically for
  ERRCODE_INVALID_XML_DOCUMENT rather than trapping everything; but the
  code is still shorter and probably faster this way.

  Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us

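  The caller-side pattern that replaces the PG_TRY, sketched generically
  (soft_parse_call is a hypothetical stand-in for the escontext-aware
  parsing routine):

      ErrorSaveContext escontext = {T_ErrorSaveContext};

      /* the callee reports invalid-XML errors into escontext instead of
       * throwing; hard errors (OOM, query cancel) still propagate */
      (void) soft_parse_call(data, (Node *) &escontext);

      PG_RETURN_BOOL(!SOFT_ERROR_OCCURRED(&escontext));
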
* Convert xml_in to report errors softly. (Tom Lane, 2022-12-16)

  The key idea here is that xml_parse must distinguish hard errors from
  soft errors. We want to throw a hard error for libxml initialization
  failures: those might be out-of-memory, or something else, but in any
  case they are not the fault of the input string. If we get to the point
  of parsing the input, and something goes wrong, we can fairly consider
  that to mean bad input.

  One thing that arguably does mean bad input, but I didn't trouble to
  handle softly, is encoding conversion failure while converting from the
  server encoding to UTF8. This might be something to improve later, but it
  seems like a pretty low-probability scenario.

  Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us

* Remove pessimistic cost penalization from Incremental Sort (David Rowley, 2022-12-16)

  When incremental sorts were added in v13 a 1.5x pessimism factor was
  added to the cost model. Seemingly this was done because the cost model
  only has an estimate of the total number of input rows and the number of
  presorted groups. It assumes that the input rows will be evenly
  distributed throughout the presorted groups. The 1.5x pessimism factor
  was added to slightly reduce the likelihood of incremental sorts being
  used in the hope of avoiding performance regressions where an incremental
  sort plan was picked and turned out slower due to a large skew in the
  number of rows in the presorted groups.

  An additional quirk with the path generation code meant that we could
  consider both a sort and an incremental sort on paths with presorted
  keys. This meant that with the pessimism factor, it was possible that we
  opted to perform a sort rather than an incremental sort when the given
  path had presorted keys.

  Here we remove the 1.5x pessimism factor to allow incremental sorts to
  have a fairer chance at being chosen against a full sort.

  Previously we would generally create a sort path on the cheapest input
  path (if that wasn't sorted already) and incremental sort paths on any
  path which had presorted keys. This meant that if the cheapest input path
  wasn't completely sorted but happened to have presorted keys, we would
  create a full sort path *and* an incremental sort path on that input
  path. Here we change this logic so that if there are presorted keys, we
  only create an incremental sort path, and create sort paths only when a
  full sort is required.

  Both the removal of the cost pessimism factor and the changes made to the
  path generation make it more likely that incremental sorts will now be
  chosen. That, of course, as with teaching the planner any new tricks,
  means an increased likelihood that the planner will perform an
  incremental sort when it's not the best method. Our standard escape hatch
  for these cases is an enable_* GUC. enable_incremental_sort already
  exists for this.

  This came out of a report by Pavel Luzanov where he mentioned that the
  master branch was choosing to perform a Seq Scan -> Sort -> Group
  Aggregate for his query with an ORDER BY aggregate function. The v15 plan
  for his query performed an Index Scan -> Group Aggregate; of course, the
  aggregate performed the final sort internally in nodeAgg.c for the
  aggregate's ORDER BY. The ideal plan would have been to use the index,
  which provided partially sorted input, then use an incremental sort to
  provide the aggregate with the sorted input. This was not being chosen
  due to the pessimism in the incremental sort cost model, so here we
  remove that and rationalize the path generation so that sort and
  incremental sort plans don't have to needlessly compete. We assume that
  it's senseless to ever use a full sort on a given input path where an
  incremental sort can be performed.

  Reported-by: Pavel Luzanov
  Reviewed-by: Richard Guo
  Discussion: https://postgr.es/m/9f61ddbf-2989-1536-b31e-6459370a6baa%40postgrespro.ru

* Speed up creation of command completion tags (David Rowley, 2022-12-16)

  The building of command completion tags could often be seen showing up in
  profiles when running high tps workloads. The query completion tags were
  being built with snprintf, which is slow at the best of times when
  compared with more manual ways of formatting strings.

  Here we introduce BuildQueryCompletionString() to do this job for us. We
  also now store the completion tag's strlen in the CommandTagBehavior
  struct so that we can quickly memcpy this number of bytes into the
  completion tag string. Appending the rows affected is done via
  pg_ulltoa_n. BuildQueryCompletionString returns the length of the built
  string. This saves us having to call strlen to figure out how many bytes
  to pass to pq_putmessage().

  Author: David Rowley, Andres Freund
  Reviewed-by: Andres Freund
  Discussion: https://postgr.es/m/CAHoyFK-Xwqc-iY52shj0G+8K9FJpse+FuZ36XBKy78wDVnd=Qg@mail.gmail.com

* Convert range_in and multirange_in to report errors softly. (Tom Lane, 2022-12-15)

  This is mostly straightforward, except that if the range type has a
  canonical function, that might throw an error during range input. (Such
  errors probably only occur for edge cases: in the in-core canonical
  functions, it happens only if a bound has the maximum valid value for the
  underlying type.) Hence, this patch extends the soft-error regime to
  allow canonical functions to return errors softly as well.

  Extensions implementing range canonical functions will need modification
  anyway because of the API change for range_serialize(); while at it, they
  might want to do something similar to what's been done here in the
  in-core canonical functions.

  Discussion: https://postgr.es/m/3284599.1671075185@sss.pgh.pa.us

* Static assertions cleanup (Peter Eisentraut, 2022-12-15)

  Because we added StaticAssertStmt() first before StaticAssertDecl(), some
  uses as well as the instructions in c.h are now a bit backwards from the
  "native" way static assertions are meant to be used in C. This updates
  the guidance and moves some static assertions to better places.

  Specifically, since the addition of StaticAssertDecl(), we can put static
  assertions at the file level. This moves a number of static assertions
  out of function bodies, where they might have been stuck out of
  necessity, to perhaps better places at the file level or in header files.

  Also, when the static assertion appears in a position where a declaration
  is allowed, then using StaticAssertDecl() is more native than
  StaticAssertStmt().

  Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
  Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com

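  In sketch form, with a made-up array:

      static const char *const level_names[] = {"low", "mid", "high"};

      /* file level: declaration context, the "native" C placement */
      StaticAssertDecl(lengthof(level_names) == 3, "array length mismatch");

      static void
      check_levels(void)
      {
          /* inside a function body: statement context */
          StaticAssertStmt(sizeof(int64) == 8, "int64 must be 8 bytes");
      }
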
* Convert a few more datatype input functions to report errors softly. (Tom Lane, 2022-12-14)

  Convert the remaining string-category input functions (bpcharin,
  varcharin, byteain) to the new style.

  Discussion: https://postgr.es/m/3038346.1671060258@sss.pgh.pa.us

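  The conversion pattern this series of commits applies, sketched for a
  hypothetical type: report bad input with ereturn() against
  fcinfo->context instead of throwing with ereport().

      Datum
      mytype_in(PG_FUNCTION_ARGS)      /* hypothetical input function */
      {
          char *str = PG_GETARG_CSTRING(0);
          Node *escontext = fcinfo->context;

          if (!mytype_syntax_ok(str))  /* hypothetical check */
              ereturn(escontext, (Datum) 0,
                      (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                       errmsg("invalid input syntax for type %s: \"%s\"",
                              "mytype", str)));

          /* ... otherwise build and return the value as before ... */
      }
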
* Convert a few more datatype input functions to report errors softly. (Tom Lane, 2022-12-14)

  Convert cash_in and uuid_in to the new style.

  Amul Sul, minor mods by me

  Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com

* Convert a few more datatype input functions to report errors softly. (Tom Lane, 2022-12-14)

  Convert assorted internal-ish datatypes, namely aclitemin, int2vectorin,
  oidin, oidvectorin, pg_lsn_in, pg_snapshot_in, and tidin to the new
  style.

  (Some others you might expect to find in this group, such as cidin and
  xidin, need no changes because they never throw errors at all. That seems
  a little cheesy ... but it is not in the charter of this patch series to
  add new error conditions.)

  Amul Sul, minor mods by me

  Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com

* Convert the geometric input functions to report errors softly. (Tom Lane, 2022-12-14)

  Convert box_in, circle_in, line_in, lseg_in, path_in, point_in, and
  poly_in to the new style.

  line_in still throws hard errors for overflows/underflows that can occur
  when the input is specified as two points rather than in the canonical
  "Ax + By + C = 0" style. I'm not too concerned about that: it won't be
  reached in normal dump/restore cases, and it's fairly debatable that such
  conversion should ever have been made part of a type input function in
  the first place. But in any case, I don't want to extend the soft error
  conventions into float.h without more discussion than this patch has had.

  Amul Sul, minor mods by me

  Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com

* Convert a few more datatype input functions to report errors softly. (Tom Lane, 2022-12-14)

  Convert bit_in, varbit_in, inet_in, cidr_in, macaddr_in, and macaddr8_in
  to the new style.

  Amul Sul, minor mods by me

  Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com

* Rearrange some static assertions for consistency (Peter Eisentraut, 2022-12-14)

  Put lengthof first.

  Reported-by: Peter Smith <smithpb2250@gmail.com>
  Discussion: https://www.postgresql.org/message-id/CAHut+PsUDMySVRuRc=h+P5N3+=TGvj4W_mi32XXg9dt4o-BXbA@mail.gmail.com

* Non-decimal integer literals (Peter Eisentraut, 2022-12-14)

  Add support for hexadecimal, octal, and binary integer literals:

      0x42F
      0o273
      0b100101

  per SQL:202x draft.

  This adds support in the lexer as well as in the integer type input
  functions.

  Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
  Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
  Reviewed-by: David Rowley <dgrowleyml@gmail.com>
  Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com

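  On the input-function side the change amounts to picking a radix from the
  prefix; an illustrative fragment only, not the actual lexer or integer
  input code:

      const char *s = "0x42F";
      char       *end;
      long        val;

      if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
          val = strtol(s + 2, &end, 16);      /* hexadecimal */
      else if (s[0] == '0' && (s[1] == 'o' || s[1] == 'O'))
          val = strtol(s + 2, &end, 8);       /* octal */
      else if (s[0] == '0' && (s[1] == 'b' || s[1] == 'B'))
          val = strtol(s + 2, &end, 2);       /* binary */
      else
          val = strtol(s, &end, 10);          /* decimal, as before */
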
* Add grantable MAINTAIN privilege and pg_maintain role. (Jeff Davis, 2022-12-13)

  Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER, and
  LOCK TABLE.

  Effectively reverts 4441fc704d. Instead of creating separate privileges
  for VACUUM, ANALYZE, and other maintenance commands, group them together
  under a single MAINTAIN privilege.

  Author: Nathan Bossart
  Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13
  Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us

* Rethink handling of [Prevent|Is]InTransactionBlock in pipeline mode. (Tom Lane, 2022-12-13)

  Commits f92944137 et al. made IsInTransactionBlock() set the
  XACT_FLAGS_NEEDIMMEDIATECOMMIT flag before returning "false", on the
  grounds that that kept its API promises equivalent to those of
  PreventInTransactionBlock(). This turns out to be a bad idea though,
  because it allows an ANALYZE in a pipelined series of commands to cause
  an immediate commit, which is unexpected. Furthermore, if we return
  "false" then we have another issue, which is that ANALYZE will decide
  it's allowed to do internal commit-and-start-transaction sequences, thus
  possibly unexpectedly committing the effects of previous commands in the
  pipeline.

  To fix the latter situation, invent another transaction state flag
  XACT_FLAGS_PIPELINING, which explicitly records the fact that we have
  executed some extended-protocol command and not yet seen a commit for it.
  Then, require that flag to not be set before allowing
  IsInTransactionBlock() to return "false". Having done that, we can remove
  its setting of NEEDIMMEDIATECOMMIT without fear of causing problems.

  This means that the API guarantees of IsInTransactionBlock now diverge
  from PreventInTransactionBlock, which is mildly annoying, but it seems OK
  given the very limited usage of IsInTransactionBlock. (In any case, a
  caller preferring the old behavior could always set NEEDIMMEDIATECOMMIT
  for itself.)

  For consistency also require XACT_FLAGS_PIPELINING to not be set in
  PreventInTransactionBlock. This too is meant to prevent commands such as
  CREATE DATABASE from silently committing previous commands in a pipeline.

  Per report from Peter Eisentraut. As before, back-patch to all supported
  branches (which sadly no longer includes v10).

  Discussion: https://postgr.es/m/65a899dd-aebc-f667-1d0a-abb89ff3abf8@enterprisedb.com

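  The essence of the new check, sketched (not the exact xact.c code):

      /* Once any extended-protocol command has executed without a commit,
       * refuse to claim we're outside a transaction block: */
      if (MyXactFlags & XACT_FLAGS_PIPELINING)
          return true;    /* in IsInTransactionBlock() */
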
* Refactor ExecGrant_*() functions (Peter Eisentraut, 2022-12-13)

  Instead of half a dozen mostly-duplicate ExecGrant_Foo() functions, write
  one common function ExecGrant_generic() that can handle most of them. We
  already have all the information we need, such as which system catalog
  corresponds to which catalog table and which column is the ACL column.

  Reviewed-by: Andres Freund <andres@anarazel.de>
  Reviewed-by: Antonin Houska <ah@cybertec.at>
  Discussion: https://www.postgresql.org/message-id/flat/22c7e802-4e7d-8d87-8b71-cba95e6f4bcf%40enterprisedb.com

* Fix jsonb subscripting to cope with toasted subscript values. (Tom Lane, 2022-12-12)

  jsonb_get_element() was incautious enough to use VARDATA() and VARSIZE()
  directly on an arbitrary text Datum. That of course fails if the Datum is
  short-header, compressed, or out-of-line. The typical result would be
  failing to match any element of a jsonb object, though matching the wrong
  one seems possible as well. setPathObject() was slightly brighter, in
  that it used VARDATA_ANY and VARSIZE_ANY_EXHDR, but that only kept it out
  of trouble for short-header Datums. push_path() had the same issue. This
  could result in faulty subscripted insertions, though keys long enough to
  cause a problem are likely rare in the wild.

  Having seen these, I looked around for unsafe usages in the rest of the
  adt/json* files. There are a couple of places where it's not immediately
  obvious that the Datum can't be compressed or out-of-line, so I added
  pg_detoast_datum_packed() to cope if it is. Also, remove some other
  usages of VARDATA/VARSIZE on Datums we just extracted from a text array.
  Those aren't actively broken, but they will become so if we ever start
  allowing short-header array elements, which does not seem like a terribly
  unreasonable thing to do. In any case they are not great coding examples,
  and they could also do with comments pointing out that we're assuming we
  don't need pg_detoast_datum_packed.

  Per report from exe-dealer@yandex.ru. Patch by me, but thanks to David
  Johnston for initial investigation. Back-patch to v14 where jsonb
  subscripting was introduced.

  Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru

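  The unsafe pattern versus the fix, sketched for a text Datum d:

      text *t;

      /* wrong: d may be short-header, compressed, or out-of-line, so
       * VARDATA(t)/VARSIZE(t) can read garbage */
      t = (text *) DatumGetPointer(d);

      /* right: detoast first, then use the _ANY flavors */
      t = (text *) pg_detoast_datum_packed((struct varlena *) DatumGetPointer(d));
      elog(DEBUG1, "key is %.*s",
           (int) VARSIZE_ANY_EXHDR(t), VARDATA_ANY(t));
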
* Fix failure to advance content pointer in sendFileWithContent. (Robert Haas, 2022-12-12)

  If sendFileWithContent were used to send a file larger than the bbsink
  buffer size, this would result in corruption. The only files that are
  sent via sendFileWithContent are the backup label file, the tablespace
  map file, and .done files for WAL segments included in the backup. Of
  these, it seems that only the tablespace_map file can become large enough
  to cause a problem, and then only if you have a lot of tablespaces. If
  you do have that situation, you might end up with a corrupted
  tablespace_map file, which would be bad.

  My commit bef47ff85df18bf4a3a9b13bd2a54820e27f3614 introduced this
  problem.

  Report and patch by Antonin Houska.

  Discussion: http://postgr.es/m/15764.1670528645@antos

* Order getopt arguments (Peter Eisentraut, 2022-12-12)

  Order the letters in the arguments of getopt() and getopt_long(), as well
  as in the subsequent switch statements. In most cases, I used
  alphabetical with lower case first. In a few cases, existing different
  orders (e.g., upper case first) were kept to reduce the diff size.

  Discussion: https://www.postgresql.org/message-id/flat/3efd0fe8-351b-f836-9122-886002602357%40enterprisedb.com

* Get rid of recursion-marker values in enum AlterTableType (Alvaro Herrera, 2022-12-12)

  During ALTER TABLE execution, when prep-time handling of certain
  subcommand types determines that execution-time handling requires
  recursion, this is signaled by changing the subcommand type to a special
  value. This can be done in a simpler way by using a separate flag
  introduced by commit ec0925c22a3d, so do that.

  Catversion bumped. It's not clear to me that ALTER TABLE subcommands are
  stored anywhere in catalogs (CREATE FUNCTION rejects it in BEGIN ATOMIC
  function bodies), but we do have both write and read support for them, so
  be safe.

  Discussion: https://postgr.es/m/20220929090033.zxuaezcdwh2fgfjb@alvherre.pgsql

* Remove direct call to GetNewObjectId() for pg_auth_members.oid (Michael Paquier, 2022-12-12)

  This routine should not be called directly as mentioned at its top, so
  replace it by GetNewOidWithIndex(). Issue introduced by 6566133 when
  pg_auth_members.oid got added, so no backpatch is needed.

  Author: Maciek Sakrejda
  Discussion: https://postgr.es/m/CAOtHd0Ckbih7Ur7XeVyLAJ26VZOfTNcq9qV403bNF4uTGtAN+Q@mail.gmail.com

* Convert domain_in to report errors softly. (Tom Lane, 2022-12-11)

  This is straightforward as far as it goes. However, it does not attempt
  to trap errors occurring during the execution of domain CHECK
  constraints. Since those are general user-defined expressions, the only
  way to do that would involve starting up a subtransaction for each check.
  Of course the entire point of the soft-errors feature is to not need
  subtransactions, so that would be self-defeating. For now, we'll rely on
  the assumption that domain checks are written to avoid throwing errors.

  Discussion: https://postgr.es/m/1181028.1670635727@sss.pgh.pa.us

* Convert json_in and jsonb_in to report errors softly. (Tom Lane, 2022-12-11)

  This requires a bit of further infrastructure-extension to allow trapping
  errors reported by numeric_in and pg_unicode_to_server, but otherwise
  it's pretty straightforward.

  In the case of jsonb_in, we are only capturing errors reported during the
  initial "parse" phase. The value-construction phase (JsonbValueToJsonb)
  can also throw errors if assorted implementation limits are exceeded. We
  should improve that, but it seems like a separable project.

  Andrew Dunstan and Tom Lane

  Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net

* Change JsonSemAction to allow non-throw error reporting. (Tom Lane, 2022-12-11)

  Formerly, semantic action functions for the JSON parser returned void, so
  that there was no way for them to affect the parser's behavior. That
  means in particular that they can't force an error exit except by
  longjmp'ing. That won't do in the context of our project to make input
  functions return errors softly. Hence, change them to return the same
  JsonParseErrorType enum value as the parser itself uses. If an action
  function returns anything besides JSON_SUCCESS, the parse is abandoned
  and that error code is returned. Action functions can thus easily return
  the same error conditions that the parser already knows about.

  As an escape hatch for expansion, also invent a code
  JSON_SEM_ACTION_FAILED that the core parser does not know the exact
  meaning of. When returning this code, an action function must use some
  out-of-band mechanism for reporting the error details.

  This commit simply makes the API change and causes all the existing
  action functions to return JSON_SUCCESS, so that there is no actual
  change in behavior here. This is long enough and boring enough that it
  seemed best to commit it separately from the changes that make real use
  of the new mechanism.

  In passing, remove a duplicate assignment of
  transform_string_values_scalar.

  Discussion: https://postgr.es/m/1436686.1670701118@sss.pgh.pa.us

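  Under the new API an action function looks like this sketch (the function
  name and state type are hypothetical):

      typedef struct my_state
      {
          int depth;
          int max_depth;
      } my_state;

      static JsonParseErrorType
      my_object_start(void *state)
      {
          my_state *s = (my_state *) state;

          if (s->depth >= s->max_depth)
              return JSON_SEM_ACTION_FAILED;  /* details go out-of-band */
          s->depth++;
          return JSON_SUCCESS;
      }
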
* Standardize error reports in unimplemented I/O functions. (Tom Lane, 2022-12-10)

  We chose a specific wording of the not-implemented errors for pseudotype
  I/O functions and other cases where there's little value in implementing
  input and/or output. gtsvectorin never got that memo though, nor did most
  of contrib. Make these all fall in line, mostly because I'm a neatnik but
  also to remove unnecessary translatable strings.

  gbtreekey_in needs a bit of extra love since it supports multiple SQL
  types. Sadly, gbtreekey_out doesn't have the ability to do that, but I
  think it's unreachable anyway.

  Noted while surveying datatype input functions to see what we have left
  to fix.

* Use the macro, not handwritten code, to construct anymultirange_in(). (Tom Lane, 2022-12-10)

  Apparently anymultirange_in was written before we converted all these
  pseudotype input functions to use a common macro, and it didn't get fixed
  before committing. Sloppy merging probably explains its unintuitive
  ordering, too, so rearrange.

  Noted while surveying datatype input functions to see what we have left
  to fix. I'm inclined to leave the pseudotypes as throwing hard errors,
  because it's difficult to see a reason why anyone would need something
  else. But in any case, if we want to change that, we shouldn't have to
  change multiple copies of the code.

* Add subquery pullup handling for WindowClause runCondition (David Rowley, 2022-12-10)

  9d9c02ccd added code to allow WindowAgg to take some shortcuts when a
  monotonic WindowFunc reached some value that it could never come back
  from due to the function's monotonic nature. That commit added a
  runCondition field to WindowClause to store the condition which, when it
  becomes false, lets us start taking shortcuts in nodeWindowAgg.c.

  Here we fix an issue where subquery pullups didn't properly update the
  runCondition to update the Vars to properly reference the new query
  level.

  Here we also add a missing call to preprocess_expression() for the
  WindowClause's runCondition. The WindowFuncs in the targetlist will have
  had this processing done, so we must also do it for the WindowFuncs in
  the runCondition so that they can be correctly found in the targetlist
  during setrefs.c processing.

  Bug: #17709
  Reported-by: Alexey Makhmutov
  Author: Richard Guo, David Rowley
  Discussion: https://postgr.es/m/17709-4f557160e3e8ee9a@postgresql.org
  Backpatch-through: 15, where 9d9c02ccd was introduced

* Fix macro definitions in pgstatfuncs.c (Michael Paquier, 2022-12-10)

  Buildfarm member wrasse has been complaining about empty declarations as
  an effect of 8018ffb and 83a1a1b due to extra semicolons.

  While on it, also remove the trailing backslash at the end of the macro
  definitions, which caused more lines to be consumed than necessary, per
  comment from Tom Lane.

  Reported-by: Tom Lane, and buildfarm member wrasse
  Author: Nathan Bossart, Michael Paquier
  Discussion: https://postgr.es/m/1188769.1670640236@sss.pgh.pa.us

* Restructure soft-error handling in formatting.c. (Tom Lane, 2022-12-09)

  Replace the error trapping scheme introduced in 5bc450629 with our shiny
  new errsave/ereturn mechanism. This doesn't have any real functional
  impact (although I think that the new coding is able to report a few more
  errors softly than v15 did). And I doubt there's any measurable
  performance difference either. But this gets rid of an ad-hoc,
  one-of-a-kind design in favor of a mechanism that will be widely used
  going forward, so it should be a net win for code readability.

  Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru

* Convert datetime input functions to use "soft" error reporting. (Tom Lane, 2022-12-09)

  This patch converts the input functions for date, time, timetz,
  timestamp, timestamptz, and interval to the new soft-error style.

  There's some related stuff in formatting.c that remains to be cleaned up,
  but that seems like a separable project.

  Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru

* Allow DateTimeParseError to handle bad-timezone error messages. (Tom Lane, 2022-12-09)

  Pay down some ancient technical debt (dating to commit 022fd9966): fix a
  couple of places in datetime parsing that were throwing ereport's
  immediately instead of returning a DTERR code that could be interpreted
  by DateTimeParseError. The reason for that was that there was no
  mechanism for passing any auxiliary data (such as a zone name) to
  DateTimeParseError, and these errors seemed to really need it.

  Up to now it didn't matter that much just where the error got thrown, but
  now we'd like to have a hard policy that datetime parse errors get thrown
  from just the one place. Hence, invent a "DateTimeErrorExtra" struct that
  can be used to carry any extra values needed for specific DTERR codes.
  Perhaps in the future somebody will be motivated to use this to improve
  the specificity of other DateTimeParseError messages, but for now just
  deal with the timezone-error cases.

  This is on the way to making the datetime input functions report parse
  errors softly; but it's really an independent change, so commit
  separately.

  Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru

* Const-ify a couple of datetime parsing subroutines. (Tom Lane, 2022-12-09)

  More could be done in this line, but I just grabbed some low-hanging
  fruit. Principal objective was to remove the need for several ugly
  unconstify() usages in formatting.c.