aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
...
* Windows support in pg_import_system_collationsPeter Eisentraut2023-01-03
| | | | | | | | | | | | | | | | | | | | | Windows can enumerate the locales that are either installed or supported by calling EnumSystemLocalesEx(), similar to what is already done in the READ_LOCALE_A_OUTPUT switch. We can refactor some of the logic already used in that switch into a new function create_collation_from_locale(). The enumerated locales have BCP 47 shape, that is with a hyphen between language and territory, instead of POSIX's underscore. The created collations will retain the BCP 47 shape, but we will also create a POSIX alias, so xx-YY will have an xx_YY alias. A new test collate.windows.win1252 is added that is like collate.linux.utf8. Author: Juan Jose Santamaria Flecha <juanjo.santamaria@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/0050ec23-34d9-2765-9015-98c04f0e18ac@postgrespro.ru
* Fix typos in comments, code and documentationMichael Paquier2023-01-03
| | | | | | | | | | While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since a7e5457. Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11
* Avoid reference to nonexistent array element in ExecInitAgg().Tom Lane2023-01-02
| | | | | | | | | | | | | | | | When considering an empty grouping set, we fetched phasedata->eqfunctions[-1]. Because the eqfunctions array is palloc'd, that would always be an aset pointer in released versions, and thus the code accidentally failed to malfunction (since it would do nothing unless it found a null pointer). Nonetheless this seems like trouble waiting to happen, so add a check for length == 0. It's depressing that our valgrind testing did not catch this. Maybe we should reconsider the choice to not mark that word NOACCESS? Richard Guo Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Adjust VACUUM hastup LP_REDIRECT comments.Peter Geoghegan2023-01-02
| | | | | | | The term "truncation" has been ambiguous since commit 10a8d13823 added line pointer array truncation during heap pruning. Clear things up by specifying that we're talking about rel truncation here, to match nearby comments that apply to tuples with storage.
* Avoid special XID snapshotConflictHorizon values.Peter Geoghegan2023-01-02
| | | | | | | | | | | | | Don't allow VACUUM to WAL-log the value FrozenTransactionId as the snapshotConflictHorizon of freezing or visibility map related WAL records. The only special XID value that's an allowable snapshotConflictHorizon is InvalidTransactionId, which is interpreted as "record definitely doesn't require a recovery conflict". Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznuNGSzF8v6OsgjaC5aYsb3cZ6HW6MLm30X0d65cmSH6A@mail.gmail.com
* Push lpp variable closer to usage in heapgetpage()Peter Eisentraut2023-01-02
| | | | | Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
* Accept "+infinity" in date and timestamp[tz] input.Tom Lane2023-01-01
| | | | | | | | | The float and numeric types accept this variant spelling of "infinity", so it seems like the datetime types should too. Vik Fearing, some cosmetic mods by me Discussion: https://postgr.es/m/d0bef637-2dbd-0a5d-e539-48243b6f6c5e@postgresfriends.org
* Fix assert in BRIN build_distancesTomas Vondra2022-12-30
| | | | | | | | | | | | | | | | | | | | | When brin_minmax_multi_union merges summaries, we may end up with just a single range after merge_overlapping_ranges. The summaries may contain just one range each, and they may overlap (or be exactly the same). With a single range there's no distance to calculate, but we happen to call build_distances anyway - which is fine, we don't calculate the distance in this case, except that with asserts this failed due to a check there are at least two ranges. The assert is unnecessarily strict, so relax it a bit and bail out if there's just a single range. The relaxed assert would be enough, but this way we don't allocate unnecessary memory for distance. Backpatch to 14, where minmax-multi opclasses were introduced. Reported-by: Jaime Casanova Backpatch-through: 14 Discussion: https://postgr.es/m/YzVA55qS0hgz8P3r@ahch-to
* Fix precision handling for some COERCE_SQL_SYNTAX functionsMichael Paquier2022-12-30
| | | | | | | | | | | | | | | | | | | | | | | f193883 has been incorrectly setting up the precision used in the timestamp compilations returned by the following functions: - LOCALTIME - LOCALTIMESTAMP - CURRENT_TIME - CURRENT_TIMESTAMP Specifying an out-of-range precision for CURRENT_TIMESTAMP and LOCALTIMESTAMP was raising a WARNING without adjusting the precision, leading to a subsequent error. LOCALTIME and CURRENT_TIME raised a WARNING without an error, still the precision given to the internal routines was not correct, so let's be clean. Ian has reported the problems in timestamp.c, while I have noticed the ones in date.c. Regression tests are added for all of them with precisions high enough to provide coverage for the warnings, something that went missing up to this commit. Author: Ian Lawrence Barwick, Michael Paquier Discussion: https://postgr.es/m/CAB8KJ=jQEnn9sYG+N752spt68wMrhmT-ocHCh4oeNmHF82QMWA@mail.gmail.com
* Change argument of appendBinaryStringInfo from char * to void *Peter Eisentraut2022-12-30
| | | | | | | | | There is some code that uses this function to assemble some kind of packed binary layout, which requires a bunch of casts because of this. Functions taking binary data plus length should take void * instead, like memcpy() for example. Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
* Use appendStringInfoString instead of appendBinaryStringInfo where possiblePeter Eisentraut2022-12-30
| | | | | | | | | For the jsonpath output, we don't need to squeeze out every bit of performance, so instead use a more robust coding style. There are similar calls in jsonb.c, which we leave alone here since there is indeed a performance impact for bulk exports. Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
* Add const to BufFileWritePeter Eisentraut2022-12-30
| | | | | | | | Make data buffer argument to BufFileWrite a const pointer and bubble this up to various callers and related APIs. This makes the APIs clearer and more consistent. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
* Remove unnecessary castsPeter Eisentraut2022-12-30
| | | | | | | | Some code carefully cast all data buffer arguments for data write and read function calls to void *, even though the respective arguments are already void *. Remove this unnecessary clutter. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
* Add page-level freezing to VACUUM.Peter Geoghegan2022-12-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Teach VACUUM to decide on whether or not to trigger freezing at the level of whole heap pages. Individual XIDs and MXIDs fields from tuple headers now trigger freezing of whole pages, rather than independently triggering freezing of each individual tuple header field. Managing the cost of freezing over time now significantly influences when and how VACUUM freezes. The overall amount of WAL written is the single most important freezing related cost, in general. Freezing each page's tuples together in batch allows VACUUM to take full advantage of the freeze plan WAL deduplication optimization added by commit 9e540599. Also teach VACUUM to trigger page-level freezing whenever it detects that heap pruning generated an FPI. We'll have already written a large amount of WAL just to do that much, so it's very likely a good idea to get freezing out of the way for the page early. This only happens in cases where it will directly lead to marking the page all-frozen in the visibility map. In most cases "freezing a page" removes all XIDs < OldestXmin, and all MXIDs < OldestMxact. It doesn't quite work that way in certain rare cases involving MultiXacts, though. It is convenient to define "freeze the page" in a way that gives FreezeMultiXactId the leeway to put off the work of processing an individual tuple's xmax whenever it happens to be a MultiXactId that would require an expensive second pass to process aggressively (allocating a new multi is especially worth avoiding here). FreezeMultiXactId is eager when processing is cheap (as it usually is), and lazy in the event of an individual multi that happens to require expensive second pass processing. This avoids regressions related to processing of multis that page-level freezing might otherwise cause. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Jeff Davis <pgsql@j-davis.com> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
* Suppress uninitialized-variable warning from a61b1f748.Tom Lane2022-12-27
| | | | | | | | | | | | Some compilers complain about sub_rteperminfos not being initialized, evidently because they don't detect that it is only used and set if isGeneralSelect is true. Make it follow the long-established pattern for its sibling variable sub_rtable. Per reports from Pavel Stehule and the buildfarm. Discussion: https://postgr.es/m/CAFj8pRDOvGOi-n616kM0Cc7qSbg_nGoS=-haB+D785sUXADqSg@mail.gmail.com
* Remove new locale dependency in regproc regression test.Tom Lane2022-12-27
| | | | | | | | | | | The modified error message for regcollationin failure includes the database encoding, which it should've occurred to me is a portability hazard for the regression tests. Adjust the test so the expected output doesn't include that. In passing, fix a comment typo introduced in b8c0ffbd2. Per buildfarm.
* Simplify the implementations of the to_reg* functions.Tom Lane2022-12-27
| | | | | | | | | | | | | | | Given the soft-input-error feature, we can reduce these functions to be just thin wrappers around a soft-error call of the corresponding datatype input function. This means less code and more certainty that the to_reg* functions match the normal input behavior. Notably, it also means that they will accept numeric OID input, which they didn't before. It's not clear to me if that omission had more than laziness behind it, but it doesn't seem like something we need to work hard to preserve. Discussion: https://postgr.es/m/3910031.1672095600@sss.pgh.pa.us
* Convert the reg* input functions to report (most) errors softly.Tom Lane2022-12-27
| | | | | | | | | | | | | | | | | | | | | | | | This is not really complete, but it catches most cases of practical interest. The main omissions are: * regtype, regprocedure, and regoperator parse type names by calling the main grammar, so any grammar-detected syntax error will still be a hard error. Also, if one includes a type modifier in such a type specification, errors detected by the typmodin function will be hard errors. * Lookup errors are handled just by passing missing_ok = true to the relevant catalog lookup function. Because we've used quite a restrictive definition of "missing_ok", this means that edge cases such as "the named schema exists, but you lack USAGE permission on it" are still hard errors. It would make sense to me to replace most/all missing_ok parameters with an escontext parameter and then allow these additional lookup failure cases to be trapped too. But that's a job for some other day. Discussion: https://postgr.es/m/3342239.1671988406@sss.pgh.pa.us
* Convert tsqueryin and tsvectorin to report errors softly.Tom Lane2022-12-27
| | | | | | | | | | | | | | | This is slightly tedious because the adjustments cascade through a couple of levels of subroutines, but it's not very hard. I chose to avoid changing function signatures more than absolutely necessary, by passing the escontext pointer in existing structs where possible. tsquery's nuisance NOTICEs about empty queries are suppressed in soft-error mode, since they're not errors and we surely don't want them to be shown to the user anyway. Maybe that whole behavior should be reconsidered. Discussion: https://postgr.es/m/3824377.1672076822@sss.pgh.pa.us
* Detect bad input for types xid, xid8, and cid.Tom Lane2022-12-27
| | | | | | | | | | | | | | | | | | | | | | Historically these input functions just called strtoul or strtoull and returned the result, with no error detection whatever. Upgrade them to reject garbage input and out-of-range values, similarly to our other numeric input routines. To share the code for this with type oid, adjust the existing "oidin_subr" to be agnostic about the SQL name of the type it is handling, and move it to numutils.c; then clone it for 64-bit types. Because the xid types previously accepted hex and octal input by reason of calling strtoul[l] with third argument zero, I made the common subroutine do that too, with the consequence that type oid now also accepts hex and octal input. In view of 6fcda9aba, that seems like a good thing. While at it, simplify the existing over-complicated handling of syntax errors from strtoul: we only need one ereturn not three. Discussion: https://postgr.es/m/3526121.1672000729@sss.pgh.pa.us
* Remove overzealous MultiXact freeze assertion.Peter Geoghegan2022-12-26
| | | | | | | | | | | | | When VACUUM determines that an existing MultiXact should use a freeze plan that sets xmax to InvalidTransactionId, the original Multi may or may not be before OldestMxact. Remove an incorrect assertion that expected it to always be from before OldestMxact. Oversight in commit 4ce3af. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/TYAPR01MB5866B24104FD80B5D7E65C3EF5ED9@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Add 'logical_decoding_mode' GUC.Amit Kapila2022-12-26
| | | | | | | | | | | | | | | | This enables streaming or serializing changes immediately in logical decoding. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem is reached. This helps in reducing the timing of existing tests related to logical replication of in-progress transactions and will help in writing tests for for the upcoming feature for parallelly applying large in-progress transactions. Author: Shi yu Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Convert enum_in() to report errors softly.Tom Lane2022-12-25
| | | | | | | | I missed this in my initial survey, probably because I examined the contents of pg_type in the postgres database, which lacks any enumerated types. Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com
* Convert jsonpath's input function to report errors softlyAndrew Dunstan2022-12-24
| | | | | | Reviewed by Tom Lane Discussion: https://postgr.es/m/a8dc5700-c341-3ba8-0507-cc09881e6200@dunslane.net
* Make the numeric-OID cases of regprocin and friends be non-throwing.Tom Lane2022-12-24
| | | | | | | | | While at it, use a common subroutine already. This doesn't move the needle very far in terms of making these functions non-throwing; the only case we're now able to trap is numeric-OID-is-out-of-range. Still, it seems like a pretty non-controversial step in that direction.
* Fix bug in translate_col_privs_multilevelDavid Rowley2022-12-24
| | | | | | | | | | | | | Fix incorrect code which was trying to convert a Bitmapset of columns at the attnums according to a parent table and transform them into the equivalent Bitmapset with same attnums according to the given child table. This code is new as of a61b1f748 and was failing to do the correct translation when there was an intermediate parent table between 'rel' and 'top_parent_rel'. Reported-by: Ranier Vilela Author: Richard Guo, Amit Langote Discussion: https://postgr.es/m/CAEudQArohfB_Gy%2BhcH2-bANUkxgjJiP%3DABq01_LgTNTbcNijag%40mail.gmail.com
* Allow parent's WaitEventSets to be freed after fork().Thomas Munro2022-12-23
| | | | | | | | | | | | | An epoll fd belonging to the parent should be closed in the child. A kqueue fd is automatically closed by fork(), but we should still adjust our counter. For poll and Windows systems, nothing special is required. On all systems we free the memory. No caller yet, but we'll need this if we start using WaitEventSet in the postmaster as planned. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
* Don't leak a signalfd when using latches in the postmaster.Thomas Munro2022-12-23
| | | | | | | | | | | At the time of commit 6a2a70a02 we didn't use latch infrastructure in the postmaster. We're planning to start doing that, so we'd better make sure that the signalfd inherited from a postmaster is not duplicated and then leaked in the child. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
* Add WL_SOCKET_ACCEPT event to WaitEventSet API.Thomas Munro2022-12-23
| | | | | | | | | | | | | | | To be able to handle incoming connections on a server socket with the WaitEventSet API, we'll need a new kind of event to indicate that the the socket is ready to accept a connection. On Unix, it's just the same as WL_SOCKET_READABLE, but on Windows there is a different underlying kernel event that we need to map our abstraction to. No user yet, but a proposed patch would use this. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
* Fix come incorrect elog() messages in aclchk.cMichael Paquier2022-12-23
| | | | | | | | | | | | | | Three error strings used with cache lookup failures were referring to incorrect object types for ACL checks: - Schemas - Types - Foreign Servers There errors should never be triggered, but if they do incorrect information would be reported. Author: Justin Pryzby Discussion: https://postgr.es/m/20221222153041.GN1153@telsasoft.com Backpatch-through: 11
* Rename pg_dissect_walfile_name() to pg_split_walfile_name()Michael Paquier2022-12-23
| | | | | | | | | | | | | | | The former name was discussed as being confusing, so use "split", as per a suggestion from Magnus Hagander. While on it, one of the output arguments is renamed from "segno" to "segment_number", as per a suggestion from Kyotaro Horiguchi. The documentation is updated to reflect all these changes. Bump catalog version. Author: Bharath Rupireddy, Michael Paquier Discussion: https://postgr.es/m/CABUevEytQVaOOhGdoh0D7hGwe3fuKcRF6NthsSW7ww04EmtFgQ@mail.gmail.com
* Allow window functions to adjust their frameOptionsDavid Rowley2022-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | WindowFuncs such as row_number() don't care if it's called with ROWS UNBOUNDED PRECEDING AND CURRENT ROW or with RANGE UNBOUNDED PRECEDING AND CURRENT ROW. The latter is less efficient as the RANGE option requires that the executor check for peer rows, so using the ROW option instead would cause less overhead. Because RANGE is part of the default frame options for WindowClauses, it means WindowAgg is, by default, working much harder than it needs to for window functions where the ROWS / RANGE option has no effect on the window function's result. On a test query from the discussion thread, a performance improvement of 344% was seen by using ROWS instead of RANGE. Here we add a new support function node type to allow support functions to be called for window functions so that the most optimal version of the frame options can be set. The planner has been adjusted so that the frame options are changed only if all window functions sharing the same window clause agree on what the optimized frame options are. Here we give the ability for row_number(), rank(), dense_rank(), percent_rank(), cume_dist() and ntile() to alter their WindowClause's frameOptions. Reviewed-by: Vik Fearing, Erwin Brandstetter, Zhihong Yu Discussion: https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6+S88z+A@mail.gmail.com
* Improve notation of cacheinfo table in syscache.c.Thomas Munro2022-12-23
| | | | | | | | | | Use C99 designated initializer syntax for the array elements, instead of writing the enumerator name and position in a comment. Replace nkeys and key with a local variadic macro, for a shorter notation. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/CA%2BhUKGKdpDjKL2jgC-GpoL4DGZU1YPqnOFHbDqFkfRQcPaR5DQ%40mail.gmail.com
* Use scanned_pages to decide when to failsafe check.Peter Geoghegan2022-12-22
| | | | | | | | | | | | | | | | Perform a failsafe check every time VACUUM's first heap scan scans a further FAILSAFE_EVERY_PAGES pages, rather than using an approach based on the number of physical blocks that our current blkno is from the blkno at the time of the previous failsafe check. That way VACUUM will perform a failsafe check every time it has scanned a uniform number of pages, without it mattering when or how VACUUM skipped pages using the visibility map. Sami Imseih, with changes to FAILSAFE_EVERY_PAGES comments added by me. Author: Sami Imseih <simseih@amazon.com> Reviewed-By: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/401CE010-4049-4B94-9961-0B610A5D254D%40amazon.com
* Refactor how VACUUM passes around its XID cutoffs.Peter Geoghegan2022-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | Use a dedicated struct for the XID/MXID cutoffs used by VACUUM, such as FreezeLimit and OldestXmin. This state is initialized in vacuum.c, and then passed around by code from vacuumlazy.c to heapam.c freezing related routines. The new convention is that everybody works off of the same cutoff state, which is passed around via pointers to const. Also simplify some of the logic for dealing with frozen xmin in heap_prepare_freeze_tuple: add dedicated "xmin_already_frozen" state to clearly distinguish xmin XIDs that we're going to freeze from those that were already frozen from before. That way the routine's xmin handling code is symmetrical with the existing xmax handling code. This is preparation for an upcoming commit that will add page level freezing. Also refactor the control flow within FreezeMultiXactId(), while adding stricter sanity checks. We now test OldestXmin directly, instead of using FreezeLimit as an inexact proxy for OldestXmin. This is further preparation for the page level freezing work, which will make the function's caller cede control of page level freezing to the function where appropriate (where heap_prepare_freeze_tuple sees a tuple that happens to contain a MultiXactId in its xmax). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CAH2-WznS9TxXmz2_=SY+SyJyDFbiOftKofM9=aDo68BbXNBUMA@mail.gmail.com
* Avoid O(N^2) cost when pulling up lots of UNION ALL subqueries.Tom Lane2022-12-22
| | | | | | | | | | | | | | | | | | | perform_pullup_replace_vars() knows how to scan the whole parent query tree when we are replacing Vars during a subquery flattening operation. However, for the specific case of flattening a UNION ALL leaf query, that's mostly wasted work: the only place where relevant Vars could exist is in the AppendRelInfo that we just made for this leaf. Teaching perform_pullup_replace_vars() to just deal with that and exit is worthwhile because, if we have N such subqueries to pull up, we were spending O(N^2) work uselessly mutating the AppendRelInfos for all the other subqueries. While we're at it, avoid calling substitute_phv_relids if there are no PlaceHolderVars, and remove an obsolete check of parse->hasSubLinks. Andrey Lepikhov and Tom Lane Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
* Add some recursion and looping defenses in prepjointree.c.Tom Lane2022-12-22
| | | | | | | | | | | | | Andrey Lepikhov demonstrated a case where we spend an unreasonable amount of time in pull_up_subqueries(). Not only is that recursing with no explicit check for stack overrun, but the code seems not interruptable by control-C. Let's stick a CHECK_FOR_INTERRUPTS there, along with sprinkling some stack depth checks. An actual fix for the excessive time consumption seems a bit risky to back-patch; but this isn't, so let's do so. Discussion: https://postgr.es/m/703c09a2-08f3-d2ec-b33d-dbecd62428b8@postgrespro.ru
* Remove dead codePeter Eisentraut2022-12-22
| | | | | The second appearance of NamespaceRelationId in this if-else chain is in error and can be removed.
* Fix operator typo in tablecmds.cMichael Paquier2022-12-22
| | | | | | | | | | | | | | A bitwise operator was getting used on two bools in ATAddCheckConstraint() to track if constraints should be merged or not with the existing ones of a relation, though obviously this should use a boolean OR operator. This led to the same result, but let's be clean. Oversight in 074c5cf. Author: Ranier Vilela Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAEudQAp2R2fbbi0OHHhv_n4=Ch0t1VtjObR9YMqtGKHJ+faUFQ@mail.gmail.com
* Add palloc_aligned() to allow aligned memory allocationsDavid Rowley2022-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces palloc_aligned() and MemoryContextAllocAligned() which allow callers to obtain memory which is allocated to the given size and also aligned to the specified alignment boundary. The alignment boundaries may be any power-of-2 value. Currently, the alignment is capped at 2^26, however, we don't expect values anything like that large. The primary expected use case is to align allocations to perhaps CPU cache line size or to maybe I/O page size. Certain use cases can benefit from having aligned memory by either having better performance or more predictable performance. The alignment is achieved by requesting 'alignto' additional bytes from the underlying allocator function and then aligning the address that is returned to the requested alignment. This obviously does waste some memory, so alignments should be kept as small as what is required. It's also important to note that these alignment bytes eat into the maximum allocation size. So something like: palloc_aligned(MaxAllocSize, 64, 0); will not work as we cannot request MaxAllocSize + 64 bytes. Additionally, because we're just requesting the requested size plus the alignment requirements from the given MemoryContext, if that context is the Slab allocator, then since slab can only provide chunks of the size that's specified when the slab context is created, then this is not going to work. Slab will generate an error to indicate that the requested size is not supported. The alignment that is requested in palloc_aligned() is stored along with the allocated memory. This allows the alignment to remain intact through repalloc() calls. Author: Andres Freund, David Rowley Reviewed-by: Maxim Orlov, Andres Freund, John Naylor Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com
* Introduce float4in_internalAndrew Dunstan2022-12-21
| | | | | | | | | | | | | This is the guts of float4in, callable as a routine to input floats, which will be useful in an upcoming patch for allowing soft errors in the seg module's input function. A similar operation was performed some years ago for float8in in commit 50861cd683e. Reviewed by Tom Lane Discussion: https://postgr.es/m/cee4e426-d014-c0b7-aa22-a659f2cd9130@dunslane.net
* Fix newly introduced bug in slab.cDavid Rowley2022-12-22
| | | | | | | | | | | | | | | | | | | | d21ded75f changed the way slab.c works but introduced a bug that meant we could end up with the slab's curBlocklistIndex pointing to the wrong list. The condition which was checking for this was failing to account for two things: 1. The curBlocklistIndex could be 0 as we've currently got no non-full blocks to put chunks on. In this case, the dlist_is_empty() check cannot be performed as there can be any number of completely full blocks at that index. 2. The curBlocklistIndex may be greater than the index we just moved the block onto. Since we need to ensure we fill up fuller blocks first, we must reset curBlocklistIndex when changing any blocklist element that's less than the curBlocklistIndex too. Reported-by: Takamichi Osumi Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com
* Switch some system functions to use get_call_result_type()Michael Paquier2022-12-21
| | | | | | | | | | | | | | | | | This shaves some code by replacing the combinations of CreateTemplateTupleDesc()/TupleDescInitEntry() hardcoding a mapping of the attributes listed in pg_proc.dat by get_call_result_type() to build the TupleDesc needed for the rows generated. get_call_result_type() is more expensive than the former style, but this removes some duplication with the lists of OUT parameters (pg_proc.dat and the attributes hardcoded in these code paths). This is applied to functions that are not considered as critical (aka that could be called repeatedly for monitoring purposes). Author: Bharath Rupireddy Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com
* Add copyright notices to meson filesAndrew Dunstan2022-12-20
| | | | Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
* Allow batching of inserts during cross-partition updates.Etsuro Fujita2022-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 927f453a9 disallowed batching added by commit b663a4136 to be used for the inserts performed as part of cross-partition updates of partitioned tables, mainly because the previous code in nodeModifyTable.c couldn't handle pending inserts into foreign-table partitions that are also UPDATE target partitions. But we don't have such a limitation anymore (cf. commit ffbb7e65a), so let's allow for this by removing from execPartition.c the restriction added by commit 927f453a9 that batching is only allowed if the query command type is CMD_INSERT. In postgres_fdw, since commit 86dc90056 changed it to effectively disable cross-partition updates in the case where a foreign-table partition chosen to insert rows into is also an UPDATE target partition, allow batching in the case where a foreign-table partition chosen to do so is *not* also an UPDATE target partition. This is enabled by the "batch_size" option added by commit b663a4136, which is disabled by default. This patch also adjusts the test case added by commit 927f453a9 to confirm that the inserts performed as part of a cross-partition update of a partitioned table indeed uses batching. Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu, Bharath Rupireddy, Hou Zhijie, Vignesh C, and me. Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com
* Add enable_presorted_aggregate GUCDavid Rowley2022-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1349d279 added query planner support to allow more efficient execution of aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to that commit, the planner would only request that the lower planner produce a plan with the order required for the GROUP BY clause and it would be left up to nodeAgg.c to perform the final sort of records within each group so that the aggregate transition functions were called in the correct order. Now that the planner requests the lower planner produce a plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind, there is the possibility that the planner chooses a plan which could be less efficient than what would have been produced before 1349d279. While developing 1349d279, I had in mind that Incremental Sort would help us in cases where an index exists only on the GROUP BY column(s). Incremental Sort would just replace the implicit tuplesorts which are being performed in nodeAgg.c. However, because the planner has the flexibility to instead choose a plan which just performs a full sort on both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is potential for the planner to make a bad choice. The costing for Incremental Sort is not perfect as it assumes an even distribution of rows to sort within each sort group. Here we add an escape hatch in the form of the enable_presorted_aggregate GUC. This will allow users to get the pre-PG16 behavior in cases where they have no other means to convince the query planner to produce a plan which only sorts on the GROUP BY column(s). Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com
* Improve the performance of the slab memory allocatorDavid Rowley2022-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Slab has traditionally been fairly slow when compared with the AllocSet or Generation memory allocators. Part of this slowness came from having to write out an entire block when we allocate a new block in order to populate the free list indexes within the block's memory. Additional slowness came from having to move a block onto another dlist each time we palloc or pfree a chunk from it. Here we optimize both of those cases and do a little bit extra to improve the performance of the slab allocator. Here, instead of writing out the free list indexes when allocating a new block, we introduce the concept of "unused" chunks. When a block is first allocated all chunks are unused. These chunks only make it onto the free list when they are pfree'd. When allocating new chunks on an existing block, we have the choice of consuming a chunk from the free list or an unused chunk. When both exist, we opt to use one from the free list, as these have been used already and the memory of them is more likely to be cached by the CPU. Here we also reduce the number of block lists from there being one for every possible value of free chunks on a block to just having a small fixed number of block lists. We keep the 0th block list for completely full blocks and anything else stores blocks for some range of free chunks with fuller blocks appearing on lower block list array elements. This reduces how often we must move a block to another list when we allocate or free chunks, but still allows us to prefer to put new chunks on fuller blocks and perhaps allow blocks with fewer chunks to be free'd later once all their remaining chunks have been pfree'd. Additionally, we now store a list of "emptyblocks", which are blocks that no longer contain any allocated chunks. We now keep up to 10 of these around to avoid having to thrash malloc/free when allocation patterns continually cause blocks to become free of any allocated chunks only to allocate more chunks again. Now only once we have 10 of these, we free the block. This does raise the high water mark for the total memory that a slab context can consume. It does not seem entirely unreasonable that we might one day want to make this a property of SlabContext rather than a compile-time constant. Let's wait and see if there is any evidence to support that this is required before doing it. Author: Andres Freund, David Rowley Tested-by: Tomas Vondra, John Naylor Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de
* Move variable increment to the end of the loopJohn Naylor2022-12-20
| | | | | | | | | | This is less error prone and matches the placement of other code in the file. Justin Pryzby Reviewed by Tom Lane Discussion: https://www.postgresql.org/message-id/20221123172436.GJ11463@telsasoft.com
* Add pg_dissect_walfile_name()Michael Paquier2022-12-20
| | | | | | | | | | | | | | | | | | | This function takes in input a WAL segment name and returns a tuple made of the segment sequence number (dependent on the WAL segment size of the cluster) and its timeline, as of a thin SQL wrapper around the existing XLogFromFileName(). This function has multiple usages, like being able to compile a LSN from a file name and an offset, or finding the timeline of a segment without having to do to some maths based on the first eight characters of the segment. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael Paquier Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com