path: root/src/backend/utils/adt
...
* Fix issues in binary_upgrade_logical_slot_has_caught_up(). (Amit Kapila, 2023-12-07)

  The commit 29d0a77fa6 labelled binary_upgrade_logical_slot_has_caught_up() as a non-strict function to allow providing a better error message to callers in case the passed slot_name is NULL. On further discussion, it seems that it is not helpful to have a different error message for NULL input in this function, so this patch marks the function as strict.

  This patch also removes the explicit permission check to use replication slots, as this function is invoked only by superusers, and instead adds an Assert.

  Reported-by: Masahiko Sawada
  Author: Hayato Kuroda
  Reviewed-by: Vignesh C
  Discussion: https://postgr.es/m/CAD21AoDSyiBKkMXBxN_gUayZZUCOgyHnG8Ge8rcPXNP3Tf6B4g@mail.gmail.com

* Apply quotes more consistently to GUC names in logs (Michael Paquier, 2023-11-30)

  Quotes are applied to GUCs in a very inconsistent way across the code base, with a mix of double quotes or no quotes used. This commit removes double quotes around all the GUC names that are obviously referred to as parameters with non-English words (use of underscore, mixed case, etc).

  This is the result of a discussion with Álvaro Herrera, Nathan Bossart, Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson.

  Author: Peter Smith
  Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com

* Don't use bms_membership() in cases where we don't need to (David Rowley, 2023-11-28)

  00b41463c adjusted Bitmapset so that an empty set is always represented as NULL. This makes checking for empty sets far cheaper than it used to be.

  There were various places in the code where we'd call bms_membership() to handle the 3 possible BMS_Membership values. For the BMS_SINGLETON case, we'd also call bms_singleton_member() to find the single set member. This can now be done in a more optimal way by first checking if the set is NULL and then, instead of bothering with bms_membership(), simply calling bms_get_singleton_member() to find the single member. This function will return false if there are multiple members in the set.

  Here we also tidy up some logic in examine_variable() for the single-member case. There's now no need to call bms_is_member(), as we've already established that we're working with a singleton Bitmapset, so we can just check if varRelid matches the singleton member.

  Reviewed-by: Richard Guo
  Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com

* Guard against overflow in interval_mul() and interval_div(). (Dean Rasheed, 2023-11-18)

  Commits 146604ec43 and a898b409f6 added overflow checks to interval_mul(), but not to interval_div(), which contains almost identical code, and so is susceptible to the same kinds of overflows. In addition, those checks did not catch all possible overflow conditions.

  Add additional checks to the "cascade down" code in interval_mul(), and copy all the overflow checks over to the corresponding code in interval_div(), so that they both generate "interval out of range" errors, rather than returning bogus results.

  Given that these errors are relatively easy to hit, back-patch to all supported branches.

  Per bug #18200 from Alexander Lakhin, and subsequent investigation.

  Discussion: https://postgr.es/m/18200-5ea288c7b2d504b1%40postgresql.org

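  A quick way to see the intended behavior (a hedged sketch; the values are illustrative, not taken from the commit):

      -- Both should now raise "interval out of range" instead of
      -- silently returning a bogus result:
      SELECT interval '100000000 years' * 10000000;
      SELECT interval '100000000 years' / 0.0000001;
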
* Extract column statistics from CTE references, if possible. (Tom Lane, 2023-11-17)

  examine_simple_variable() left this as an unimplemented case years ago, with the result that plans for queries involving un-flattened CTEs might be much stupider than necessary. It's not hard to extend the existing logic for RTE_SUBQUERY cases to also be able to drill down into CTEs, so let's do that.

  There was some discussion of whether this patch breaks the idea of a MATERIALIZED CTE being an optimization fence. We concluded it's okay, because we already allow the outer planner level to see the estimated width and rowcount of the CTE result, and letting it see column statistics too seems fairly equivalent. Basically, what we expect of the optimization fence is that the outer query should not affect the plan chosen for the CTE query. Once that plan is chosen, it's okay for the outer planner level to make use of whatever information we have about it.

  Jian Guo and Tom Lane, per complaint from Hans Buschmann
  Discussion: https://postgr.es/m/4504e67078d648cdac3651b2960da6e7@nidsa.net

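  As a hedged illustration (the tables here are hypothetical, not from the commit), the outer-level estimate can now draw on column statistics through a materialized CTE:

      -- Hypothetical schema: t(id, payload), u(id).  The planner can now
      -- use statistics on t.id when estimating the outer join, even though
      -- the MATERIALIZED CTE remains a fence for plan choice.
      EXPLAIN
      WITH cte AS MATERIALIZED (SELECT id, payload FROM t)
      SELECT * FROM cte JOIN u USING (id);
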
* Don't specify number of dimensions in cases where we don't know it. (Tom Lane, 2023-11-17)

  A few places in array_in() and plperl would report a misleading value (always MAXDIM+1) for the number of dimensions in the input, because we'd error out as soon as that was clearly too large rather than scanning the entire input. There doesn't seem to be much value in offering the true number, at least not enough to justify the extra complication involved in trying to get it. So just remove that parenthetical remark. We already have other places that do it like that, anyway.

  Per suggestions from Alexander Lakhin and Heikki Linnakangas.
  Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us

* Add target "slru" to pg_stat_reset_shared() (Michael Paquier, 2023-11-16)

  Currently, pg_stat_reset_shared() cannot reset the counters in the view pg_stat_slru even if it is a type of shared stats. This patch adds support for a new value in pg_stat_reset_shared(), called "slru", able to do that. Note that pg_stat_reset_shared(NULL) also resets SLRU counters.

  There may be a point in removing pg_stat_reset_slru() that was introduced in 28cac71bd368 (v13~) as the new option overlaps with this function, but we would lose the ability to reset individual SLRU counters. This is left for future reconsideration.

  Author: Atsushi Torikoshi
  Discussion: https://postgr.es/m/e3c25d72e81378e7b64f3c52e0306fc9@oss.nttdata.com

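  A hedged usage sketch of the new target:

      SELECT pg_stat_reset_shared('slru');  -- resets the pg_stat_slru counters
      -- pg_stat_reset_slru(name) remains available for resetting the
      -- counters of a single SLRU cache.
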
* Support +/- infinity in the interval data type. (Dean Rasheed, 2023-11-14)

  This adds support for infinity to the interval data type, using the same input/output representation as the other date/time data types that support infinity. This allows various arithmetic operations on infinite dates, timestamps and intervals.

  The new values are represented by setting all fields of the interval to INT32/64_MIN for -infinity, and INT32/64_MAX for +infinity. This ensures that they compare as less/greater than all other interval values, without the need for any special-case comparison code.

  Note that, since those 2 values were formerly accepted as legal finite intervals, pg_upgrade and dump/restore from an old database will turn them from finite to infinite intervals. That seems OK, since those exact values should be extremely rare in practice, and they are outside the documented range supported by the interval type, which gives us a certain amount of leeway.

  Bump catalog version.

  Joseph Koshakow, Jian He, and Ashutosh Bapat, reviewed by me.
  Discussion: https://postgr.es/m/CAAvxfHea4%2BsPybKK7agDYOMo9N-Z3J6ZXf3BOM79pFsFNcRjwA%40mail.gmail.com

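  A hedged sketch of the new values, per the semantics described above:

      SELECT interval 'infinity';                                  -- infinity
      SELECT interval '-infinity' < interval '-178000000 years';   -- true
      SELECT timestamp '2023-11-14 00:00' + interval 'infinity';   -- infinity
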
* Improve readability and error detection of array_in(). (Tom Lane, 2023-11-13)

  Rewrite array_in() and its subroutines so that we make only one pass over the input text, rather than two. This requires potentially re-pallocing the working arrays values[] and nulls[] larger than our initial guess, but that cost will hopefully be made up by avoiding duplicate parsing. In any case this coding seems much clearer and more straightforward than what we had before.

  This also fixes array_in() to reject non-rectangular input (that is, different brace depths in different parts of the input) more reliably than before, and to give a better error message when it does so. This is analogous to the plpython and plperl fixes in 0553528e7 and f47004add. Like those PLs, we now accept input such as '{{},{}}' as a valid representation of an empty array, which we did not before.

  Additionally, reject explicit array subscripts that are outside the integer range (previously you just got whatever atoi() converted them to), and make some other minor improvements in error reporting. Although this is arguably a bug fix, it's also a behavioral change that might trip somebody up, so no back-patch.

  Tom Lane, Heikki Linnakangas, and Jian He. Thanks to Alexander Lakhin for the initial report and for review/testing.
  Discussion: https://postgr.es/m/2794005.1683042087@sss.pgh.pa.us

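  Hedged examples of the behavioral changes (the exact error wording may differ):

      SELECT '{{},{}}'::int[];       -- now accepted: the empty array {}
      SELECT '{{1,2},{3}}'::int[];   -- non-rectangular: rejected, with a
                                     -- better error message than before
      SELECT '[99999999999:99999999999]={1}'::int[];
                                     -- subscript outside the integer range:
                                     -- now rejected outright
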
* Add ability to reset all shared stats types in pg_stat_reset_shared() (Michael Paquier, 2023-11-12)

  Currently, pg_stat_reset_shared() can use an argument to specify the target of statistics to reset, doing nothing for NULL as it is strict.

  This patch adds to pg_stat_reset_shared() the possibility to reset all the stats types already handled in this function rather than do nothing if the argument value given is NULL or if nothing is specified (proisstrict is switched to false). Like previously, SLRUs are not included in what gets reset.

  The idea to use NULL or no argument to control if all the shared stats already covered by this function should be reset has been proposed by Andres Freund.

  Bump catalog version.

  Author: Atsushi Torikoshi
  Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy, Matthias van de Meent
  Discussion: https://postgr.es/m/4291a55137ddda77cf7cc5f46e846daf@oss.nttdata.com

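  Hedged usage sketch:

      SELECT pg_stat_reset_shared();      -- resets all covered shared stats
      SELECT pg_stat_reset_shared(NULL);  -- same effect
      -- SLRU counters are still not included in either form.
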
* Avoid integer overflow hazard in interval_time(). (Dean Rasheed, 2023-11-09)

  When casting an interval to a time, the original code suffered from 64-bit integer overflow for inputs with a sufficiently large negative "time" field, leading to bogus results. Fix by rewriting the algorithm in a simpler form, that more obviously cannot overflow. While at it, improve the test coverage to include negative interval inputs.

  Discussion: https://postgr.es/m/CAEZATCXoUKHkcuq4q63hkiPsKZJd0kZWzgKtU%2BNT0aU4wbf_Pw%40mail.gmail.com

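  For context, the cast keeps only the interval's time-of-day part, wrapped into a single day. A hedged example with a negative input of the kind the new tests cover:

      SELECT (interval '-25 hours')::time;  -- 23:00:00
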
* Stop including parsenodes.h in plannodes.h (Alvaro Herrera, 2023-11-07)

  I added it by mistake in commit 7103ebb7aae8. To clean up, struct MergeAction needs to be moved to primnodes.h from parsenodes.h. (This forces us to also move OverridingKind to primnodes.h.)

  Having to add parsenodes.h to bootstrap.h as fallout is a bit surprising, since nothing nominally needs it there. However, per comments in bootscanner.l, it is needed so that YYSTYPE can be declared. I think this only started with commit dac048f71ebb, but I didn't actually verify that.

  In passing, stop including parsenodes.h in tcopprot.h. Nothing needs it there.

  Per discussion on a patch by Ashutosh Bapat.
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/202311071106.6y7b2ascqjlz@alvherre.pgsql

* Detect integer overflow while computing new array dimensions. (Tom Lane, 2023-11-06)

  array_set_element() and related functions allow an array to be enlarged by assigning to subscripts outside the current array bounds. While these places were careful to check that the new bounds are allowable, they neglected to consider the risk of integer overflow in computing the new bounds. In edge cases, we could compute new bounds that are invalid but get past the subsequent checks, allowing bad things to happen. Memory stomps that are potentially exploitable for arbitrary code execution are possible, and so is disclosure of server memory.

  To fix, perform the hazardous computations using overflow-detecting arithmetic routines, which fortunately exist in all still-supported branches.

  The test cases added for this generate (after patching) errors that mention the value of MaxArraySize, which is platform-dependent. Rather than introduce multiple expected-files, use psql's VERBOSITY parameter to suppress the printing of the message text. v11 psql lacks that parameter, so omit the tests in that branch.

  Our thanks to Pedro Gallegos for reporting this problem.

  Security: CVE-2023-5869

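  A hedged sketch of the class of edge case involved (values illustrative only, not taken from the commit):

      CREATE TEMP TABLE t (a int[]);
      INSERT INTO t VALUES ('[2147483646:2147483646]={1}');
      -- Enlarging the array downward would make the new dimension
      -- overflow int32; with the fix this errors out cleanly instead
      -- of computing invalid bounds.
      UPDATE t SET a[-10] = 0;
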
* Remove distprep (Peter Eisentraut, 2023-11-06)

  A PostgreSQL release tarball contains a number of prebuilt files, in particular files produced by bison, flex, and perl, as well as html and man documentation. We have done this consistent with established practice at the time to not require these tools for building from a tarball. Some of these tools were hard to get, or get the right version of, from time to time, and shipping the prebuilt output was a convenience to users.

  Now this has at least two problems:

  One, we have to make the build system(s) work in two modes: building from a git checkout and building from a tarball. This is pretty complicated, but it works so far for autoconf/make. It does not currently work for meson; you can currently only build with meson from a git checkout. Making meson builds work from a tarball seems very difficult or impossible. One particular problem is that since meson requires a separate build directory, we cannot make the build update files like gram.h in the source tree. So if you were to build from a tarball and update gram.y, you will have a gram.h in the source tree and one in the build tree, but the way things work is that the compiler will always use the one in the source tree. So you cannot, for example, make any gram.y changes when building from a tarball. This seems impossible to fix in a non-horrible way.

  Second, there is increased interest nowadays in precisely tracking the origin of software. We can reasonably track contributions into the git tree, and users can reasonably track the path from a tarball to packages and downloads and installs. But what happens between the git tree and the tarball is obscure and in some cases non-reproducible.

  The solution for both of these issues is to get rid of the step that adds prebuilt files to the tarball. The tarball now only contains what is in the git tree (*). Getting the additional build dependencies is no longer a problem nowadays, and the complications to keep these dual build modes working are significant. And of course we want to get the meson build system working universally.

  This commit removes the make distprep target altogether. The make dist target continues to do its job, it just doesn't call distprep anymore.

  (*) - The tarball also contains the INSTALL file that is built at make dist time, but not by distprep. This is unchanged for now.

  The make maintainer-clean target, whose job it is to remove the prebuilt files in addition to what make distclean does, is now just an alias to make distclean. (In practice, it is probably obsolete given that git clean is available.)

  The following programs are now hard build requirements in configure (they were already required by meson.build):

  - bison
  - flex
  - perl

  Reviewed-by: Michael Paquier <michael@paquier.xyz>
  Reviewed-by: Andres Freund <andres@anarazel.de>
  Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org

* Add XMLText function (SQL/XML X038) (Daniel Gustafsson, 2023-11-06)

  This function implements the standard XMLText function, which converts text into xml text nodes. It uses the libxml2 function xmlEncodeSpecialChars to escape predefined entities (&"<>), so that those do not cause any conflict when concatenating the text node output with existing xml documents.

  This also adds a note in features.sgml about not supporting XML(SEQUENCE). The SQL specification defines a RETURNING clause for a set of XML functions, where RETURNING CONTENT or RETURNING SEQUENCE can be specified. Since PostgreSQL doesn't support XML(SEQUENCE), all of these functions operate with an implicit RETURNING CONTENT.

  Author: Jim Jones <jim.jones@uni-muenster.de>
  Reviewed-by: Vik Fearing <vik@postgresfriends.org>
  Discussion: https://postgr.es/m/86617a66-ec95-581f-8d54-08059cca8885@uni-muenster.de

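  Hedged usage sketch:

      SELECT xmltext('<local> & <domain>');
      -- => &lt;local&gt; &amp; &lt;domain&gt;
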
* Additional unicode primitive functions. (Jeff Davis, 2023-11-01)

  Introduce unicode_version(), icu_unicode_version(), and unicode_assigned().

  The latter requires introducing a new lookup table for the Unicode General Category, which is generated along with the other Unicode lookup tables.

  Discussion: https://postgr.es/m/CA+TgmoYzYR-yhU6k1XFCADeyj=Oyz2PkVsa3iKv+keM8wp-F_A@mail.gmail.com
  Reviewed-by: Peter Eisentraut

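  Hedged usage sketch (the exact version string depends on the Unicode tables the server was built with):

      SELECT unicode_version();        -- e.g. '15.1'
      SELECT unicode_assigned('abc');  -- true when every character is an
                                       -- assigned Unicode code point
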
* Fix function name in comment (Daniel Gustafsson, 2023-11-01)

  The name of the function resulting from the macro expansion was incorrectly stated. Backpatch to 16 where it was introduced.

  Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Discussion: https://postgr.es/m/20231101.172308.1740861597185391383.horikyota.ntt@gmail.com
  Backpatch-through: v16

* doc: 1-byte varlena headers can be used for user PLAIN storage (Bruce Momjian, 2023-10-31)

  This also updates some C comments.

  Reported-by: suchithjn22@gmail.com
  Discussion: https://postgr.es/m/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org
  Author: Laurenz Albe (doc patch)
  Backpatch-through: 11

* Introduce pg_stat_checkpointer (Michael Paquier, 2023-10-30)

  Historically, the statistics of the checkpointer have always been part of pg_stat_bgwriter. This commit removes a few columns from pg_stat_bgwriter, and introduces pg_stat_checkpointer with equivalent, renamed columns (plus a new one for the reset timestamp):

  - checkpoints_timed -> num_timed
  - checkpoints_req -> num_requested
  - checkpoint_write_time -> write_time
  - checkpoint_sync_time -> sync_time
  - buffers_checkpoint -> buffers_written

  The fields of PgStat_CheckpointerStats and its SQL functions are renamed to match the new field names, for consistency. Note that the background writer and the checkpointer have been split into two different processes in commits 806a2aee3791 and bf405ba8e460. The pgstat structures were already split, making this change straightforward.

  Bump catalog version.

  Author: Bharath Rupireddy
  Reviewed-by: Bertrand Drouvot, Andres Freund, Michael Paquier
  Discussion: https://postgr.es/m/CALj2ACVxX2ii=66RypXRweZe2EsBRiPMj0aHfRfHUeXJcC7kHg@mail.gmail.com

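  Hedged usage sketch against the new view:

      SELECT num_timed, num_requested, write_time, sync_time,
             buffers_written, stats_reset
      FROM pg_stat_checkpointer;
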
* Refactor some code related to transaction-level statistics for relations (Michael Paquier, 2023-10-30)

  This commit refactors find_tabstat_entry() so that the transaction counters for inserted, updated and deleted tuples are included in the result returned. If a shared entry is found for a relation, the result is now a copy of the PgStat_TableStatus entry retrieved from shared memory. This idea has been proposed by Andres Freund.

  While at it, the following SQL functions, used in system views, are refactored with macros, in the same spirit as 83a1a1b56645, reducing the amount of code:

  - pg_stat_get_xact_tuples_deleted()
  - pg_stat_get_xact_tuples_inserted()
  - pg_stat_get_xact_tuples_updated()

  There is now only one caller of find_tabstat_entry() in the tree.

  Author: Bertrand Drouvot
  Discussion: https://postgr.es/m/b9e1f543-ee93-8168-d530-d961708ad9d3@gmail.com

* Guard against overflow in make_interval(). (Dean Rasheed, 2023-10-29)

  The original code did very little to guard against integer or floating point overflow when computing the interval's fields. Detect any such overflows and error out, rather than silently returning bogus results.

  Joseph Koshakow, reviewed by Ashutosh Bapat and me.
  Discussion: https://postgr.es/m/CAAvxfHcm1TPwH_zaGWuFoL8pZBestbRZTU6Z%3D-RvAdSXTPbKfg%40mail.gmail.com

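  Hedged example (the value is chosen to overflow the int32 months field; exact wording may differ):

      SELECT make_interval(years => 178956971);
      -- ERROR: interval out of range
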
* Remove buffers_backend and buffers_backend_fsync from pg_stat_checkpointer (Michael Paquier, 2023-10-27)

  Two attributes related to checkpointer statistics are removed in this commit:

  - buffers_backend, which counts the number of buffers written directly by a backend.
  - buffers_backend_fsync, which counts the number of times a backend had to do fsync() on its own.

  These are actually not checkpointer properties but backend properties. Also, pg_stat_io provides a more accurate and equivalent report of these numbers, by tracking all the I/O stats related to backends, including writes and fsyncs, so storing them in pg_stat_checkpointer was redundant.

  Thanks also to Robert Haas and Amit Kapila for their input.

  Bump catalog version.

  Author: Bharath Rupireddy
  Reviewed-by: Bertrand Drouvot, Andres Freund
  Discussion: https://postgr.es/m/20230210004604.mcszbscsqs3bc5nx@awork3.anarazel.de

* Optimize various aggregate deserialization functions, take 2 (David Rowley, 2023-10-27)

  f0efa5aec added initReadOnlyStringInfo to allow a StringInfo to be initialized from an existing buffer and also relaxed the requirement that a StringInfo's buffer must be NUL terminated at data[len]. Now that we have that, there's no need for these aggregate deserial functions to use appendBinaryStringInfo(), which rather wastefully palloc'd a new buffer and memcpy'd in the bytea's buffer. Instead, we can just use the bytea's buffer and point the StringInfo directly to that using the new initializer function.

  In Amdahl's law terms, this speeds up the serial portion of parallel aggregates and makes sum(numeric), avg(numeric), var_pop(numeric), var_samp(numeric), variance(numeric), stddev_pop(numeric), stddev_samp(numeric), stddev(numeric), array_agg(anyarray), string_agg(text) and string_agg(bytea) scale better in parallel queries.

  Author: David Rowley
  Discussion: https://postgr.es/m/CAApHDvr%3De-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR%3DcQ%40mail.gmail.com

* Add trailing commas to enum definitions (Peter Eisentraut, 2023-10-26)

  Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once.

  I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone.

  Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org

* Introduce the concept of read-only StringInfos (David Rowley, 2023-10-26)

  There were various places in our codebase which conjured up a StringInfo by manually assigning the StringInfo fields and setting the data field to point to some existing buffer. There wasn't much consistency as to what fields like maxlen got set to, and in one location we didn't ensure that the buffer was correctly NUL terminated at len bytes, as documented as required in stringinfo.h.

  Here we introduce 2 new functions to initialize StringInfos. One allows callers to initialize a StringInfo passing along a buffer that is already allocated by palloc. Here the StringInfo code uses this buffer directly rather than doing any memcpying into a new allocation. Having this as a function allows us to verify the buffer is correctly NUL terminated. StringInfos initialized this way can be appended to and reset just like any other normal StringInfo.

  The other new initialization function also accepts an existing buffer, but the given buffer does not need to be a pointer to a palloc'd chunk. This buffer could be a pointer pointing partway into some palloc'd chunk or may not even be palloc'd at all. StringInfos initialized this way are deemed as "read-only". This means that it's not possible to append to them or reset them.

  For the latter of the two new initialization functions mentioned above, we relax the requirement that the data buffer must be NUL terminated. Relaxing this requirement is convenient in a few places, as it can save us from having to allocate an entire new buffer just to add the NUL terminator, or from having to temporarily add a NUL only to have to put the original char back again later.

  Incompatibility note: here we also forego adding the NUL in a few places where it does not seem to be required. These locations are passing the given StringInfo into a type's receive function. It does not seem like any of our built-in receive functions require this, but perhaps there's some UDT out there in the wild which does. It is likely worthy of a mention in the release notes that a UDT's receive function mustn't rely on the input StringInfo being NUL terminated.

  Author: David Rowley
  Reviewed-by: Tom Lane
  Discussion: https://postgr.es/m/CAApHDvorfO3iBZ%3DxpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ%40mail.gmail.com

* Migrate logical slots to the new node during an upgrade. (Amit Kapila, 2023-10-26)

  While reading information from the old cluster, a list of logical slots is fetched. At the later part of upgrading, pg_upgrade revisits the list and restores slots by executing pg_create_logical_replication_slot() on the new cluster. Migration of logical replication slots is only supported when the old cluster is version 17.0 or later.

  If the old node has invalid slots or slots with unconsumed WAL records, pg_upgrade fails. These checks are needed to prevent data loss.

  The significant advantage of this commit is that it makes it easy to continue logical replication even after upgrading the publisher node. Previously, pg_upgrade allowed copying publications to a new node. With this patch, adjusting the connection string to the new publisher will cause the apply worker on the subscriber to connect to the new publisher automatically. This enables seamless continuation of logical replication, even after an upgrade.

  Author: Hayato Kuroda, Hou Zhijie
  Reviewed-by: Peter Smith, Bharath Rupireddy, Dilip Kumar, Vignesh C, Shlok Kyal
  Discussion: http://postgr.es/m/TYAPR01MB58664C81887B3AF2EB6B16E3F5939@TYAPR01MB5866.jpnprd01.prod.outlook.com
  Discussion: http://postgr.es/m/CAA4eK1+t7xYcfa0rEQw839=b2MzsfvYDPz3xbD+ZqOdP3zpKYg@mail.gmail.com

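  A hedged sketch of the post-upgrade step described above (the subscription name and connection string are hypothetical):

      ALTER SUBSCRIPTION mysub
          CONNECTION 'host=new-publisher port=5432 dbname=mydb';
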
* Doc: modernize comment for boolin(). (Tom Lane, 2023-10-19)

  Most of the behavior described by this comment was moved to parse_bool_with_len() some time ago. Move what's still valuable there too, and drop the rest.

  Peter Smith
  Discussion: https://postgr.es/m/CAHut+PtMJURKp=U8Z=Ktp0zV40sEb1f-iEk9FvY2GQe+5ZBnwg@mail.gmail.com

* Dodge a compiler bug affecting timetz_zone/timetz_izone. (Tom Lane, 2023-10-17)

  Use a modulo operator instead of implementing the same behavior with a loop. The loop solution is doubtless microscopically faster for the typical case of only wrapping into the very next day, but maybe not so much for large interval values. In any case, timetz is such a backwater that it's doubtful anybody would notice any performance change anyway.

  This avoids a compiler bug occurring in AIX's xlc, even in pretty late-model revisions. We did not have test coverage for the case where the initial result->time value is negative, so add that.

  For the moment, install this only in HEAD. My plan is to back-patch the test case, and then the code change assuming that buildfarm testing proves the bug occurs in the back branches. (That seems pretty likely, but let's find out for sure.)

  Per buildfarm results from commits 97957fdba and 2f0472030. Thanks to Michael Paquier for the idea to use a modulo operation to replace the faulty loop.

  Discussion: https://postgr.es/m/CA+hUKGK=DOC+hE-62FKfZy=Ybt5uLkrg3zCZD-jFykM-iPn8yw@mail.gmail.com

* Harden has_xxx_privilege() functions against concurrent object drops. (Tom Lane, 2023-10-14)

  The versions of these functions that accept object OIDs are supposed to return NULL, rather than failing, if the target object has been dropped. This makes it safe(r) to use them in queries that scan catalogs, since the functions will be applied to objects that are visible in the query's snapshot but might now be gone according to the catalog snapshot. In most cases we implemented this by doing a SearchSysCacheExists test and assuming that if that succeeds, we can safely invoke the appropriate aclchk.c function, which will immediately re-fetch the same syscache entry. It was argued that if the existence test succeeds then the followup fetch must succeed as well, for lack of any intervening AcceptInvalidationMessages call.

  Alexander Lakhin demonstrated that this is not so when CATCACHE_FORCE_RELEASE is enabled: the syscache entry will be forcibly dropped at the end of SearchSysCacheExists, and then it is possible for the catalog snapshot to get advanced while re-fetching the entry. Alexander's test case requires the operation to happen inside a parallel worker, but that seems incidental to the fundamental problem. What remains obscure is whether there is a way for this to happen in a non-debug build. Nonetheless, CATCACHE_FORCE_RELEASE is a very useful test methodology, so we'd better make the code safe for it.

  After some discussion we concluded that the most future-proof fix is to give up the assumption that checking SearchSysCacheExists can guarantee success of a later fetch. At best that assumption leads to fragile code --- for example, has_type_privilege appears broken for array types even if you believe the assumption holds. And it's not even particularly efficient. There had already been some work towards extending the aclchk.c APIs to include "is_missing" output flags, so this patch extends that work to cover all the aclchk.c functions that are used by the has_xxx_privilege() functions. (This allows getting rid of some ad-hoc decisions about not throwing errors in certain places in aclchk.c.)

  In passing, this fixes the has_sequence_privilege() functions to provide the same guarantees as their cousins: for some reason the SearchSysCacheExists tests never got added to those. There is more work to do to remove the unsafe coding pattern with SearchSysCacheExists in other places, but this is a pretty self-contained patch so I'll commit it separately.

  Per bug #18014 from Alexander Lakhin. Given the lack of hard evidence that there's a bug in non-debug builds, I'm content to fix this only in HEAD. (Perhaps we should clean up the has_sequence_privilege() oversight in the back branches, but in the absence of field complaints I'm not too excited about that either.)

  Discussion: https://postgr.es/m/18014-28c81cb79d44295d@postgresql.org

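  The practical upshot, as a hedged sketch: the OID-taking variants stay safe to use in catalog scans, returning NULL rather than erroring if an object is dropped mid-query:

      SELECT c.relname, has_table_privilege(c.oid, 'SELECT')
      FROM pg_class c;
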
* Add support for AT LOCAL (Michael Paquier, 2023-10-13)

  When converting a timestamp to/from with/without time zone, the SQL Standard specifies an AT LOCAL variant of AT TIME ZONE which uses the session's time zone. This includes three system functions able to do the work in the same way as the existing flavors for AT TIME ZONE, except that these need to be marked as stable as they depend on the session's TimeZone GUC.

  Bump catalog version.

  Author: Vik Fearing
  Reviewed-by: Laurenz Albe, Cary Huang, Michael Paquier
  Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org

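  Hedged usage sketch:

      SET TimeZone = 'Asia/Tokyo';
      SELECT timestamptz '2023-10-13 00:00+00' AT LOCAL;
      -- equivalent to AT TIME ZONE 'Asia/Tokyo': 2023-10-13 09:00:00
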
* Add const to values and nulls arguments (Peter Eisentraut, 2023-10-10)

  This excludes any changes that would change the external AM APIs.

  Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
  Discussion: https://www.postgresql.org/message-id/flat/14c31f4a-0347-0805-dce8-93a9072c05a5%40eisentraut.org

* Revert "Optimize various aggregate deserialization functions" (David Rowley, 2023-10-10)

  This reverts commit 608fd198def5390c3490bfe903730207dfd8eeb4.

  On second thoughts, the StringInfo API requires that strings be NUL terminated, and pointing directly to the data in a bytea Datum gives a buffer that isn't NUL terminated.

  Discussion: https://postgr.es/m/CAApHDvorfO3iBZ=xpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ@mail.gmail.com

* Optimize various aggregate deserialization functions (David Rowley, 2023-10-09)

  The serialized representation of an internal aggregate state is a bytea value. In each deserial function, in order to "receive" the bytea value we appended it onto a short-lived StringInfoData using appendBinaryStringInfo. This was a little wasteful as it meant having to palloc memory, copy a (possibly long) series of bytes, then later pfree that memory. Instead of going to this extra trouble, we can just fake up a StringInfoData and point the data directly at the bytea's payload. This should help increase the performance of internal aggregate deserialization.

  Reviewed-by: Michael Paquier
  Discussion: https://postgr.es/m/CAApHDvr=e-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR=cQ@mail.gmail.com

* Improve JsonLexContext's freeability (Alvaro Herrera, 2023-10-05)

  Previously, the JSON code didn't have to worry too much about freeing JsonLexContext, because it was never too long-lived. With new features being added for SQL/JSON this is no longer the case. Add a routine that knows how to free this struct and apply that to a few places, to prevent this from becoming problematic.

  At the same time, we change the API of makeJsonLexContextCstringLen to make it receive a pointer to JsonLexContext for callers that want it to be stack-allocated; it can also be passed as NULL to get the original behavior of a palloc'ed one.

  This also causes an ABI break due to the addition of flags to JsonLexContext, so we can't easily backpatch it. AFAICS that's not much of a problem; apparently some leaks might exist in JSON usage of text-search, for example via json_to_tsvector, but I haven't seen any complaints about that.

  Per Coverity complaint about datum_to_jsonb_internal().
  Discussion: https://postgr.es/m/20230808174110.oq3iymllsv6amkih@alvherre.pgsql

* Tidy-up some appendStringInfo*() usages (David Rowley, 2023-10-03)

  Make a few newish calls to appendStringInfo() which have no special formatting use appendStringInfoString() instead. Also, adjust usages of appendStringInfoString() which only append a string containing a single character to make use of appendStringInfoChar() instead.

  This makes the code marginally faster, but primarily this change is so we use the StringInfo type as it was intended to be used.

  Discussion: https://postgr.es/m/CAApHDvpXKQmL+r=VDNS98upqhr9yGBhv2Jw3GBFFk_wKHcB39A@mail.gmail.com

* Fix datalen calculation in tsvectorrecv(). (Tom Lane, 2023-10-01)

  After receiving position data for a lexeme, tsvectorrecv() advanced its "datalen" value by (npos+1)*sizeof(WordEntry) where the correct calculation is (npos+1)*sizeof(WordEntryPos). This accidentally failed to render the constructed tsvector invalid, but it did result in leaving some wasted space approximately equal to the space consumed by the position data. That could have several bad effects:

  * Disk space is wasted if the received tsvector is stored into a table as-is.
  * A legal tsvector could get rejected with "maximum total lexeme length exceeded" if the extra space pushes it over the MAXSTRPOS limit.
  * In edge cases, the finished tsvector could be assigned a length larger than the allocated size of its palloc chunk, conceivably leading to SIGSEGV when the tsvector gets copied somewhere else. The odds of a field failure of this sort seem low, though valgrind testing could probably have found this.

  While we're here, let's express the calculation as "sizeof(uint16) + npos * sizeof(WordEntryPos)" to avoid the type pun implicit in the "npos + 1" formulation. It's not wrong given that WordEntryPos had better be 2 bytes to avoid padding problems, but it seems clearer this way.

  Report and patch by Denis Erokhin. Back-patch to all supported versions.

  Discussion: https://postgr.es/m/009801d9f2d9$f29730c0$d7c59240$@datagile.ru

* Add some const qualifiers (Peter Eisentraut, 2023-09-26)

  There was a mismatch between the const qualifiers for excludeDirContents in src/backend/backup/basebackup.c and src/bin/pg_rewind/filemap.c, which led to a quick search for similar cases. We should make excludeDirContents match, but the rest of the changes seem like a good idea as well.

  Author: David Steele <david@pgmasters.net>
  Discussion: https://www.postgresql.org/message-id/flat/669a035c-d23d-2f38-7ff0-0cb93e01d610@pgmasters.net

* Fix typo in numutils.c comments (Daniel Gustafsson, 2023-09-25)

  s/messges/messages/

* Replace more MemSet calls with struct initialization (Peter Eisentraut, 2023-09-19)

  This fixes up 10ea0f924a2 to use the style introduced by 9fd45870c1.

  Author: Richard Guo <guofenglinux@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/CAMbWs490gJf5A=ydqyjh+Z8mVQa_foTGtcmBtHGLra0aOwLWHQ@mail.gmail.com

* Don't crash if cursor_to_xmlschema is used on a non-data-returning Portal. (Tom Lane, 2023-09-18)

  cursor_to_xmlschema() assumed that any Portal must have a tupDesc, which is not so. Add a defensive check.

  It's plausible that this mistake occurred because of the rather poorly chosen name of the lookup function SPI_cursor_find(), which in such cases is returning something that isn't very much like a cursor. Add some documentation to try to forestall future errors of the same ilk.

  Report and patch by Boyu Yang (docs changes by me). Back-patch to all supported branches.

  Discussion: https://postgr.es/m/dd343010-c637-434c-a8cb-418f53bda3b8.yangboyu.yby@alibaba-inc.com

* Track nesting depth correctly when drilling down into RECORD Vars. (Tom Lane, 2023-09-15)

  expandRecordVariable() failed to adjust the parse nesting structure correctly when recursing to inspect an outer-level Var. This could result in assertion failures or core dumps in corner cases.

  Likewise, get_name_for_var_field() failed to adjust the deparse namespace stack correctly when recursing to inspect an outer-level Var. In this case the likely result was a "bogus varno" error while deparsing a view.

  Per bug #18077 from Jingzhou Fu. Back-patch to all supported branches.

  Richard Guo, with some adjustments by me
  Discussion: https://postgr.es/m/18077-b9db97c6e0ab45d8@postgresql.org

* Refactor error messages for unsupported providers in pg_locale.c (Michael Paquier, 2023-09-14)

  These code paths should not be reached normally, but if they are, an error showing "(null)" as the collation provider would appear when no locale is set, even though we can assume that libc is being referred to. This refactors the code so that the provider is always reported even if no locale is set. While at it, the name of the function where the error happens is added, as it can be helpful for debugging.

  Issue introduced by d87d548cd030, so backpatch down to 16.

  Author: Michael Paquier, Ranier Vilela
  Reviewed-by: Jeff Davis, Kyotaro Horiguchi
  Discussion: https://postgr.es/m/7073610042fcf97e1bea2ce08b7e0214b5e11094.camel@j-davis.com
  Backpatch-through: 16

* Improve description of keys in tsvector (Michael Paquier, 2023-09-05)

  If all the bits of a key in a tsvector are true (marked with ALLISTRUE), gtsvectorout() would show the following description: "0 true bits, 0 false bits". This is confusing: all the bits are true, yet the output is the same as what would be shown if siglen were 0. This commit improves the output so that "all true bits" is shown instead in this case.

  Alexander has proposed a regression test for pageinspect, not included here as it is rather expensive compared to its coverage value.

  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/17950-6c80a8d2b94ec695@postgresql.org

* Fix out-of-bound read in gtsvector_picksplit() (Michael Paquier, 2023-09-04)

  This could lead to an imprecise choice when splitting an index page of a GiST index on a tsvector, deciding which entries should remain on the old page and which entries should move to a new page.

  This is wrong since tsearch2 has been moved into core with commit 140d4ebcb46e, so backpatch all the way down. This error has been spotted by valgrind.

  Author: Alexander Lakhin
  Discussion: https://postgr.es/m/17950-6c80a8d2b94ec695@postgresql.org
  Backpatch-through: 11

* Use actual backend IDs in pg_stat_get_backend_subxact(). (Nathan Bossart, 2023-08-30)

  Unlike the other pg_stat_get_backend* functions, pg_stat_get_backend_subxact() looks up the backend entry by using its integer argument as a 1-based index in an internal array. The other functions look for the entry with the matching session backend ID. These numbers often match, but that isn't reliably true.

  This commit resolves this discrepancy by introducing pgstat_get_local_beentry_by_backend_id() and using it in pg_stat_get_backend_subxact(). We cannot use pgstat_get_beentry_by_backend_id() because it returns a PgBackendStatus, which lacks the locally computed additions available in LocalPgBackendStatus that are required by pg_stat_get_backend_subxact().

  Author: Ian Barwick
  Reviewed-by: Sami Imseih, Michael Paquier, Robert Haas
  Discussion: https://postgr.es/m/CAB8KJ%3Dj-ACb3H4L9a_b3ZG3iCYDW5aEu3WsPAzkm2S7JzS1Few%40mail.gmail.com
  Backpatch-through: 16

* Rename some support functions for pgstat* views. (Nathan Bossart, 2023-08-30)

  Presently, pgstat_fetch_stat_beentry() accepts a session's backend ID as its argument, and pgstat_fetch_stat_local_beentry() accepts a 1-based index in an internal array as its argument. The former is typically used wherever a user must provide a backend ID, and the latter is usually used internally when looping over all entries in the array. This difference was first introduced by d7e39d72ca. Before that commit, both functions accepted a 1-based index to the internal array.

  This commit renames these two functions to make it clear whether they use the backend ID or the 1-based index to look up the entry. This is preparatory work for a follow-up change that will introduce a function for looking up a LocalPgBackendStatus using a backend ID.

  Reviewed-by: Ian Barwick, Sami Imseih, Michael Paquier, Robert Haas
  Discussion: https://postgr.es/m/CAB8KJ%3Dj-ACb3H4L9a_b3ZG3iCYDW5aEu3WsPAzkm2S7JzS1Few%40mail.gmail.com
  Backpatch-through: 16

* Tighten unit parsing in interval values (Michael Paquier, 2023-08-28)

  Interval values now generate an error when the user specifies multiple consecutive units or a unit without a value. Previously, it was possible to write multiple units consecutively, which is contrary to what the documentation allows, so one could end up with confusing interval values.

  This is a follow-up of the work done in 165d581f146b.

  Author: Joseph Koshakow
  Reviewed-by: Jacob Champion, Gurjeet Singh, Reid Thompson
  Discussion: https://postgr.es/m/CAAvxfHd-yNO+XYnUxL=GaNZ1n+eE0V-oE0+-cC1jdjdU0KS3iw@mail.gmail.com

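  Hedged examples of inputs now rejected (previously accepted with confusing results):

      SELECT interval '1 day day';  -- consecutive units: now an error
      SELECT interval 'hour';       -- unit without a value: now an error
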
* Tighten handling of "ago" in interval values (Michael Paquier, 2023-08-28)

  This commit restricts the unit "ago" to appear only at the end of the interval. According to the documentation, a direction can only be defined at the end of an interval, but it was possible to define it in the middle of the string or define it multiple times.

  In spirit, this is similar to the error handling improvements done in 5b3c5953553b or bcc704b524904.

  Author: Joseph Koshakow
  Reviewed-by: Jacob Champion, Gurjeet Singh, Reid Thompson
  Discussion: https://postgr.es/m/CAAvxfHd-yNO+XYnUxL=GaNZ1n+eE0V-oE0+-cC1jdjdU0KS3iw@mail.gmail.com

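  Hedged examples:

      SELECT interval '1 day 2 hours ago';  -- valid: negates the whole interval
      SELECT interval '1 day ago 2 hours';  -- now an error: "ago" must be last
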
* Remove dead code in DecodeInterval() (Michael Paquier, 2023-08-28)

  This commit removes some dead code related to the unit type RESERV, whose last use has been removed from the unit lookup table used for intervals ("deltatktbl" in datetime.c) in 666cbae16da4. Before that, RESERV was used as an equivalent of "invalid", but that's now unreachable.

  Author: Joseph Koshakow
  Reviewed-by: Jacob Champion, Gurjeet Singh, Reid Thompson
  Discussion: https://postgr.es/m/CAAvxfHd-yNO+XYnUxL=GaNZ1n+eE0V-oE0+-cC1jdjdU0KS3iw@mail.gmail.com

* Catalog not-null constraints (Alvaro Herrera, 2023-08-25)

  We now create contype='n' pg_constraint rows for not-null constraints.

  We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions, and creating tables LIKE other tables. We also spawn not-null constraints for inheritance child tables when their parents have primary keys. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter it, instead matching by the column name they apply to. This means we don't require the constraint names to be identical across a hierarchy.

  For now, we omit them for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID nor DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough.)

  psql shows these constraints in \d+.

  pg_dump requires some ad-hoc hacks, particularly when dumping a primary key. We now create one "throwaway" not-null constraint for each column in the PK together with the CREATE TABLE command, and once the PK is created, all those throwaway constraints are removed. This avoids having to check each tuple for nullness when the dump restores the primary key creation.

  pg_upgrading from an older release requires a somewhat brittle procedure to create a constraint state that matches what would be created if the database were being created fresh in Postgres 17. I have tested all the scenarios I could think of, and it works correctly as far as I can tell, but I could have neglected weird cases.

  This patch has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression), led us to reconsider this choice, so now the current implementation uses contype='n' again. During Postgres 16 this had already been introduced by commit e056c557aef4, but there were some problems mainly with the pg_upgrade procedure that couldn't be fixed in reasonable time, so it was reverted.

  In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring an additional pg_attribute column to track the OID of the not-null constraint for that column.

  [1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

  Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
  Author: Bernd Helmle <mailings@oopsware.de>
  Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
  Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
  Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>

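  A hedged sketch of the resulting catalog state (object names hypothetical):

      CREATE TABLE parent (a int NOT NULL);
      SELECT conname, contype
        FROM pg_constraint
       WHERE conrelid = 'parent'::regclass;
      -- one row with contype = 'n' for the not-null constraint on "a"
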