aboutsummaryrefslogtreecommitdiff
path: root/src/include/common
Commit message (Collapse)AuthorAge
* Use 'void *' for arbitrary buffers, 'uint8 *' for byte arraysHeikki Linnakangas11 days
| | | | | | | | | | | | | A 'void *' argument suggests that the caller might pass an arbitrary struct, which is appropriate for functions like libc's read/write, or pq_sendbytes(). 'uint8 *' is more appropriate for byte arrays that have no structure, like the cancellation keys or SCRAM tokens. Some places used 'char *', but 'uint8 *' is better because 'char *' is commonly used for null-terminated strings. Change code around SCRAM, MD5 authentication, and cancellation key handling to follow these conventions. Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org
* Update Unicode data to Unicode 16.0.0Peter Eisentraut2025-04-03
| | | | | Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/flat/146349e4-4687-4321-91af-f235572490a8@eisentraut.org
* pg_upgrade: Add --swap for faster file transfer.Nathan Bossart2025-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new option instructs pg_upgrade to move the data directories from the old cluster to the new cluster and then to replace the catalog files with those generated for the new cluster. This mode can outperform --link, --clone, --copy, and --copy-file-range, especially on clusters with many relations. However, this mode creates many garbage files in the old cluster, which can prolong the file synchronization step if --sync-method=syncfs is used. To handle that, we recommend using --sync-method=fsync with this mode, and pg_upgrade internally uses "initdb --sync-only --no-sync-data-files" for file synchronization. pg_upgrade will synchronize the catalog files as they are transferred. We assume that the database files transferred from the old cluster were synchronized prior to upgrade. This mode also complicates reverting to the old cluster, so we recommend restoring from backup upon failure during or after file transfer. We did consider teaching pg_upgrade how to generate a revert script for such failures, but we decided against it due to the rarity of failing during file transfer, the complexity of generating the script, and the potential for misusing the script. The new mode is limited to clusters located in the same file system. With some effort, we could probably support upgrades between different file systems, but this mode is unlikely to offer much benefit if we have to copy the files across file system boundaries. It is also limited to upgrades from version 10 or newer. There are a few known obstacles for using swap mode to upgrade from older versions. For example, the visibility map format changed in v9.6, and the sequence tuple format changed in v10. In fact, swap mode omits the --sequence-data option in its uses of pg_dump and instead reuses the old cluster's sequence data files. While teaching swap mode to deal with these kinds of changes is surely possible (and we may have to deal with similar problems in the future, anyway), it doesn't seem worth the effort to support upgrades from long-unsupported versions. Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* initdb: Add --no-sync-data-files.Nathan Bossart2025-03-25
| | | | | | | | | | | | | | | | | | | | | | This new option instructs initdb to skip synchronizing any files in database directories, the database directories themselves, and the tablespace directories, i.e., everything in the base/ subdirectory and any other tablespace directories. Other files, such as those in pg_wal/ and pg_xact/, will still be synchronized unless --no-sync is also specified. --no-sync-data-files is primarily intended for internal use by tools that separately ensure the skipped files are synchronized to disk. A follow-up commit will use this to help optimize pg_upgrade's file transfer step. The --sync-method=fsync implementation of this option makes use of a new exclude_dir parameter for walkdir(). When not NULL, exclude_dir specifies a directory to skip processing. The --sync-method=syncfs implementation of this option just skips synchronizing the non-default tablespace directories. This means that initdb will still synchronize some or all of the database files, but there's not much we can do about that. Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* Fix headerscheck warning.Jeff Davis2025-03-18
| | | | | Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/93731.1742310701@sss.pgh.pa.us
* Optimization for lower(), upper(), casefold() functions.Jeff Davis2025-03-15
| | | | | | | | | | | | | | | | | | | | | | | Improve performance and reduce table sizes for case mapping. The main case mapping table stores only 16-bit offsets, which can be used to look up the mapped code point in any of the case tables (fold, lower, upper, or title case). Simple case pairs point to the same offsets. Generate a function in generate-unicode_case_table.pl that consists of a nested branches to test for specific codepoint ranges that determine the offset in the main table. Other approaches were considered, such as representing these ranges as another structure (rather than branches in a generated function), or a different approach such as a radix tree, or perfect hashing. The author implemented and tested these alternatives and settled on the generated branches. Author: Alexander Borisov <lex.borisov@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/7cac7e66-9a3b-4e3f-a997-42aa0c401f80%40gmail.com
* Swap order of extern/static and pg_nodiscardPeter Eisentraut2025-03-14
| | | | | | | | | | | | | | | | | | When pg_nodiscard was first added, the C standard draft had it as a function specifier, and so the code comment about placement was written with that in mind. The final C23 standard has it as an attribute and the placement rules are a bit different for that. Specifically, it needs to be before extern or static. (Or at least both current clang and gcc require that.) So just swap these. (To be clear: The current implementation with gcc attributes doesn't care. This change is just for maximum forward compatibility for non-gcc compilers.) This also keeps the order consistent with the previously introduced pg_noreturn. Also update the code comment to reflect the mentioned developments since its introduction. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
* pg_noreturn to replace pg_attribute_noreturn()Peter Eisentraut2025-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to support a "noreturn" decoration on more compilers besides just GCC-compatible ones, but for that we need to move the decoration in front of the function declaration instead of either behind it or wherever, which is the current style afforded by GCC-style attributes. Also rename the macro to "pg_noreturn" to be similar to the C11 standard "noreturn". pg_noreturn is now supported on all compilers that support C11 (using _Noreturn), as well as GCC-compatible ones (using __attribute__, as before), as well as MSVC (using __declspec). (When PostgreSQL requires C11, the latter two variants can be dropped.) Now, all supported compilers effectively support pg_noreturn, so the extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped. This also fixes a possible problem if third-party code includes stdnoreturn.h, because then the current definition of #define pg_attribute_noreturn() __attribute__((noreturn)) would cause an error. Note that the C standard does not support a noreturn attribute on function pointer types. So we have to drop these here. There are only two instances at this time, so it's not a big loss. In one case, we can make up for it by adding the pg_noreturn to a wrapper function and adding a pg_unreachable(), in the other case, the latter was already done before. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
* Change relpath() et al to return path by valueAndres Freund2025-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | For AIO, and also some other recent patches, we need the ability to call relpath() in a critical section. Until now that was not feasible, as it allocated memory. The fact that relpath() allocated memory also made it awkward to use in log messages because we had to take care to free the memory afterwards. Which we e.g. didn't do for when zeroing out an invalid buffer. We discussed other solutions, e.g. filling a pre-allocated buffer that's passed to relpath(), but they all came with plenty downsides or were larger projects. The easiest fix seems to be to make relpath() return the path by value. To be able to return the path by value we need to determine the maximum length of a relation path. This patch adds a long #define that computes the exact maximum, which is verified to be correct in a regression test. As this change the signature of relpath(), extensions using it will need to adapt their code. We discussed leaving a backward-compat shim in place, but decided it's not worth it given the use of relpath() doesn't seem widespread. Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
* Silence warning in older versions of ValgrindJohn Naylor2025-02-24
| | | | | | | | | | | | | Due to misunderstanding on my part, commit 235328ee4 did not go far enough to silence older versions of Valgrind. For those, it was the bit scan that was problematic, not the subsequent bit-masking operation. To fix, use the unaligned path for the trailing bytes. Since we don't have a bit scan here anymore, also remove some comments and endian-specific coding around that. Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru> Discussion: https://postgr.es/m/f3aa2d45-3b28-41c5-9499-a1bc30e0f8ec@postgrespro.ru Backpatch-through: 17
* Add support for OAUTHBEARER SASL mechanismDaniel Gustafsson2025-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit implements OAUTHBEARER, RFC 7628, and OAuth 2.0 Device Authorization Grants, RFC 8628. In order to use this there is a new pg_hba auth method called oauth. When speaking to a OAuth- enabled server, it looks a bit like this: $ psql 'host=example.org oauth_issuer=... oauth_client_id=...' Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG Device authorization is currently the only supported flow so the OAuth issuer must support that in order for users to authenticate. Third-party clients may however extend this and provide their own flows. The built-in device authorization flow is currently not supported on Windows. In order for validation to happen server side a new framework for plugging in OAuth validation modules is added. As validation is implementation specific, with no default specified in the standard, PostgreSQL does not ship with one built-in. Each pg_hba entry can specify a specific validator or be left blank for the validator installed as default. This adds a requirement on libcurl for the client side support, which is optional to build, but the server side has no additional build requirements. In order to run the tests, Python is required as this adds a https server written in Python. Tests are gated behind PG_TEST_EXTRA as they open ports. This patch has been a multi-year project with many contributors involved with reviews and in-depth discussions: Michael Paquier, Heikki Linnakangas, Zhihong Yu, Mahendrakar Srinivasarao, Andrey Chudnovsky and Stephen Frost to name a few. While Jacob Champion is the main author there have been some levels of hacking by others. Daniel Gustafsson contributed the validation module and various bits and pieces; Thomas Munro wrote the client side support for kqueue. Author: Jacob Champion <jacob.champion@enterprisedb.com> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Reviewed-by: Kashif Zeeshan <kashi.zeeshan@gmail.com> Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
* Show more-intuitive titles for psql commands \dt, \di, etc.Tom Lane2025-02-05
| | | | | | | | | | | | | | | | | | | | | If exactly one relation type is requested in a command of the \dtisv family, say "tables", "indexes", etc instead of "relations". This should cover the majority of actual uses, without creating a huge number of new translatable strings. The error messages for no matching relations are adjusted as well. In passing, invent "pg_log_error_internal()" to be used for frontend error messages that don't seem to need translation, analogously to errmsg_internal() in the backend. The implementation is a bit cheesy, being just a macro to prevent xgettext from recognizing a trigger keyword. This won't avoid a useless gettext lookup cycle at runtime --- but surely we don't care about an extra microsecond or two in what's supposed to be a can't-happen case. I (tgl) also made "pg_fatal_internal()", though it's not used in this patch. Author: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKAnmm+7o93fQV-RFkGaN1QnP-0D4d3JTykD+cLueqjDMKdfag@mail.gmail.com
* Revert "Speed up tail processing when hashing aligned C strings, take two"John Naylor2025-01-29
| | | | | | | | | | | This reverts commit a365d9e2e8c1ead27203a4431211098292777d3b. Older versions of Valgrind raise an error, so go back to the bytewise loop for the final word in the input. Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru> Discussion: https://postgr.es/m/a3a959f6-14b8-4819-ac04-eaf2aa2e868d@postgrespro.ru Backpatch-through: 17
* Add support for Unicode case folding.Jeff Davis2025-01-23
| | | | | | | Expand case mapping tables to include entries for case folding, which are parsed from CaseFolding.txt. Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
* Be clearer about when jsonapi's need_escapes is neededAndrew Dunstan2025-01-19
| | | | | | | | | | | | Most operations beyond pure json parsing need to set need_escapes to true to get access to field names and string scalars. Document this fact more explicitly. Slightly tweaked patch from: Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=c49Vkfg2+A8ubSuEtaGEjuaKZXCA6SrXA8kdwHjx3uxQ@mail.gmail.com
* Support Unicode full case mapping and conversion.Jeff Davis2025-01-17
| | | | | | | | | | | | | | | Generate tables from Unicode SpecialCasing.txt to support more sophisticated case mapping behavior: * support case mappings to multiple codepoints, such as "ß" uppercasing to "SS" * support conditional case mappings, such as the "final sigma" * support titlecase variants, such as "dž" uppercasing to "DŽ" but titlecasing to "Dž" Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org Reviewed-by: Peter Eisentraut, Daniel Verite
* Add pg_nodiscard decorations to base64 functionsPeter Eisentraut2025-01-17
| | | | | | | | | | The result of pg_b64_encode() and pg_b64_decode() should be checked for errors. This attribute could detect mistakes such as those fixed in commit ff030ebe250 and d278541be42. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEudQAq-3yHsSdWoOOaw%2BgAQYgPMpMGuB5pt2yCXgv-YuxG2Hg%40mail.gmail.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Remove pgrminclude annotationsPeter Eisentraut2024-12-24
| | | | | | | | | | | | | | | | | | Per git log, the last time someone tried to do something with pgrminclude was around 2011. Many (not all) of the "pgrminclude ignore" annotations are of a newer date but seem to have just been copied around during refactorings and file moves and don't seem to reflect an actual need anymore. There have been some parallel experiments with include-what-you-use (IWYU) annotations, but these don't seem to correspond very strongly to pgrminclude annotations, so there is no value in keeping the existing ones even for that kind of thing. So, wipe them all away. We can always add new ones in the future based on actual needs. Discussion: https://www.postgresql.org/message-id/flat/2d4dc7b2-cb2e-49b1-b8ca-ba5f7024f05b%40eisentraut.org
* Fix various overflow hazards in date and timestamp functions.Nathan Bossart2024-12-09
| | | | | | | | | | | | | | | | | | | | | | | This commit makes use of the overflow-aware routines in int.h to fix a variety of reported overflow bugs in the date and timestamp code. It seems unlikely that this fixes all such bugs in this area, but since the problems seem limited to cases that are far beyond any realistic usage, I'm not going to worry too much. Note that for one bug, I've chosen to simply add a comment about the overflow hazard because fixing it would require quite a bit of code restructuring that doesn't seem worth the risk. Since this is a bug fix, it could be back-patched, but given the risk of conflicts with the new routines in int.h and the overall risk/reward ratio of this patch, I've opted not to do so for now. Fixes bug #18585 (except for the one case that's just commented). Reported-by: Alexander Lakhin Author: Matthew Kim, Nathan Bossart Reviewed-by: Joseph Koshakow, Jian He Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com Discussion: https://postgr.es/m/18585-db646741dd649abd%40postgresql.org
* Remove useless casts to (void *)Peter Eisentraut2024-11-28
| | | | | | | | Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
* jsonapi: add lexer option to keep token ownershipAndrew Dunstan2024-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0785d1b8b adds support for libpq as a JSON client, but allocations for string tokens can still be leaked during parsing failures. This is tricky to fix for the object_field semantic callbacks: the field name must remain valid until the end of the object, but if a parsing error is encountered partway through, object_field_end() won't be invoked and the client won't get a chance to free the field name. This patch adds a flag to switch the ownership of parsed tokens to the lexer. When this is enabled, the client must make a copy of any tokens it wants to persist past the callback lifetime, but the lexer will handle necessary cleanup on failure. Backend uses of the JSON parser don't need to use this flag, since the parser's allocations will occur in a short lived memory context. A -o option has been added to test_json_parser_incremental to exercise the new setJsonLexContextOwnsTokens() API, and the test_json_parser TAP tests make use of it. (The test program now cleans up allocated memory, so that tests can be usefully run under leak sanitizers.) Author: Jacob Champion Discussion: https://postgr.es/m/CAOYmi+kb38EciwyBQOf9peApKGwraHqA7pgzBkvoUnw5BRfS1g@mail.gmail.com
* Clean up newlines following left parenthesesÁlvaro Herrera2024-11-26
| | | | | | | | Most came in during the 17 cycle, so backpatch there. Some (particularly reorderbuffer.h) are very old, but backpatching doesn't seem useful. Like commits c9d297751959, c4f113e8fef9.
* Unify src/common/'s definitions of MaxAllocSize.Tom Lane2024-10-28
| | | | | | | | | | As threatened in the previous patch, define MaxAllocSize in src/include/common/fe_memutils.h rather than having several copies of it in different src/common/*.c files. This also provides an opportunity to document it better. While this would probably be safe to back-patch, I'll refrain (for now anyway).
* File size in a backup manifest should use uint64, not size_t.Robert Haas2024-10-02
| | | | | | | | size_t is the size of an object in memory, not the size of a file on disk. Thanks to Tom Lane for noting the error. Discussion: http://postgr.es/m/1865585.1727803933@sss.pgh.pa.us
* common/jsonapi: support libpq as a clientPeter Eisentraut2024-09-11
| | | | | | | | | | | | | | Based on a patch by Michael Paquier. For libpq, use PQExpBuffer instead of StringInfo. This requires us to track allocation failures so that we can return JSON_OUT_OF_MEMORY as needed rather than exit()ing. Author: Jacob Champion <jacob.champion@enterprisedb.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
* Define PG_TBLSPC_DIR for path pg_tblspc/ in data folderMichael Paquier2024-09-03
| | | | | | | | | | | | | | Similarly to 2065ddf5e34c, this introduces a define for "pg_tblspc". This makes the style more consistent with the existing PG_STAT_TMP_DIR, for example. There is a difference with the other cases with the introduction of PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups. Author: Bertrand Drouvot Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael Paquier Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
* Remove support for OpenSSL older than 1.1.0Daniel Gustafsson2024-09-02
| | | | | | | | | | | | | | OpenSSL 1.0.2 has been EOL from the upstream OpenSSL project for some time, and is no longer the default OpenSSL version with any vendor which package PostgreSQL. By retiring support for OpenSSL 1.0.2 we can remove a lot of no longer required complexity for managing state within libcrypto which is now handled by OpenSSL. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/ZG3JNursG69dz1lr@paquier.xyz Discussion: https://postgr.es/m/CA+hUKGKh7QrYzu=8yWEUJvXtMVm_CNWH1L_TLWCbZMwbi1XP2Q@mail.gmail.com
* Remove dependence on -fwrapv semantics in a few places.Nathan Bossart2024-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | This commit attempts to update a few places, such as the money, numeric, and timestamp types, to no longer rely on signed integer wrapping for correctness. This is intended to move us closer towards removing -fwrapv, which may enable some compiler optimizations. However, there is presently no plan to actually remove that compiler option in the near future. Besides using some of the existing overflow-aware routines in int.h, this commit introduces and makes use of some new ones. Specifically, it adds functions that accept a signed integer and return its absolute value as an unsigned integer with the same width (e.g., pg_abs_s64()). It also adds functions that accept an unsigned integer, store the result of negating that integer in a signed integer with the same width, and return whether the negation overflowed (e.g., pg_neg_u64_overflow()). Finally, this commit adds a couple of tests for timestamps near POSTGRES_EPOCH_JDATE. Author: Joseph Koshakow Reviewed-by: Tom Lane, Heikki Linnakangas, Jian He Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
* Make nullSemAction const, add 'const' decorators to related functionsHeikki Linnakangas2024-08-06
| | | | | | | To make it more clear that these should never be modified. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
* parse_manifest: Use const char *Peter Eisentraut2024-06-21
| | | | | | | | This adapts the manifest parsing code to take advantage of the const-ified jsonapi. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
* jsonapi: Use const char *Peter Eisentraut2024-06-21
| | | | | | | | | | Apply const qualifiers to char * arguments and fields throughout the jsonapi. This allows the top-level APIs such as pg_parse_json_incremental() to declare their input argument as const. It also reduces the number of unconstify() calls. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
* jsonapi: Use size_tPeter Eisentraut2024-06-21
| | | | | | | | Use size_t instead of int for object sizes in the jsonapi. This makes the API better self-documenting. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
* Harmonize function parameter names for Postgres 17.Peter Geoghegan2024-06-12
| | | | | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 17 development. pg_bsd_indent still has a couple of similar inconsistencies, which I (pgeoghegan) have left untouched for now. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit 035ce1fe).
* Fix incorrect year in some copyright notices added this yearDavid Rowley2024-05-15
| | | | | | | A few patches that were written <= 2023 and committed in 2024 still contained 2023 copyright year. Fix that. Discussion: https://postgr.es/m/CAApHDvr5egyW3FmHbAg-Uq2p_Aizwco1Zjs6Vbq18KqN64-hRA@mail.gmail.com
* Pre-beta mechanical code beautification.Tom Lane2024-05-14
| | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. The pgindent part of this is pretty small, consisting mainly of fixing up self-inflicted formatting damage from patches that hadn't bothered to add their new typedefs to typedefs.list. In order to keep it from making anything worse, I manually added a dozen or so typedefs that appeared in the existing typedefs.list but not in the buildfarm's list. Perhaps we should formalize that, or better find a way to get those typedefs into the automatic list. pgperltidy is as opinionated as always, and reformat-dat-files too.
* Fix an assortment of typosDavid Rowley2024-05-04
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
* Fix typos and duplicate wordsDaniel Gustafsson2024-04-18
| | | | | | | | | | | | This fixes various typos, duplicated words, and tiny bits of whitespace mainly in code comments but also in docs. Author: Daniel Gustafsson <daniel@yesql.se> Author: Heikki Linnakangas <hlinnaka@iki.fi> Author: Alexander Lakhin <exclusion@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
* Fix some memory leaks associated with parsing json and manifestsAndrew Dunstan2024-04-12
| | | | | | | | | | | Coverity complained about not freeing some memory associated with incrementally parsing backup manifests. To fix that, provide and use a new shutdown function for the JsonManifestParseIncrementalState object, in line with a suggestion from Tom Lane. While analysing the problem, I noticed a buglet in freeing memory for incremental json lexers. To fix that remove a bogus condition on freeing the memory allocated for them.
* Speed up tail processing when hashing aligned C strings, take twoJohn Naylor2024-04-06
| | | | | | | | | | | | | | | | | After encountering the NUL terminator, the word-at-a-time loop exits and we must hash the remaining bytes. Previously we calculated the terminator's position and re-loaded the remaining bytes from the input string. This was slower than the unaligned case for very short strings. We already have all the data we need in a register, so let's just mask off the bytes we need and hash them immediately. In addition to endianness issues, the previous attempt upset valgrind in the way it computed the mask. Whether by accident or by wisdom, the author's proposed method passes locally with valgrind 3.22. Ants Aasma, with cosmetic adjustments by me Discussion: https://postgr.es/m/CANwKhkP7pCiW_5fAswLhs71-JKGEz1c1%2BPC0a_w1fwY4iGMqUA%40mail.gmail.com
* Teach fasthash_accum to use platform endianness for bytewise loadsJohn Naylor2024-04-06
| | | | | | | | | | | | | This function previously used a mix of word-wise loads and bytewise loads. The bytewise loads happened to be little-endian regardless of platform. This in itself is not a problem. However, a future commit will require the same result whether A) the input is loaded as a word with the relevent bytes masked-off, or B) the input is loaded one byte at a time. While at it, improve debuggability of the internal hash state. Discussion: https://postgr.es/m/CANWCAZZpuV1mES1mtSpAq8tWJewbrv4gEz6R_k4gzNG8GZ5gag%40mail.gmail.com
* Convert uses of hash_string_pointer to fasthash equivalentJohn Naylor2024-04-06
| | | | | | | | | | | | | | Remove duplicate hash_string_pointer() function definitions by creating a new inline function hash_string() for this purpose. This has the added advantage of avoiding strlen() calls when doing hash lookup. It's not clear how many of these are perfomance-sensitive enough to benefit from that, but the simplification is worth it on its own. Reviewed by Jeff Davis Discussion: https://postgr.es/m/CANWCAZbg_XeSeY0a_PqWmWqeRATvzTzUNYRLeT%2Bbzs%2BYQdC92g%40mail.gmail.com
* Add macro to disable address safety instrumentationJohn Naylor2024-04-06
| | | | | | | | | | | | | fasthash_accum_cstring_aligned() uses a technique, found in various strlen() implementations, to detect a string's NUL terminator by reading a word at at time. That triggers failures when testing with "-fsanitize=address", at least with frontend code. To enable using this function anywhere, add a function attribute macro to disable such testing. Reviewed by Jeff Davis Discussion: https://postgr.es/m/CANWCAZbwvp7oUEkbw-xP4L0_S_WNKq-J-ucP4RCNDPJnrakUPw%40mail.gmail.com
* Fix incorrect return typeJohn Naylor2024-04-06
| | | | | | | | | fasthash32() calculates a 32-bit hashcode, but the return type was uint64. Change to uint32. Noted by Jeff Davis Discussion: https://postgr.es/m/b16c93e6c736a422d4de668343515375664eb05d.camel%40j-davis.com
* Add support for incrementally parsing backup manifestsAndrew Dunstan2024-04-04
| | | | | | | | | | | | This adds the infrastructure for using the new non-recursive JSON parser in processing manifests. It's important that callers make sure that the last piece of json handed to the incremental manifest parser contains the entire last few lines of the manifest, including the checksum. Author: Andrew Dunstan Reviewed-By: Jacob Champion Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net
* Introduce a non-recursive JSON parserAndrew Dunstan2024-04-04
| | | | | | | | | | | | | | | | | | | | | | | This parser uses an explicit prediction stack, unlike the present recursive descent parser where the parser state is represented on the call stack. This difference makes the new parser suitable for use in incremental parsing of huge JSON documents that cannot be conveniently handled piece-wise by the recursive descent parser. One potential use for this will be in parsing large backup manifests associated with incremental backups. Because this parser is somewhat slower than the recursive descent parser, it is not replacing that parser, but is an additional parser available to callers. For testing purposes, if the build is done with -DFORCE_JSON_PSTACK, all JSON parsing is done with the non-recursive parser, in which case only trivial regression differences in error messages should be observed. Author: Andrew Dunstan Reviewed-By: Jacob Champion Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net
* Revert "Speed up tail processing when hashing aligned C strings"John Naylor2024-03-31
| | | | | | | This reverts commit 07f0f6abfc7f6c55cede528d9689dedecefc734a. This has shown failures on both Valgrind and big-endian machines, per members skink and pike.
* Speed up tail processing when hashing aligned C stringsJohn Naylor2024-03-31
| | | | | | | | | | | | | | | After encountering the NUL terminator, the word-at-a-time loop exits and we must hash the remaining bytes. Previously we calculated the terminator's position and re-loaded the remaining bytes from the input string. We already have all the data we need in a register, so let's just mask off the bytes we need and hash them immediately. The mask can be cheaply computed without knowing the terminator's position. We still need that position for the length calculation, but the CPU can now do that in parallel with other work, shortening the dependency chain. Ants Aasma and John Naylor Discussion: https://postgr.es/m/CANwKhkP7pCiW_5fAswLhs71-JKGEz1c1%2BPC0a_w1fwY4iGMqUA%40mail.gmail.com
* Add unicode_strtitle() for Unicode Default Case Conversion.Jeff Davis2024-03-29
| | | | | | | | | | | | | This brings the titlecasing implementation for the builtin provider out of formatting.c and into unicode_case.c, along with unicode_strlower() and unicode_strupper(). Accepts an arbitrary word boundary callback. Simple for now, but can be extended to support the Unicode Default Case Conversion algorithm with full case mapping. Discussion: https://postgr.es/m/3bc653b5d562ae9e2838b11cb696816c328a489a.camel@j-davis.com Reviewed-by: Peter Eisentraut
* Add functions to generate random numbers in a specified range.Dean Rasheed2024-03-27
| | | | | | | | | | | | | | | | | | | | | | | | This adds 3 new variants of the random() function: random(min integer, max integer) returns integer random(min bigint, max bigint) returns bigint random(min numeric, max numeric) returns numeric Each returns a random number x in the range min <= x <= max. For the numeric function, the number of digits after the decimal point is equal to the number of digits that "min" or "max" has after the decimal point, whichever has more. The main entry points for these functions are in a new C source file. The existing random(), random_normal(), and setseed() functions are moved there too, so that they can all share the same PRNG state, which is kept private to that file. Dean Rasheed, reviewed by Jian He, David Zhang, Aleksander Alekseev, and Tomas Vondra. Discussion: https://postgr.es/m/CAEZATCV89Vxuq93xQdmc0t-0Y2zeeNQTdsjbmV7dyFBPykbV4Q@mail.gmail.com