aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/pg_locale.c
Commit message (Collapse)AuthorAge
...
* pg_collation_actual_version() -> pg_collation_current_version().Thomas Munro2021-02-22
| | | | | | The new name seems a bit more natural. Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
* Hide internal error for pg_collation_actual_version(<bad OID>).Thomas Munro2021-02-22
| | | | | | | | | Instead of an unsightly internal "cache lookup failed" message, just return NULL for bad OIDs, as is the convention for other similar things. Reported-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Use setenv() in preference to putenv().Tom Lane2020-12-30
| | | | | | | | | | | | | | | | | | | | | | | | Since at least 2001 we've used putenv() and avoided setenv(), on the grounds that the latter was unportable and not in POSIX. However, POSIX added it that same year, and by now the situation has reversed: setenv() is probably more portable than putenv(), since POSIX now treats the latter as not being a core function. And setenv() has cleaner semantics too. So, let's reverse that old policy. This commit adds a simple src/port/ implementation of setenv() for any stragglers (we have one in the buildfarm, but I'd not be surprised if that code is never used in the field). More importantly, extend win32env.c to also support setenv(). Then, replace usages of putenv() with setenv(), and get rid of some ad-hoc implementations of setenv() wannabees. Also, adjust our src/port/ implementation of unsetenv() to follow the POSIX spec that it returns an error indicator, rather than returning void as per the ancient BSD convention. I don't feel a need to make all the call sites check for errors, but the portability stub ought to match real-world practice. Discussion: https://postgr.es/m/2065122.1609212051@sss.pgh.pa.us
* Improve hash_create()'s API for some added robustness.Tom Lane2020-12-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
* Add collation versions for FreeBSD.Thomas Munro2020-11-20
| | | | | | | | On FreeBSD 13, use querylocale() to read the current version of libc collations. Similar to commits 352f6f2d for Windows and d5ac14f9 for GNU/Linux. Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
* Suppress "warning: variable 'collcollate' set but not used".Tom Lane2020-11-15
| | | | | | | | | Buildfarm members that lack both HAVE_LOCALE_T and USE_ICU have been complaining about pg_newlocale_from_collation's collcollate variable. This is evidently fallout from commit 7d1297df0, which removed the only usage outside those two #ifdef'd code paths. Mark the variable pg_attribute_unused(), like its sibling collctype, which has been that way for a long time.
* Fix assertion in collation version lookup.Thomas Munro2020-11-08
| | | | | | | Commit 257836a7 included an assertion that a version lookup routine is not trying to look up "C" or "POSIX", but that case is reachable with the user-facing SQL function pg_collation_actual_version(). Remove the assertion.
* Tolerate version lookup failure for old style Windows locale names.Thomas Munro2020-11-04
| | | | | | | | | | | | | | | | | | | Accept that we can't get versions for such locale names for now. Users will need to specify the newer language tag format to enable the collation versioning feature. It's not clear that we can do automatic conversion from the old style to the new style reliably enough for this purpose. Unfortunately, this means that collation versioning probably won't work for the default collation unless you provide something like en-US at initdb or CREATE DATABASE time (though, for reasons not yet understood, it does seem to work on some systems). It'd be nice to find a better solution, or document this quirk if we settle on it, but this should unbreak the 3 failing build farm animals in the meantime. Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
* Track collation versions for indexes.Thomas Munro2020-11-03
| | | | | | | | | | | | | | | Record the current version of dependent collations in pg_depend when creating or rebuilding an index. When accessing the index later, warn that the index may be corrupted if the current version doesn't match. Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe, Michael Paquier, Robert Haas, Tom Lane and others for very helpful discussion. Author: Thomas Munro <thomas.munro@gmail.com> Author: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions) Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
* Remove pg_collation.collversion.Thomas Munro2020-11-03
| | | | | | | | | | | | This model couldn't be extended to cover the default collation, and didn't have any information about the affected database objects when the version changed. Remove, in preparation for a follow-up commit that will add a new mechanism. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
* Replace remaining StrNCpy() by strlcpy()Peter Eisentraut2020-08-10
| | | | | | | | | | | | | | | | | They are equivalent, except that StrNCpy() zero-fills the entire destination buffer instead of providing just one trailing zero. For all but a tiny number of callers, that's just overhead rather than being desirable. Remove StrNCpy() as it is now unused. In some cases, namestrcpy() is the more appropriate function to use. While we're here, simplify the API of namestrcpy(): Remove the return value, don't check for NULL input. Nothing was using that anyway. Also, remove a few unused name-related functions. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/44f5e198-36f6-6cdb-7fa9-60e34784daae%402ndquadrant.com
* Fix the MSVC build for versions 2015 and later.Amit Kapila2020-05-14
| | | | | | | | | | | | | | | | | | | Visual Studio 2015 and later versions should still be able to do the same as Visual Studio 2012, but the declaration of locale_name is missing in _locale_t, causing the code compilation to fail, hence this falls back instead on to enumerating all system locales by using EnumSystemLocalesEx to find the required locale name.  If the input argument is in Unix-style then we can get ISO Locale name directly by using GetLocaleInfoEx() with LCType as LOCALE_SNAME. In passing, change the documentation references of the now obsolete links. Note that this problem occurs only with NLS enabled builds. Author: Juan José Santamaría Flecha, Davinder Singh and Amit Kapila Reviewed-by: Ranier Vilela and Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CAHzhFSFoJEWezR96um4-rg5W6m2Rj9Ud2CNZvV4NWc9tXV7aXQ@mail.gmail.com
* Add collation versions for Windows.Thomas Munro2020-03-25
| | | | | | | | On Vista and later, use GetNLSVersionEx() to request collation version information. Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
* Allow NULL version for individual collations.Thomas Munro2020-03-25
| | | | | | | | | | | Remove the documented restriction that collation providers must either return NULL for all collations or non-NULL for all collations. Use NULL for glibc collations like "C.UTF-8", which might otherwise lead future proposed commits to force unnecessary index rebuilds. Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
* Allow to_date/to_timestamp to recognize non-English month/day names.Tom Lane2020-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | to_char() has long allowed the TM (translation mode) prefix to specify output of translated month or day names; but that prefix had no effect in input format strings. Now it does. to_date() and to_timestamp() will now recognize the same month or day names that to_char() would output for the same format code. Matching is case-insensitive (per the active collation's notion of what that means), just as it has always been for English month/day names without the TM prefix. (As per the discussion thread, there are lots of cases that this feature will not handle, such as alternate day names. But being able to accept what to_char() will output seems useful enough.) In passing, fix some shaky English and violations of message style guidelines in jsonpath errors for the .datetime() method, which depends on this code. Juan José Santamaría Flecha, reviewed and modified by me, with other commentary from Alvaro Herrera, Tomas Vondra, Arthur Zakirov, Peter Eisentraut, Mark Dilger. Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
* Rationalize code placement between wchar.c, encnames.c, and mbutils.c.Tom Lane2020-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | Move all the backend-only code that'd crept into wchar.c and encnames.c into mbutils.c. To remove the last few #ifdef dependencies from wchar.c and encnames.c, also make the following changes: * Adjust get_encoding_name_for_icu to return NULL, not throw an error, for unsupported encodings. Its sole caller can perfectly well throw an error instead. (While at it, I also made this function and its sibling is_encoding_supported_by_icu proof against out-of-range encoding IDs.) * Remove the overlength-name error condition from pg_char_to_encoding. It's completely silly not to treat that just like any other the-name-is-not-in-the-table case. Also, get rid of pg_mic_mblen --- there's no obvious reason why conv.c shouldn't call pg_mule_mblen instead. Other than that, this is just code movement and comment-polishing with no functional changes. Notably, I reordered declarations in pg_wchar.h to show which functions are frontend-accessible and which are not. Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Use libc version as a collation version on glibc systems.Thomas Munro2019-10-16
| | | | | | | | | | | Using glibc's version string to detect potential collation definition changes is not 100% reliable, but it's better than nothing. Currently this affects only collations explicitly provided by "libc". More work will be needed to handle the default collation. Author: Thomas Munro, based on a suggestion from Christoph Berg Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
* Revert "Use libc version as a collation version on glibc systems."Peter Eisentraut2019-10-09
| | | | | | This reverts commit 9f90b1d08d796a925808b24f77f624a0ff682c77. This needs some refinements in the pg_dump and pg_upgrade tests.
* Use libc version as a collation version on glibc systems.Peter Eisentraut2019-10-09
| | | | | | | | | Using glibc's version number to detect potential collation definition changes is not 100% reliable, but it's better than nothing. Author: Thomas Munro Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
* Remove some code for old unsupported versions of MSVCPeter Eisentraut2019-10-08
| | | | | | | | | | | | As of d9dd406fe281d22d5238d3c26a7182543c711e74, we require MSVC 2013, which means _MSC_VER >= 1800. This means that conditionals about older versions of _MSC_VER can be removed or simplified. Previous code was also in some cases handling MinGW, where _MSC_VER is not defined at all, incorrectly, such as in pg_ctl.c and win32_port.h, leading to some compiler warnings. This should now be handled better. Reviewed-by: Michael Paquier <michael@paquier.xyz>
* Fix inconsistencies and typos in the tree, take 11Michael Paquier2019-08-19
| | | | | | | | This fixes various typos in docs and comments, and removes some orphaned definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com
* Fix inconsistencies and typos in the tree, take 10Michael Paquier2019-08-13
| | | | | | | | | This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
* More message style fixesAlvaro Herrera2019-05-16
| | | | Discussion: https://postgr.es/m/20190515183005.GA26486@alvherre.pgsql
* Repair assorted issues in locale data extraction.Tom Lane2019-04-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cache_locale_time (extraction of LC_TIME-related info) had never been taught the lessons we previously learned about extraction of info related to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught PGLC_localeconv() that data coming out of localeconv() was in an encoding determined by the relevant locale, but we didn't realize that there's a similar issue with strftime(). And commit a4930e7ca hardened PGLC_localeconv() against errors occurring partway through, but failed to do likewise for cache_locale_time(). So, rearrange the latter function to perform encoding conversion and not risk failure while it's got the locales set to temporary values. This time around I also changed PGLC_localeconv() to treat it as FATAL if it can't restore the previous settings of the locale values. There is no reason (except possibly OOM) for that to fail, and proceeding with the wrong locale values seems like a seriously bad idea --- especially on Windows where we have to also temporarily change LC_CTYPE. Also, protect against the possibility that we can't identify the codeset reported for LC_MONETARY or LC_NUMERIC; rather than just failing, try to validate the data without conversion. The user-visible symptom this fixes is that if LC_TIME is set to a locale name that implies an encoding different from the database encoding, non-ASCII localized day and month names would be retrieved in the wrong encoding, leading to either unexpected encoding-conversion error reports or wrong output from to_char(). The other possible failure modes are unlikely enough that we've not seen reports of them, AFAIK. The encoding conversion problems do not manifest on Windows, since we'd already created special-case code to handle that issue there. Per report from Juan José Santamaría Flecha. Back-patch to all supported versions. Juan José Santamaría Flecha and Tom Lane Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com
* Collations with nondeterministic comparisonPeter Eisentraut2019-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
* Fix bug in support for collation attributes on older ICU versionsPeter Eisentraut2019-03-19
| | | | | | | | | | | | | | Unrecognized attribute names are supposed to be ignored. But the code would error out on an unrecognized attribute value even if it did not recognize the attribute name. So unrecognized attributes wouldn't really be ignored unless the value happened to be one that matched a recognized value. This would break some important cases where the attribute would be processed by ucol_open() directly. Fix that and add a test case. The restructured code should also avoid compiler warnings about initializing a UColAttribute value to -1, because the type might be an unsigned enum. (reported by Andres Freund)
* Add support for collation attributes on older ICU versionsPeter Eisentraut2019-03-17
| | | | | | | | | | | | | | | | Starting in ICU 54, collation customization attributes can be specified in the locale string, for example "@colStrength=primary;colCaseLevel=yes". Add support for this for older ICU versions as well, by adding some minimal parsing of the attributes in the locale string and calling ucol_setAttribute() on them. This is essentially what never ICU versions do internally in ucol_open(). This was we can offer this functionality in a consistent way in all ICU versions supported by PostgreSQL. Also add some tests for ICU collation customization. Reported-by: Daniel Verite <daniel@manitou-mail.org> Discussion: https://www.postgresql.org/message-id/0270ebd4-f67c-8774-1a5a-91adfb9bb41f@2ndquadrant.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Assume wcstombs(), towlower(), and sibling functions are always present.Tom Lane2017-09-22
| | | | | | | | | | | | | | | These functions are required by SUS v2, which is our minimum baseline for Unix platforms, and are present on all interesting Windows versions as well. Even our oldest buildfarm members have them. Thus, we were not testing the "!USE_WIDE_UPPER_LOWER" code paths, which explains why the bug fixed in commit e6023ee7f escaped detection. Per discussion, there seems to be no more real-world value in maintaining this option. Hence, remove the configure-time tests for wcstombs() and towlower(), remove the USE_WIDE_UPPER_LOWER symbol, and remove all the !USE_WIDE_UPPER_LOWER code. There's not actually all that much of the latter, but simplifying the #if nests is a win in itself. Discussion: https://postgr.es/m/20170921052928.GA188913@rfd.leadboat.com
* Improve dubious memory management in pg_newlocale_from_collation().Tom Lane2017-09-20
| | | | | | | | | | | | | | | | | | | | | pg_newlocale_from_collation() used malloc() and strdup() directly, which is generally not per backend coding style, and it didn't bother to check for failure results, but would just SIGSEGV instead. Also, if one of the numerous error checks in the middle of the function failed, the already-allocated memory would be leaked permanently. Admittedly, it's not a lot of memory, but it could build up if this function were called repeatedly for a bad collation. The first two problems are easily cured by palloc'ing in TopMemoryContext instead of calling libc directly. We can fairly easily dodge the leakage problem for the struct pg_locale_struct by filling in a temporary variable and allocating permanent storage only once we reach the bottom of the function. It's harder to get rid of the potential leakage for ICU's copy of the collcollate string, but at least that's only allocated after most of the error checks; so live with that aspect. Back-patch to v10 where this code came in, with one or another of the ICU patches.
* Second try at getting useful errors out of newlocale/_create_locale.Tom Lane2017-08-01
| | | | | | | | | | | | | | | | | | The early buildfarm returns for commit 1e165d05f are pretty awful: not only does Windows not return a useful error, but it looks like a lot of Unix-ish platforms don't either. Given the number of different errnos seen so far, guess that what's really going on is that some newlocale() implementations fail to set errno at all. Hence, let's try zeroing errno just before newlocale() and then if it's still zero report as though it's ENOENT. That should cover the Windows case too. It's clear that we'll have to drop the regression test case, unless we want to maintain a separate expected-file for platforms without HAVE_LOCALE_T. But I'll leave it there awhile longer to see if this actually improves matters or not. Discussion: https://postgr.es/m/CAKKotZS-wcDcofXDCH=sidiuajE+nqHn2CGjLLX78anyDmi3gQ@mail.gmail.com
* Try to deliver a sane message for _create_locale() failure on Windows.Tom Lane2017-08-01
| | | | | | | | | | | | | | | | We were just printing errno, which is certainly not gonna work on Windows. Now, it's not entirely clear from Microsoft's documentation whether _create_locale() adheres to standard Windows error reporting conventions, but let's assume it does and try to map the GetLastError result to an errno. If this turns out not to work, probably the best thing to do will be to assume the error is always ENOENT on Windows. This is a longstanding bug, but given the lack of previous field complaints, I'm not excited about back-patching it. Per report from Murtuza Zabuawala. Discussion: https://postgr.es/m/CAKKotZS-wcDcofXDCH=sidiuajE+nqHn2CGjLLX78anyDmi3gQ@mail.gmail.com
* Refine memory allocation in ICU conversionsPeter Eisentraut2017-07-01
| | | | | | | | The simple calculations done to estimate the size of the output buffers for ucnv_fromUChars() and ucnv_toUChars() could overflow int32_t for large strings. To avoid that, go the long way and run the function first without an output buffer to get the correct output buffer size requirement.
* Prohibit creating ICU collation with different ctypePeter Eisentraut2017-06-30
| | | | | | ICU does not support "collate" and "ctype" being different, so the collctype catalog column is ignored. But for catalog neatness, ensure that they are the same.
* Fix memory leakage in ICU encoding conversion, and other code review.Tom Lane2017-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Callers of icu_to_uchar() neglected to pfree the result string when done with it. This results in catastrophic memory leaks in varstr_cmp(), because of our prevailing assumption that btree comparison functions don't leak memory. For safety, make all the call sites clean up leaks, though I suspect that we could get away without it in formatting.c. I audited callers of icu_from_uchar() as well, but found no places that seemed to have a comparable issue. Add function API specifications for icu_to_uchar() and icu_from_uchar(); the lack of any thought-through specification is perhaps not unrelated to the existence of this bug in the first place. Fix icu_to_uchar() to guarantee a nul-terminated result; although no existing caller appears to care, the fact that it would have been nul-terminated except in extreme corner cases seems ideally designed to bite someone on the rear someday. Fix ucnv_fromUChars() destCapacity argument --- in the worst case, that could perhaps have led to a non-nul-terminated result, too. Fix icu_from_uchar() to have a more reasonable definition of the function result --- no callers are actually paying attention, so this isn't a live bug, but it's certainly sloppily designed. Const-ify icu_from_uchar()'s input string for consistency. That is not the end of what needs to be done to these functions, but it's as much as I have the patience for right now. Discussion: https://postgr.es/m/1955.1498181798@sss.pgh.pa.us
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Preventive maintenance in advance of pgindent run.Tom Lane2017-05-16
| | | | | | | | | | Reformat various places in which pgindent will make a mess, and fix a few small violations of coding style that I happened to notice while perusing the diffs from a pgindent dry run. There is one actual bug fix here: the need-to-enlarge-the-buffer code path in icu_convert_case was obviously broken. Perhaps it's unreachable in our usage? Or maybe this is just sadly undertested.
* ICU supportPeter Eisentraut2017-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library. The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use. Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations. initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended. Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided. ICU support is enabled by configure --with-icu. Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se>
* Remove useless duplicate inclusions of system header files.Tom Lane2017-02-25
| | | | | | | | | | | | | | | | c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Fix PGLC_localeconv() to handle errors better.Tom Lane2016-11-21
| | | | | | | | | | | | | | | | | | | | | | | The code was intentionally not very careful about leaking strdup'd strings in case of an error. That was forgivable probably, but it also failed to notice strdup() failures, which could lead to subsequent null-pointer-dereference crashes, since many callers unsurprisingly didn't check for null pointers in the struct lconv fields. An even worse problem is that it could throw error while we were setlocale'd to a non-C locale, causing unwanted behavior in subsequent libc calls. Rewrite to ensure that we cannot throw elog(ERROR) until after we've restored the previous locale settings, or at least attempted to. (I'm sorely tempted to make restore failure be a FATAL error, but will refrain for the moment.) Having done that, it's not much more work to ensure that we clean up strdup'd storage on the way out, too. This code is substantially the same in all supported branches, so back-patch all the way. Michael Paquier and Tom Lane Discussion: <CAB7nPqRMbGqa_mesopcn4MPyTs34eqtVEK7ELYxvvV=oqS00YA@mail.gmail.com>
* Avoid multiple free_struct_lconv() calls on same data.Tom Lane2016-02-28
| | | | | | | | | | | A failure partway through PGLC_localeconv() led to a situation where the next call would call free_struct_lconv() a second time, leading to free() on already-freed strings, typically leading to a core dump. Add a flag to remember whether we need to do that. Per report from Thom Brown. His example case only provokes the failure as far back as 9.4, but nonetheless this code is obviously broken, so back-patch to all supported branches.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Revoke support for strxfrm() that write past the specified array length.Noah Misch2015-07-08
| | | | | | | This formalizes a decision implicit in commit 4ea51cdfe85ceef8afabceb03c446574daa0ac23 and adds clean detection of affected systems. Vendor updates are available for each such known bug. Back-patch to 9.5, where the aforementioned commit first appeared.