| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
| |
Buildfarm member jacana (MinGW) has been complaining that
get_iso_localename is defined but not used. This is evidently
fallout from the recent removal of VS2013 support in pg_locale.c.
Rearrange the #ifs so that get_iso_localename and its subroutine
search_locale_enum won't get built on MinGW.
I also noticed that a comment in get_iso_localename cross-
referenced a comment in IsoLocaleName that isn't there anymore.
Put back what I think is the referenced material.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No members of the buildfarm are using this version of Visual Studio,
resulting in all the code cleaned up here as being mostly dead, and
VS2017 is the oldest version still supported.
More versions could be cut, but the gain would be minimal, while
removing only VS2013 has the advantage to remove from the core code all
the dependencies on the value defined by _MSC_VER, where compatibility
tweaks have accumulated across the years mostly around locales and
strtof(), so that's a nice isolated cleanup.
Note that this commit additionally allows a revert of 3154e16. The
versions of Visual Studio now supported range from 2015 to 2022.
Author: Michael Paquier
Reviewed-by: Juan José Santamaría Flecha, Tom Lane, Thomas Munro, Justin
Pryzby
Discussion: https://postgr.es/m/YoH2IMtxcS3ncWn+@paquier.xyz
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit bumps the runtime value of _WIN32_WINNT to be 0x0A00 for any
builds on Windows. Hence, this makes Windows 10 the minimal requirement
when running PostgreSQL under WIN32, be it for builds of Cygwin, MinGW
or Visual Studio.
The previous minimal runtime version was either Windows Vista when
building with at least Visual Studio 2015 or Windows XP for the rest.
Windows 10 is the most modern version supported by Microsoft, and per
discussion, as we don't have buildfarm members that run older versions
anymore, this is the minimal supported version that suits better for our
needs. This will actually make easier the development of some patches,
two being async I/O and large page handling by avoiding a lot of
compatibility gotchas, on platforms that have most likely few users
anyway.
It is possible to remove MIN_WINNT in win32.h and the macros
IsWindowsXXXOrGreater() that were used in the code at runtime to check
which version of Windows was getting used. The change in pg_locale.c
comes from Juan. Note that all my tests passed, and that the CI is
green. The buildfarm will quickly tell if this needs more adjustments.
Author: Michael Paquier, Juan José Santamaría Flecha
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/Yo7tHKD8VCkeNi71@paquier.xyz
|
|
|
|
|
|
|
|
|
|
| |
Per applicable standards, free() with a null pointer is a no-op.
Systems that don't observe that are ancient and no longer relevant.
Some PostgreSQL code already required this behavior, so this change
does not introduce any new requirements, just makes the code more
consistent.
Discussion: https://www.postgresql.org/message-id/flat/dac5d2d0-98f5-94d9-8e69-46da2413593d%40enterprisedb.com
|
|
|
|
|
| |
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
|
|
|
|
|
| |
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are usually not useful since users will use packaged
distributions and won't be interested in rebuilding their installation
from source. Also, we have only used these kinds of hints for some
features and in some places, not consistently throughout.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
createdb() didn't check for collation attributes validity, which has
to be done explicitly on ICU < 54. It also forgot to close the ICU collator
opened during the check which leaks some memory.
To fix both, add a new check_icu_locale() that does all the appropriate
verification and close the ICU collator.
initdb also had some partial check for ICU < 54. To have consistent error
reporting across major ICU versions, and get rid of the need to include ucol.h,
remove the partial check there. The backend will report an error if needed
during the post-boostrap iniitialization phase.
Author: Julien Rouhaud <julien.rouhaud@free.fr>
Discussion: https://www.postgresql.org/message-id/20220319041459.qqqiqh335sga5ezj@jrouhaud
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the option to use ICU as the default locale provider for
either the whole cluster or a database. New options for initdb,
createdb, and CREATE DATABASE are used to select this.
Since some (legacy) code still uses the libc locale facilities
directly, we still need to set the libc global locale settings even if
ICU is otherwise selected. So pg_database now has three
locale-related fields: the existing datcollate and datctype, which are
always set, and a new daticulocale, which is only set if ICU is
selected. A similar change is made in pg_collation for consistency,
but in that case, only the libc-related fields or the ICU-related
field is set, never both.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com
|
|
|
|
|
| |
Update a comment that assumed that libc collations don't support
versioning. Also improve an adjacent error message a bit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes the data type of the catalog fields datcollate, datctype,
collcollate, and collctype from name to text. There wasn't ever a
really good reason for them to be of type name; presumably this was
just carried over from when they were fixed-size fields in pg_control,
first into the corresponding pg_database fields, and then to
pg_collation. The values are not identifiers or object names, and we
don't ever look them up that way.
Changing to type text saves space in the typical case, since locale
names are typically only a few bytes long. But it is also possible
that an ICU locale name with several customization options appended
could be longer than 63 bytes, so this also enables that case, which
was previously probably broken.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, callers of pg_newlocale_from_collation() did not call it
if the collation was DEFAULT_COLLATION_OID and instead proceeded with
a pg_locale_t of 0. Instead, now we call it anyway and have it return
0 if the default collation was passed. It already did this, so we
just have to adjust the callers. This simplifies all the call sites
and also makes future enhancements easier.
After discussion and testing, the previous comment in pg_locale.c
about avoiding this for performance reasons may have been mistaken
since it was testing a very different patch version way back when.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/ed3baa81-7fac-7788-cc12-41e3f7917e34@enterprisedb.com
|
|
|
|
| |
Backpatch-through: 10
|
|
|
|
|
| |
Author: Lingjie Qiang
Discussion: https://postgr.es/m/OSAPR01MB71654E773F62AC88DC1FC8CC80669@OSAPR01MB7165.jpnprd01.prod.outlook.com
|
|
|
|
|
|
|
|
| |
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Design problems were discovered in the handling of composite types and
record types that would cause some relevant versions not to be recorded.
Misgivings were also expressed about the use of the pg_depend catalog
for this purpose. We're out of time for this release so we'll revert
and try again.
Commits reverted:
1bf946bd: Doc: Document known problem with Windows collation versions.
cf002008: Remove no-longer-relevant test case.
ef387bed: Fix bogus collation-version-recording logic.
0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>).
ff942057: Suppress "warning: variable 'collcollate' set but not used".
d50e3b1f: Fix assertion in collation version lookup.
f24b1569: Rethink extraction of collation dependencies.
257836a7: Track collation versions for indexes.
cd6f479e: Add pg_depend.refobjversion.
7d1297df: Remove pg_collation.collversion.
Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
|
|
|
|
|
|
|
| |
This reverts commit 9cf184cc0599b6e65e7e5ecd9d91cd42e278bcd8. Name
change less well received than anticipated.
Discussion: https://postgr.es/m/afcfb97e-88a1-a540-db95-6c573b93bc2b%40eisentraut.org
|
|
|
|
|
|
|
|
| |
The code paths for three different OSes finished up with three different
ways of excluding C[.xxx] and POSIX from consideration. Merge them.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
|
|
|
|
|
|
| |
The new name seems a bit more natural.
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
|
|
|
|
|
|
|
|
|
| |
Instead of an unsightly internal "cache lookup failed" message, just
return NULL for bad OIDs, as is the convention for other similar things.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
|
|
|
|
| |
Backpatch-through: 9.5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since at least 2001 we've used putenv() and avoided setenv(), on the
grounds that the latter was unportable and not in POSIX. However,
POSIX added it that same year, and by now the situation has reversed:
setenv() is probably more portable than putenv(), since POSIX now
treats the latter as not being a core function. And setenv() has
cleaner semantics too. So, let's reverse that old policy.
This commit adds a simple src/port/ implementation of setenv() for
any stragglers (we have one in the buildfarm, but I'd not be surprised
if that code is never used in the field). More importantly, extend
win32env.c to also support setenv(). Then, replace usages of putenv()
with setenv(), and get rid of some ad-hoc implementations of setenv()
wannabees.
Also, adjust our src/port/ implementation of unsetenv() to follow the
POSIX spec that it returns an error indicator, rather than returning
void as per the ancient BSD convention. I don't feel a need to make
all the call sites check for errors, but the portability stub ought
to match real-world practice.
Discussion: https://postgr.es/m/2065122.1609212051@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).
Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes. This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)
Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize. Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.
Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one. This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not. We might as well save a few
cycles by standardizing on "not".
Also improve the documentation for hash_create().
In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.
Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
|
|
|
|
|
|
|
|
| |
On FreeBSD 13, use querylocale() to read the current version of libc
collations. Similar to commits 352f6f2d for Windows and d5ac14f9 for
GNU/Linux.
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Buildfarm members that lack both HAVE_LOCALE_T and USE_ICU have been
complaining about pg_newlocale_from_collation's collcollate variable.
This is evidently fallout from commit 7d1297df0, which removed the
only usage outside those two #ifdef'd code paths. Mark the variable
pg_attribute_unused(), like its sibling collctype, which has been that
way for a long time.
|
|
|
|
|
|
|
| |
Commit 257836a7 included an assertion that a version lookup routine is
not trying to look up "C" or "POSIX", but that case is reachable with
the user-facing SQL function pg_collation_actual_version(). Remove the
assertion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accept that we can't get versions for such locale names for now. Users
will need to specify the newer language tag format to enable the
collation versioning feature. It's not clear that we can do automatic
conversion from the old style to the new style reliably enough for this
purpose.
Unfortunately, this means that collation versioning probably won't work
for the default collation unless you provide something like en-US at
initdb or CREATE DATABASE time (though, for reasons not yet understood,
it does seem to work on some systems). It'd be nice to find a better
solution, or document this quirk if we settle on it, but this should
unbreak the 3 failing build farm animals in the meantime.
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index. When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.
Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
This model couldn't be extended to cover the default collation, and
didn't have any information about the affected database objects when the
version changed. Remove, in preparation for a follow-up commit that
will add a new mechanism.
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They are equivalent, except that StrNCpy() zero-fills the entire
destination buffer instead of providing just one trailing zero. For
all but a tiny number of callers, that's just overhead rather than
being desirable.
Remove StrNCpy() as it is now unused.
In some cases, namestrcpy() is the more appropriate function to use.
While we're here, simplify the API of namestrcpy(): Remove the return
value, don't check for NULL input. Nothing was using that anyway.
Also, remove a few unused name-related functions.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/44f5e198-36f6-6cdb-7fa9-60e34784daae%402ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Visual Studio 2015 and later versions should still be able to do the same
as Visual Studio 2012, but the declaration of locale_name is missing in
_locale_t, causing the code compilation to fail, hence this falls back
instead on to enumerating all system locales by using EnumSystemLocalesEx
to find the required locale name. If the input argument is in Unix-style
then we can get ISO Locale name directly by using GetLocaleInfoEx() with
LCType as LOCALE_SNAME.
In passing, change the documentation references of the now obsolete links.
Note that this problem occurs only with NLS enabled builds.
Author: Juan José Santamaría Flecha, Davinder Singh and Amit Kapila
Reviewed-by: Ranier Vilela and Amit Kapila
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CAHzhFSFoJEWezR96um4-rg5W6m2Rj9Ud2CNZvV4NWc9tXV7aXQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
On Vista and later, use GetNLSVersionEx() to request collation version
information.
Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the documented restriction that collation providers must either
return NULL for all collations or non-NULL for all collations.
Use NULL for glibc collations like "C.UTF-8", which might otherwise lead
future proposed commits to force unnecessary index rebuilds.
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to_char() has long allowed the TM (translation mode) prefix to
specify output of translated month or day names; but that prefix
had no effect in input format strings. Now it does. to_date()
and to_timestamp() will now recognize the same month or day names
that to_char() would output for the same format code. Matching is
case-insensitive (per the active collation's notion of what that
means), just as it has always been for English month/day names
without the TM prefix.
(As per the discussion thread, there are lots of cases that this
feature will not handle, such as alternate day names. But being
able to accept what to_char() will output seems useful enough.)
In passing, fix some shaky English and violations of message
style guidelines in jsonpath errors for the .datetime() method,
which depends on this code.
Juan José Santamaría Flecha, reviewed and modified by me,
with other commentary from Alvaro Herrera, Tomas Vondra,
Arthur Zakirov, Peter Eisentraut, Mark Dilger.
Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move all the backend-only code that'd crept into wchar.c and encnames.c
into mbutils.c.
To remove the last few #ifdef dependencies from wchar.c and encnames.c,
also make the following changes:
* Adjust get_encoding_name_for_icu to return NULL, not throw an error,
for unsupported encodings. Its sole caller can perfectly well throw an
error instead. (While at it, I also made this function and its sibling
is_encoding_supported_by_icu proof against out-of-range encoding IDs.)
* Remove the overlength-name error condition from pg_char_to_encoding.
It's completely silly not to treat that just like any other
the-name-is-not-in-the-table case.
Also, get rid of pg_mic_mblen --- there's no obvious reason why
conv.c shouldn't call pg_mule_mblen instead.
Other than that, this is just code movement and comment-polishing with
no functional changes. Notably, I reordered declarations in pg_wchar.h
to show which functions are frontend-accessible and which are not.
Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
|
|
|
|
| |
Backpatch-through: update all files in master, backpatch legal files through 9.4
|
|
|
|
|
|
|
|
|
|
|
| |
Using glibc's version string to detect potential collation definition
changes is not 100% reliable, but it's better than nothing. Currently
this affects only collations explicitly provided by "libc". More work
will be needed to handle the default collation.
Author: Thomas Munro, based on a suggestion from Christoph Berg
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
|
|
|
|
|
|
| |
This reverts commit 9f90b1d08d796a925808b24f77f624a0ff682c77.
This needs some refinements in the pg_dump and pg_upgrade tests.
|
|
|
|
|
|
|
|
|
| |
Using glibc's version number to detect potential collation definition
changes is not 100% reliable, but it's better than nothing.
Author: Thomas Munro
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of d9dd406fe281d22d5238d3c26a7182543c711e74, we require MSVC 2013,
which means _MSC_VER >= 1800. This means that conditionals about
older versions of _MSC_VER can be removed or simplified.
Previous code was also in some cases handling MinGW, where _MSC_VER is
not defined at all, incorrectly, such as in pg_ctl.c and win32_port.h,
leading to some compiler warnings. This should now be handled better.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
|
|
|
|
|
|
|
|
| |
This fixes various typos in docs and comments, and removes some orphaned
definitions.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com
|
|
|
|
|
|
|
|
|
| |
This addresses some issues with unnecessary code comments, fixes various
typos in docs and comments, and removes some orphaned structures and
definitions.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
|
|
|
|
| |
Discussion: https://postgr.es/m/20190515183005.GA26486@alvherre.pgsql
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cache_locale_time (extraction of LC_TIME-related info) had never been
taught the lessons we previously learned about extraction of info related
to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught
PGLC_localeconv() that data coming out of localeconv() was in an encoding
determined by the relevant locale, but we didn't realize that there's a
similar issue with strftime(). And commit a4930e7ca hardened
PGLC_localeconv() against errors occurring partway through, but failed
to do likewise for cache_locale_time(). So, rearrange the latter
function to perform encoding conversion and not risk failure while
it's got the locales set to temporary values.
This time around I also changed PGLC_localeconv() to treat it as FATAL
if it can't restore the previous settings of the locale values. There
is no reason (except possibly OOM) for that to fail, and proceeding with
the wrong locale values seems like a seriously bad idea --- especially
on Windows where we have to also temporarily change LC_CTYPE. Also,
protect against the possibility that we can't identify the codeset
reported for LC_MONETARY or LC_NUMERIC; rather than just failing,
try to validate the data without conversion.
The user-visible symptom this fixes is that if LC_TIME is set to a locale
name that implies an encoding different from the database encoding,
non-ASCII localized day and month names would be retrieved in the wrong
encoding, leading to either unexpected encoding-conversion error reports
or wrong output from to_char(). The other possible failure modes are
unlikely enough that we've not seen reports of them, AFAIK.
The encoding conversion problems do not manifest on Windows, since
we'd already created special-case code to handle that issue there.
Per report from Juan José Santamaría Flecha. Back-patch to all
supported versions.
Juan José Santamaría Flecha and Tom Lane
Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a flag "deterministic" to collations. If that is false,
such a collation disables various optimizations that assume that
strings are equal only if they are byte-wise equal. That then allows
use cases such as case-insensitive or accent-insensitive comparisons
or handling of strings with different Unicode normal forms.
This functionality is only supported with the ICU provider. At least
glibc doesn't appear to have any locales that work in a
nondeterministic way, so it's not worth supporting this for the libc
provider.
The term "deterministic comparison" in this context is from Unicode
Technical Standard #10
(https://unicode.org/reports/tr10/#Deterministic_Comparison).
This patch makes changes in three areas:
- CREATE COLLATION DDL changes and system catalog changes to support
this new flag.
- Many executor nodes and auxiliary code are extended to track
collations. Previously, this code would just throw away collation
information, because the eventually-called user-defined functions
didn't use it since they only cared about equality, which didn't
need collation information.
- String data type functions that do equality comparisons and hashing
are changed to take the (non-)deterministic flag into account. For
comparison, this just means skipping various shortcuts and tie
breakers that use byte-wise comparison. For hashing, we first need
to convert the input string to a canonical "sort key" using the ICU
analogue of strxfrm().
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unrecognized attribute names are supposed to be ignored. But the code
would error out on an unrecognized attribute value even if it did not
recognize the attribute name. So unrecognized attributes wouldn't
really be ignored unless the value happened to be one that matched a
recognized value. This would break some important cases where the
attribute would be processed by ucol_open() directly. Fix that and
add a test case.
The restructured code should also avoid compiler warnings about
initializing a UColAttribute value to -1, because the type might be an
unsigned enum. (reported by Andres Freund)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting in ICU 54, collation customization attributes can be
specified in the locale string, for example
"@colStrength=primary;colCaseLevel=yes". Add support for this for
older ICU versions as well, by adding some minimal parsing of the
attributes in the locale string and calling ucol_setAttribute() on
them. This is essentially what never ICU versions do internally in
ucol_open(). This was we can offer this functionality in a consistent
way in all ICU versions supported by PostgreSQL.
Also add some tests for ICU collation customization.
Reported-by: Daniel Verite <daniel@manitou-mail.org>
Discussion: https://www.postgresql.org/message-id/0270ebd4-f67c-8774-1a5a-91adfb9bb41f@2ndquadrant.com
|
|
|
|
| |
Backpatch-through: certain files through 9.4
|
|
|
|
| |
Backpatch-through: certain files through 9.3
|