diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2020-03-30 11:14:58 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2020-03-30 11:14:58 -0400 |
commit | de5b9db3644d5262940c72bab174b43f0bc696cd (patch) | |
tree | 1179f61175b574be954b389718ffdff619175644 /src | |
parent | 756781cc8b6b9c3edb73dc617048605454c4103e (diff) | |
download | postgresql-de5b9db3644d5262940c72bab174b43f0bc696cd.tar.gz postgresql-de5b9db3644d5262940c72bab174b43f0bc696cd.zip |
Be more careful about extracting encoding from locale strings on Windows.
GetLocaleInfoEx() can fail on strings that setlocale() was perfectly
happy with. A common way for that to happen is if the locale string
is actually a Unix-style string, say "et_EE.UTF-8". In that case,
what's after the dot is an encoding name, not a Windows codepage number;
blindly treating it as a codepage number led to failure, with a fairly
silly error message. Hence, check to see if what's after the dot is
all digits, and if not, treat it as a literal encoding name rather than
a codepage number. This will do the right thing with many Unix-style
locale strings, and produce a more sensible error message otherwise.
Somewhat independently of that, treat a zero (CP_ACP) result from
GetLocaleInfoEx() as meaning that we must use UTF-8 encoding.
Back-patch to all supported branches.
Juan José SantamarÃa Flecha
Discussion: https://postgr.es/m/24905.1585445371@sss.pgh.pa.us
Diffstat (limited to 'src')
-rw-r--r-- | src/port/chklocale.c | 29 |
1 files changed, 24 insertions, 5 deletions
diff --git a/src/port/chklocale.c b/src/port/chklocale.c index 9b753c85e97..5623936d41e 100644 --- a/src/port/chklocale.c +++ b/src/port/chklocale.c @@ -239,25 +239,44 @@ win32_langinfo(const char *ctype) { r = malloc(16); /* excess */ if (r != NULL) - sprintf(r, "CP%u", cp); + { + /* + * If the return value is CP_ACP that means no ANSI code page is + * available, so only Unicode can be used for the locale. + */ + if (cp == CP_ACP) + strcpy(r, "utf8"); + else + sprintf(r, "CP%u", cp); + } } else #endif { /* - * Locale format on Win32 is <Language>_<Country>.<CodePage> . For - * example, English_United States.1252. + * Locale format on Win32 is <Language>_<Country>.<CodePage>. For + * example, English_United States.1252. If we see digits after the + * last dot, assume it's a codepage number. Otherwise, we might be + * dealing with a Unix-style locale string; Windows' setlocale() will + * take those even though GetLocaleInfoEx() won't, so we end up here. + * In that case, just return what's after the last dot and hope we can + * find it in our table. */ codepage = strrchr(ctype, '.'); if (codepage != NULL) { - int ln; + size_t ln; codepage++; ln = strlen(codepage); r = malloc(ln + 3); if (r != NULL) - sprintf(r, "CP%s", codepage); + { + if (strspn(codepage, "0123456789") == ln) + sprintf(r, "CP%s", codepage); + else + strcpy(r, codepage); + } } } |