| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
characters in all cases. Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release). Embedded
zero (null) bytes will be rejected as well. The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated. Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.
Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.
Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
|
|
|
|
|
|
|
|
|
|
| |
fmgr_info(), in the TopMemoryContext. I couldn't see that the code
actually leaked, but in general I think it's fragile to assume that
pfree'ing an FmgrInfo along with its fn_extra field is enough to
reclaim all the resources allocated by fmgr_info(). I changed the
code to do its allocations in a new child context of
TopMemoryContext, MbProcContext. When we want to release the
allocations we can just reset the context, which is cleaner.
|
|
|
|
|
|
| |
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
optional arguments as text input functions, ie, typioparam OID and
atttypmod. Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput. This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
|
|
|
|
|
|
|
|
|
| |
UNICODE => UTF8
ALT => WIN866
WIN => WIN1251
TCVN => WIN1258
The old codes continue to work.
|
|
|
|
|
|
|
|
|
|
| |
- remove another senseless "extern" keyword that was applied to a
function definition
- change a foo more function signatures from "some_type foo()" to
"some_type foo(void)"
- rewrite another K&R style function definition
- make the type of the "action" function pointer in the KeyWord struct
in src/backend/utils/adt/formatting.c more precise
|
| |
|
|
|
|
|
| |
Still some works needed:
- UTF-8, MULE_INTERNAL always returns 1
|
| |
|
|
|
|
|
| |
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
|
| |
|
| |
|
|
|
|
| |
test in there...
|
|
|
|
|
|
|
| |
in all cases, leaked TopMemoryContext memory in others. Make the
interaction between SetClientEncoding and InitializeClientEncoding
cleaner and better documented. I suspect these changes should be
back-patched into 7.3, but will wait on Tatsuo's verification.
|
|
|
|
|
|
| |
PostgreSQL source code.
Neil Conway
|
|
|
|
|
|
|
|
| |
correctly. See following thread for more details.
Subject: [HACKERS] client_encoding directive is ignored in postgresql.conf
From: Tatsuo Ishii <t-ishii@sra.co.jp>
Date: Wed, 29 Jan 2003 22:24:04 +0900 (JST)
|
| |
|
|
|
|
|
|
| |
where it's safe to do database access. Along the way, fix core dump
for 'DEFAULT' parameters to CREATE DATABASE. initdb forced due to
change in pg_proc entry.
|
| |
|
|
|
|
| |
referring to "multibyte" where it really means character encoding.
|
| |
|
|
|
|
| |
It pfree() wrong pointer.
|
| |
|
|
|
|
|
| |
executed to prevent database access while performing encoding
conversion.
|
|
|
|
| |
Add regression test
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
conversion procs and conversions are added in initdb. Currently
supported conversions are:
UTF-8(UNICODE) <--> SQL_ASCII, ISO-8859-1 to 16, EUC_JP, EUC_KR,
EUC_CN, EUC_TW, SJIS, BIG5, GBK, GB18030, UHC,
JOHAB, TCVN
EUC_JP <--> SJIS
EUC_TW <--> BIG5
MULE_INTERNAL <--> EUC_JP, SJIS, EUC_TW, BIG5
Note that initial contents of pg_conversion system catalog are created
in the initdb process. So doing initdb required is ideal, it's
possible to add them to your databases by hand, however. To accomplish
this:
psql -f your_postgresql_install_path/share/conversion_create.sql your_database
So I did not bump up the version in cataversion.h.
TODO:
Add more conversion procs
Add [CASCADE|RESTRICT] to DROP CONVERSION
Add tuples to pg_depend
Add regression tests
Write docs
Add SQL99 CONVERT command?
--
Tatsuo Ishii
|
| |
|
|
|
|
|
|
| |
o they sometimes returns a result garbage string appended.
o they do not work if client encoding is different from server
encoding
|
|
|
|
| |
tests pass.
|
|
|
|
|
|
| |
side encoding name. This is necessary for client API's such as JDBC
to perform correct encoding conversions. See my email "[HACKERS]
pg_client_encoding" 10 Sep 2001.
|
|
|
|
| |
Avoid use of prototype-less function pointers in MB code.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
> > ClientEncoding to SQL_ASCII (like default DatabaseEncoding). Bruce, can
> > you change it? It's one line change. Again thanks.
Forget it! A default client encoding must be set by actual database encoding...
Please apply the small attached patch that solve it better.
Karel Zak
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-------------------------------------------------------------------
Subject: Re: [PATCHES] encoding names
From: Karel Zak <zakkr@zf.jcu.cz>
To: Peter Eisentraut <peter_e@gmx.net>
Cc: pgsql-patches <pgsql-patches@postgresql.org>
Date: Fri, 31 Aug 2001 17:24:38 +0200
On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
> > - convert encoding 'name' to 'id'
>
> I thought we decided not to add functions returning "new" names until we
> know exactly what the new names should be, and pending schema
Ok, the patch not to add functions.
> better
>
> ...(): encoding name too long
Fixed.
I found new bug in command/variable.c in parse_client_encoding(), nobody
probably never see this error:
if (pg_set_client_encoding(encoding))
{
elog(ERROR, "Conversion between %s and %s is not supported",
value, GetDatabaseEncodingName());
}
because pg_set_client_encoding() returns -1 for error and 0 as true.
It's fixed too.
IMHO it can be apply.
Karel
PS:
* following files are renamed:
src/utils/mb/Unicode/KOI8_to_utf8.map -->
src/utils/mb/Unicode/koi8r_to_utf8.map
src/utils/mb/Unicode/WIN_to_utf8.map -->
src/utils/mb/Unicode/win1251_to_utf8.map
src/utils/mb/Unicode/utf8_to_KOI8.map -->
src/utils/mb/Unicode/utf8_to_koi8r.map
src/utils/mb/Unicode/utf8_to_WIN.map -->
src/utils/mb/Unicode/utf8_to_win1251.map
* new file:
src/utils/mb/encname.c
* removed file:
src/utils/mb/common.c
--
Karel Zak <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
|
| |
|
|
|
|
| |
* Make n of CHAR(n)/VARCHAR(n) the number of letters, not bytes
|
|
|
|
|
| |
converting char* strings to type 'name'. Imagine my surprise when 7.1
release coredumped upon start when compiled --enable-multibyte ...
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of a counted input string. Marinos Yannikos' recent crash report turns
out to be due to applying pg_ascii2wchar_with_len to a TEXT object that
is smack up against the end of memory. This is the second just-barely-
reproducible bug report I have seen that traces to some bit of code
fetching one more byte than it is allowed to. Let's be more careful
out there, boys and girls.
While at it, I changed the code to not risk a similar crash when there
is a truncated multibyte character at the end of an input string. The
output in this case might not be the most reasonable output possible;
if anyone wants to improve it further, step right up...
|
|
|
|
|
|
|
|
|
|
|
| |
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively. By default, only include files meant for frontend use are
installed into the installation include directory. There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.
|
|
|
|
|
|
|
|
|
|
| |
cloned, rather than always cloning template1. Modify initdb to generate
two identical databases rather than one, template0 and template1.
Connections to template0 are disallowed, so that it will always remain
in its virgin as-initdb'd state. pg_dumpall now dumps databases with
restore commands that say CREATE DATABASE foo WITH TEMPLATE = template0.
This allows proper behavior when there is user-added data in template1.
initdb forced!
|
|
|
|
|
|
| |
Supported encodings are: EUC_JP, EUC_CN, EUC_KR, EUC_TW, Shift JIS,
Big5, ISO8859-[1-5].
TODO: testings! and documentations...
|
|
|
|
|
| |
currently ISO8859-[1-5] and EUC_JP are supported.
support for other encodings will be coming soon.
|
|
|
|
| |
to int so that they return the number of whcars.
|
|
|
|
|
| |
functions that take pass-by-value datatypes. Should be ready for
port testing ...
|
|
|
|
|
| |
and pg_server_to_client. Eliminate copy.c's restriction on the length
of a single attribute.
|
| |
|
| |
|