path: root/src/backend/utils/mb/mbutils.c
Commit message | Author | Age
* Change the backend to reject strings containing invalidly-encoded multibyte characters in all cases. Formerly we mostly just threw warnings for invalid input, and failed to detect it at all if no encoding conversion was required. The tighter check is needed to defend against SQL-injection attacks as per CVE-2006-2313 (further details will be published after release). Embedded zero (null) bytes will be rejected as well.
  The checks are applied during input to the backend (receipt from client or COPY IN), so it no longer seems necessary to check in textin() and related routines; any string arriving at those functions will already have been validated. Conversion failure reporting (for characters with no equivalent in the destination encoding) has been cleaned up and made consistent while at it.
  Also, fix a few longstanding errors in little-used encoding conversion routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic, mic_to_euc_tw were all broken to varying extents.
  Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki for identifying the security issues.
  (Tom Lane, 2006-05-21)
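The input check described in this commit can be illustrated with a small sketch. It assumes UTF-8 as the encoding and performs only a structural check (lead byte, continuation bytes, no zero bytes); the names utf8_seq_len and sketch_verify_utf8 are hypothetical and this is not the backend's actual verification routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a UTF-8 lead byte, 0 if invalid. */
size_t
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if (lead >= 0xC2 && lead <= 0xDF)
        return 2;
    if (lead >= 0xE0 && lead <= 0xEF)
        return 3;
    if (lead >= 0xF0 && lead <= 0xF4)
        return 4;
    return 0;                   /* stray continuation byte or out-of-range lead */
}

/*
 * Return true only if buf[0..len) is structurally valid UTF-8 and contains
 * no embedded zero byte.  Simplified: it does not reject every overlong or
 * surrogate encoding.
 */
bool
sketch_verify_utf8(const unsigned char *buf, size_t len)
{
    size_t      i = 0;

    assert(buf != NULL || len == 0);
    while (i < len)
    {
        size_t      n;
        size_t      k;

        if (buf[i] == 0)
            return false;       /* embedded null byte */
        n = utf8_seq_len(buf[i]);
        if (n == 0 || n > len - i)
            return false;       /* bad lead byte or truncated sequence */
        for (k = 1; k < n; k++)
        {
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;   /* expected a continuation byte */
        }
        i += n;
    }
    return true;
}
```

With a check of this shape, a lone lead byte followed by a quote character (0xC3 0x27) is rejected at input time instead of being handed to later string processing.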
* mbutils was previously doing some allocations, including invoking fmgr_info(), in the TopMemoryContext. I couldn't see that the code actually leaked, but in general I think it's fragile to assume that pfree'ing an FmgrInfo along with its fn_extra field is enough to reclaim all the resources allocated by fmgr_info(). I changed the code to do its allocations in a new child context of TopMemoryContext, MbProcContext. When we want to release the allocations we can just reset the context, which is cleaner.
  (Neil Conway, 2006-01-12)
* Cosmetic code cleanup: fix a bunch of places that used "return (expr);" rather than "return expr;" -- the latter style is used in most of the tree. I kept the parentheses when they were necessary or useful because the return expression was complex.
  (Neil Conway, 2006-01-11)
* Remove a confusing pair of parentheses.
  (Neil Conway, 2006-01-11)
* Standard pgindent run for 8.1.
  (Bruce Momjian, 2005-10-15)
* Suppress signed-vs-unsigned-char warnings.
  (Tom Lane, 2005-09-24)
* Change typreceive function API so that receive functions get the same optional arguments as text input functions, ie, typioparam OID and atttypmod. Make all the datatypes that use typmod enforce it the same way in typreceive as they do in typinput. This fixes a problem with failure to enforce length restrictions during COPY FROM BINARY.
  (Tom Lane, 2005-07-10)
* Rename canonical encodings, per Peter:
    UNICODE => UTF8
    ALT => WIN866
    WIN => WIN1251
    TCVN => WIN1258
  The old codes continue to work.
  (Bruce Momjian, 2005-03-07)
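Keeping the old codes working after this rename amounts to an alias lookup. The following is a minimal sketch under assumptions: the table layout, the exact-case matching, and the name sketch_canonical_encoding_name are all hypothetical and are not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table built from the four renames listed in the commit. */
struct encoding_alias
{
    const char *old_name;
    const char *canonical;
};

static const struct encoding_alias sketch_aliases[] = {
    {"UNICODE", "UTF8"},
    {"ALT", "WIN866"},
    {"WIN", "WIN1251"},
    {"TCVN", "WIN1258"},
};

/*
 * Map an old encoding name to its canonical replacement.  Names that are
 * not old aliases are returned unchanged.  Matching is exact-case here,
 * which is a simplification.
 */
const char *
sketch_canonical_encoding_name(const char *name)
{
    size_t      i;

    assert(name != NULL);
    for (i = 0; i < sizeof(sketch_aliases) / sizeof(sketch_aliases[0]); i++)
    {
        if (strcmp(name, sketch_aliases[i].old_name) == 0)
            return sketch_aliases[i].canonical;
    }
    return name;
}
```

A caller would normalize a user-supplied name once, for example sketch_canonical_encoding_name("ALT"), and use only the canonical name from then on.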
* More minor cosmetic improvements:
  - remove another senseless "extern" keyword that was applied to a function definition
  - change a few more function signatures from "some_type foo()" to "some_type foo(void)"
  - rewrite another K&R style function definition
  - make the type of the "action" function pointer in the KeyWord struct in src/backend/utils/adt/formatting.c more precise
  (Neil Conway, 2004-10-13)
* Pgindent run for 8.0.
  (Bruce Momjian, 2004-08-29)
* Add PQmbdsplen() which returns the "display length" of a character. Some work is still needed:
  - UTF-8, MULE_INTERNAL always returns 1
  (Tatsuo Ishii, 2004-03-15)
* $Header: -> $PostgreSQL Changes ...
  (PostgreSQL Daemon, 2003-11-29)
* Message editing: remove gratuitous variations in message wording, standardize terms, add some clarifications, fix some untranslatable attempts at dynamic message building.
  (Peter Eisentraut, 2003-09-25)
* pgindent run.
  (Bruce Momjian, 2003-08-04)
* Error message editing in backend/utils (except /adt).
  (Tom Lane, 2003-07-25)
* Department of second thoughts: probably still need an IsTransactionState test in there...
  (Tom Lane, 2003-04-27)
* Clean up some problems in SetClientEncoding: failed to honor doit flag in all cases, leaked TopMemoryContext memory in others. Make the interaction between SetClientEncoding and InitializeClientEncoding cleaner and better documented. I suspect these changes should be back-patched into 7.3, but will wait on Tatsuo's verification.
  (Tom Lane, 2003-04-27)
* This patch fixes a bunch of spelling mistakes in comments throughout the PostgreSQL source code.
  Neil Conway
  (Tom Lane, 2003-03-10)
* Fix for GUC client_encoding variable not being handled correctly. See the following thread for more details.
    Subject: [HACKERS] client_encoding directive is ignored in postgresql.conf
    From: Tatsuo Ishii <t-ishii@sra.co.jp>
    Date: Wed, 29 Jan 2003 22:24:04 +0900 (JST)
  (Tatsuo Ishii, 2003-02-19)
* Guard against 0 length string encoding conversion case.
  (Tatsuo Ishii, 2002-11-26)
* Remove encoding lookups from grammar stage, push them back to places where it's safe to do database access. Along the way, fix core dump for 'DEFAULT' parameters to CREATE DATABASE. initdb forced due to change in pg_proc entry.
  (Tom Lane, 2002-11-02)
* pgindent run.
  (Bruce Momjian, 2002-09-04)
* Remove all traces of multibyte and locale options. Clean up comments referring to "multibyte" where it really means character encoding.
  (Peter Eisentraut, 2002-09-03)
* Remove #ifdef MULTIBYTE per hackers list discussion.
  (Tatsuo Ishii, 2002-08-29)
* Fix bug in pg_convert() per report from MaC.Yui. It called pfree() on the wrong pointer.
  (Tatsuo Ishii, 2002-08-19)
* Fix memory leak in SetClientEncoding().
  (Tatsuo Ishii, 2002-08-14)
* Load and keep conversion function info when SET CLIENT_ENCODING TO is executed to prevent database access while performing encoding conversion.
  (Tatsuo Ishii, 2002-08-08)
* Implement DROP CONVERSION
  Add regression test
  (Tatsuo Ishii, 2002-07-25)
* I have committed many support files for CREATE CONVERSION. Default conversion procs and conversions are added in initdb. Currently supported conversions are:
    UTF-8(UNICODE) <--> SQL_ASCII, ISO-8859-1 to 16, EUC_JP, EUC_KR, EUC_CN, EUC_TW, SJIS, BIG5, GBK, GB18030, UHC, JOHAB, TCVN
    EUC_JP <--> SJIS
    EUC_TW <--> BIG5
    MULE_INTERNAL <--> EUC_JP, SJIS, EUC_TW, BIG5
  Note that the initial contents of the pg_conversion system catalog are created in the initdb process, so running initdb is ideal. It's possible to add them to your databases by hand, however. To accomplish this:
    psql -f your_postgresql_install_path/share/conversion_create.sql your_database
  So I did not bump up the version in catversion.h.
  TODO:
    Add more conversion procs
    Add [CASCADE|RESTRICT] to DROP CONVERSION
    Add tuples to pg_depend
    Add regression tests
    Write docs
    Add SQL99 CONVERT command?
  -- Tatsuo Ishii
  (Tatsuo Ishii, 2002-07-18)
* Simplify pg_convert() in that it calls pg_convert2 using new fmgr interface.
  (Tatsuo Ishii, 2001-11-20)
* Fix nasty bugs in pg_convert() and pg_convert2().
    o they sometimes returned a result with a garbage string appended.
    o they did not work if the client encoding is different from the server encoding
  (Tatsuo Ishii, 2001-11-19)
* pgindent run on all C files. Java run to follow. initdb/regression tests pass.
  (Bruce Momjian, 2001-10-25)
* Add a new function "pg_client_encoding" which returns the current client side encoding name. This is necessary for client APIs such as JDBC to perform correct encoding conversions. See my email "[HACKERS] pg_client_encoding" 10 Sep 2001.
  (Tatsuo Ishii, 2001-10-12)
* Fix type_maximum_size() to give the right answer in MULTIBYTE cases. Avoid use of prototype-less function pointers in MB code.
  (Tom Lane, 2001-09-21)
* Back out Karel's patch
  (Tatsuo Ishii, 2001-09-09)
* > > A simple and robust solution is in the begin of mbutils.c set default
  > > ClientEncoding to SQL_ASCII (like default DatabaseEncoding). Bruce, can
  > > you change it? It's one line change. Again thanks.
  Forget it! A default client encoding must be set by actual database encoding... Please apply the small attached patch that solves it better.
  Karel Zak
  (Bruce Momjian, 2001-09-08)
* Commit Karel's patch.
  -------------------------------------------------------------------
  Subject: Re: [PATCHES] encoding names
  From: Karel Zak <zakkr@zf.jcu.cz>
  To: Peter Eisentraut <peter_e@gmx.net>
  Cc: pgsql-patches <pgsql-patches@postgresql.org>
  Date: Fri, 31 Aug 2001 17:24:38 +0200
  On Thu, Aug 30, 2001 at 01:30:40AM +0200, Peter Eisentraut wrote:
  > > - convert encoding 'name' to 'id'
  > I thought we decided not to add functions returning "new" names until we
  > know exactly what the new names should be, and pending schema
  Ok, the patch not to add functions.
  > better
  > ...(): encoding name too long
  Fixed.
  I found new bug in command/variable.c in parse_client_encoding(), nobody probably never see this error:
      if (pg_set_client_encoding(encoding))
      {
          elog(ERROR, "Conversion between %s and %s is not supported",
               value, GetDatabaseEncodingName());
      }
  because pg_set_client_encoding() returns -1 for error and 0 as true. It's fixed too.
  IMHO it can be apply.
  Karel
  PS:
    * following files are renamed:
        src/utils/mb/Unicode/KOI8_to_utf8.map --> src/utils/mb/Unicode/koi8r_to_utf8.map
        src/utils/mb/Unicode/WIN_to_utf8.map --> src/utils/mb/Unicode/win1251_to_utf8.map
        src/utils/mb/Unicode/utf8_to_KOI8.map --> src/utils/mb/Unicode/utf8_to_koi8r.map
        src/utils/mb/Unicode/utf8_to_WIN.map --> src/utils/mb/Unicode/utf8_to_win1251.map
    * new file: src/utils/mb/encname.c
    * removed file: src/utils/mb/common.c
  --
  Karel Zak <zakkr@zf.jcu.cz>
  http://home.zf.jcu.cz/~zakkr/
  C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
  (Tatsuo Ishii, 2001-09-06)
* Add convert/convert2 functions. They are similar to SQL99's convert.
  (Tatsuo Ishii, 2001-08-15)
* TODO item:
    * Make n of CHAR(n)/VARCHAR(n) the number of letters, not bytes
  (Tatsuo Ishii, 2001-07-15)
* getdatabaseencoding() and PG_encoding_to_char() were being sloppy about converting char* strings to type 'name'. Imagine my surprise when 7.1 release coredumped upon start when compiled --enable-multibyte ...
  (Tom Lane, 2001-04-16)
* Modify wchar conversion routines to not fetch the next byte past the end of a counted input string. Marinos Yannikos' recent crash report turns out to be due to applying pg_ascii2wchar_with_len to a TEXT object that is smack up against the end of memory. This is the second just-barely-reproducible bug report I have seen that traces to some bit of code fetching one more byte than it is allowed to. Let's be more careful out there, boys and girls.
  While at it, I changed the code to not risk a similar crash when there is a truncated multibyte character at the end of an input string. The output in this case might not be the most reasonable output possible; if anyone wants to improve it further, step right up...
  (Tom Lane, 2001-03-08)
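The over-read described in this commit comes down to the order of the loop tests on a counted string. The sketch below shows the safe ordering; sketch_ascii2wchar_with_len is a hypothetical stand-in modeled on the routine named in the message, not the actual source, and the assumption that the original bug was a dereference before the length test is mine.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy at most len bytes of a counted ASCII string into a wide-character
 * buffer and return the number of characters written.  The caller must
 * provide room for len + 1 output elements.
 */
int
sketch_ascii2wchar_with_len(const unsigned char *from, unsigned int *to, int len)
{
    int         cnt = 0;

    assert(from != NULL && to != NULL);

    /*
     * Test len before dereferencing from: once len reaches 0, *from is
     * never evaluated, so the byte just past the input is not touched.
     * Writing the tests the other way around (*from && len > 0) would
     * read one byte beyond the counted string.
     */
    while (len > 0 && *from)
    {
        *to++ = *from++;
        len--;
        cnt++;
    }
    *to = 0;                    /* terminate the output */
    return cnt;
}
```

The same ordering rule applies to multibyte routines, which additionally need to confirm that enough bytes remain before reading the trailing bytes of a character.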
* Restructure the key include files per recent pghackers discussion: there are now separate files "postgres.h" and "postgres_fe.h", which are meant to be the primary include files for backend .c files and frontend .c files respectively. By default, only include files meant for frontend use are installed into the installation include directory. There is a new make target 'make install-all-headers' that adds the whole content of the src/include tree to the installed fileset, for use by people who want to develop server-side code without keeping the complete source tree on hand. Cleaned up a whole lot of crufty and inconsistent header inclusions.
  (Tom Lane, 2001-02-10)
* Extend CREATE DATABASE to allow selection of a template database to be cloned, rather than always cloning template1. Modify initdb to generate two identical databases rather than one, template0 and template1. Connections to template0 are disallowed, so that it will always remain in its virgin as-initdb'd state. pg_dumpall now dumps databases with restore commands that say CREATE DATABASE foo WITH TEMPLATE = template0. This allows proper behavior when there is user-added data in template1. initdb forced!
  (Tom Lane, 2000-11-14)
* Add support for code conversion between Unicode and other encodings. Supported encodings are: EUC_JP, EUC_CN, EUC_KR, EUC_TW, Shift JIS, Big5, ISO8859-[1-5]. TODO: testing and documentation...
  (Tatsuo Ishii, 2000-10-30)
* Support for conversion between UNICODE and other encodings. Currently ISO8859-[1-5] and EUC_JP are supported. Support for other encodings will be coming soon.
  (Tatsuo Ishii, 2000-10-12)
* Change pg_mblen and pg_encoding_mblen return types from void to int so that they return the number of wchars.
  (Tatsuo Ishii, 2000-08-27)
* Another batch of fmgr updates. I think I have gotten all old-style functions that take pass-by-value datatypes. Should be ready for port testing ...
  (Tom Lane, 2000-06-13)
* Eliminate query length limitation imposed by pg_client_to_server and pg_server_to_client. Eliminate copy.c's restriction on the length of a single attribute.
  (Tom Lane, 1999-09-11)
* Move some system includes into c.h, and remove duplicates.
  (Bruce Momjian, 1999-07-17)
* Fix for multi-byte includes.
  (Bruce Momjian, 1999-07-17)