aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/varlena.c
Commit message (Collapse)AuthorAge
* Fix string_to_array() to correctly handle the case where there areTom Lane2006-10-07
| | | | | | | | | | | overlapping possible matches for the separator string, such as string_to_array('123xx456xxx789', 'xx'). Also, revise the logic of replace(), split_part(), and string_to_array() to avoid O(N^2) work from redundant searches and conversions to pg_wchar format when there are N matches to the separator string. Backpatched the full patch as far as 8.0. 7.4 also has the bug, but the code has diverged a lot, so I just went for a quick-and-dirty fix of the bug itself in that branch.
* Change the backend to reject strings containing invalidly-encoded multibyteTom Lane2006-05-21
| | | | | | | | | | | | | | | | | | | | characters in all cases. Formerly we mostly just threw warnings for invalid input, and failed to detect it at all if no encoding conversion was required. The tighter check is needed to defend against SQL-injection attacks as per CVE-2006-2313 (further details will be published after release). Embedded zero (null) bytes will be rejected as well. The checks are applied during input to the backend (receipt from client or COPY IN), so it no longer seems necessary to check in textin() and related routines; any string arriving at those functions will already have been validated. Conversion failure reporting (for characters with no equivalent in the destination encoding) has been cleaned up and made consistent while at it. Also, fix a few longstanding errors in little-used encoding conversion routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic, mic_to_euc_tw were all broken to varying extents. Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki for identifying the security issues.
* Adjust string comparison so that only bitwise-equal strings are consideredTom Lane2005-12-22
| | | | | | | | | | | | equal: if strcoll claims two strings are equal, check it with strcmp, and sort according to strcmp if not identical. This fixes inconsistent behavior under glibc's hu_HU locale, and probably under some other locales as well. Also, take advantage of the now-well-defined behavior to speed up texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise comparison and not bother with strcoll at all. NOTE: affected databases may need to REINDEX indexes on text columns to be sure they are self-consistent.
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
* Pgindent run for 8.0.Bruce Momjian2004-08-29
|
* Update copyright to 2004.Bruce Momjian2004-08-29
|
* Infrastructure for I/O of composite types: arrange for the I/O routinesTom Lane2004-06-06
| | | | | | | | | | of a composite type to get that type's OID as their second parameter, in place of typelem which is useless. The actual changes are mostly centralized in getTypeInputInfo and siblings, but I had to fix a few places that were fetching pg_type.typelem for themselves instead of using the lsyscache.c routines. Also, I renamed all the related variables from 'typelem' to 'typioparam' to discourage people from assuming that they necessarily contain array element types.
* Use the new List API function names throughout the backend, and disable theNeil Conway2004-05-30
| | | | | list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.
* Reimplement the linked list data structure used throughout the backend.Neil Conway2004-05-26
| | | | | | | | | | | | | | | | In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.
* Implement a solution to the 'Turkish locale downcases I incorrectly'Tom Lane2004-02-21
| | | | | | problem, per previous discussion. Make some additional changes to centralize the knowledge of just how identifier downcasing is done, in hopes of simplifying any future tweaking in this area.
* Micro-opt: replace calls likeNeil Conway2004-01-31
| | | | | | | appendStringInfo(buf, "%s", str); with appendStringInfoString(buf, str); as the latter form is slightly faster.
* Fix text_position to not scan past end of source string in multibyteTom Lane2004-01-31
| | | | | case, per report from Korea PostgreSQL Users' Group. Also do some cosmetic cleanup in nearby code.
* Make to_hex() behave portably on negative input values (treat them asTom Lane2003-12-19
| | | | unsigned integers). Per report from Jim Crate.
* Make PQescapeBytea and byteaout consistent with each other, andJoe Conway2003-11-30
| | | | | | octal escape all octets outside the range 0x20 to 0x7e. This fixes the problem pointed out by Sergey Yatskevich here: http://archives.postgresql.org/pgsql-bugs/2003-11/msg00140.php
* $Header: -> $PostgreSQL Changes ...PostgreSQL Daemon2003-11-29
|
* Message editing: remove gratuitous variations in message wording, standardizePeter Eisentraut2003-09-25
| | | | | terms, add some clarifications, fix some untranslatable attempts at dynamic message building.
* Remove --enable-recode feature, since it's been broken by IPv6 changes,Tom Lane2003-08-04
| | | | and seems to have too few users to justify maintaining.
* Update copyrights to 2003.Bruce Momjian2003-08-04
|
* pgindent run.Bruce Momjian2003-08-04
|
* Error message editing in utils/adt. Again thanks to Joe Conway for doingTom Lane2003-07-27
| | | | the bulk of the heavy lifting ...
* Create real array comparison functions (that use the element datatype'sTom Lane2003-06-27
| | | | | | | | | | | | | | | | comparison functions), replacing the highly bogus bitwise array_eq. Create a btree index opclass for ANYARRAY --- it is now possible to create indexes on array columns. Arrange to cache the results of catalog lookups across multiple array operations, instead of repeating the lookups on every call. Add string_to_array and array_to_string functions. Remove singleton_array, array_accum, array_assign, and array_subscript functions, since these were for proof-of-concept and not intended to become supported functions. Minor adjustments to behavior in some corner cases with empty or zero-dimensional arrays. Joe Conway (with some editorializing by Tom Lane).
* Back out array mega-patch.Bruce Momjian2003-06-25
| | | | Joe Conway
* Array mega-patch.Bruce Momjian2003-06-24
| | | | Joe Conway
* Indexing support for pattern matching operations via separate operatorPeter Eisentraut2003-05-15
| | | | class when lc_collate is not C.
* Binary send/receive routines for a few basic datatypes --- enough forTom Lane2003-05-09
| | | | testing purposes.
* Infrastructure for upgraded error reporting mechanism. elog.c isTom Lane2003-04-24
| | | | | | | rewritten and the protocol is changed, but most elog calls are still elog calls. Also, we need to contemplate mechanisms for controlling all this functionality --- eg, how much stuff should appear in the postmaster log? And what API should libpq expose for it?
* This patch fixes a bunch of spelling mistakes in comments throughout theTom Lane2003-03-10
| | | | | | PostgreSQL source code. Neil Conway
* Attached are two small patches to expose md5 as a user function -- includingBruce Momjian2002-12-06
| | | | | | | | documentation and regression test mods. It seemed small and unobtrusive enough to not require a specific proposal on the hackers list -- but if not, let me know and I'll make a pitch. Otherwise, if there are no objections please apply. Joe Conway
* Reduce need for palloc/pfree overhead in varstr_cmp() by using fixed-sizeTom Lane2002-11-17
| | | | buffers on stack for short strings.
* pgindent run.Bruce Momjian2002-09-04
|
* Remove all traces of multibyte and locale options. Clean up commentsPeter Eisentraut2002-09-03
| | | | referring to "multibyte" where it really means character encoding.
* Remove #ifdef MULTIBYTE per hackers list discussion.Tatsuo Ishii2002-08-29
|
* backend where a statically sized buffer is written to. Most of theseBruce Momjian2002-08-28
| | | | | | | | | | | should be pretty safe in practice, but it's probably better to be safe than sorry. I was actually looking for cases where NAMEDATALEN is assumed to be 32, but only found one. That's fixed too, as well as a few bits of code cleanup. Neil Conway
* Add:Bruce Momjian2002-08-22
| | | | | | | | | | | replace(string, from, to) -- replaces all occurrences of "from" in "string" to "to" split(string, fldsep, column) -- splits "string" on "fldsep" and returns "column" number piece to_hex(int32_num) & to_hex(int64_num) -- takes integer number and returns as hex string Joe Conway
* Add guard code to protect from buffer overruns on long date/time inputThomas G. Lockhart2002-08-04
| | | | | | | | | | | | | | strings. Should go back in and look at doing this a bit more elegantly and (hopefully) cheaper. Probably not too bad anyway, but it seems a shame to scan the strings twice: once for length for this buffer overrun protection, and once to parse the line. Remove use of pow() in date/time handling; was already gone from everything *but* the time data types. Define macros for handling typmod manipulation for date/time types. Should be more robust than all of that brute-force inline code. Rename macros for masking and typmod manipulation to put TIMESTAMP_ or INTERVAL_ in front of the macro name, to reduce the possibility of name space collisions.
* Update copyright to 2002.Bruce Momjian2002-06-20
|
* Implement types regprocedure, regoper, regoperator, regclass, regtypeTom Lane2002-04-25
| | | | | | | per pghackers discussion. Add some more typsanity tests, and clean up some problems exposed thereby (broken or missing array types for some built-in types). Also, clean up loose ends from unknownin/out patch.
* Here's a patch to add unknownin/unknownout support. I also poked aroundBruce Momjian2002-04-24
| | | | | | | | | | looking for places that assume UNKNOWN == TEXT. One of those was the "SET" type in pg_type.h, which was using textin/textout. This one I took care of in this patch. The other suspicious place was in string_to_dataum (which is defined in both selfuncs.c and indxpath.c). I wasn't too sure about those, so I left them be. Joe Conway
* Fix text_substr bug intrduced in 7.3 developmentTatsuo Ishii2002-04-15
| | | | | using Joe Conway's patches (submitted at pgsql-patches on 2002/04/08) + small fix.
* Locale support is on by default. The choice of locale is done in initdbPeter Eisentraut2002-04-03
| | | | and/or with GUC variables.
* Create a new GUC variable search_path to control the namespace searchTom Lane2002-04-01
| | | | | | | path. The default behavior if no per-user schemas are created is that all users share a 'public' namespace, thus providing behavior backwards compatible with 7.2 and earlier releases. Probably the semantics and default setting will need to be fine-tuned, but this is a start.
* Further cleanups for relations in schemas: teach nextval and otherTom Lane2002-03-30
| | | | | | sequence functions how to cope with qualified names. Same code is also used for int4notin, currtid_byrelname, pgstattuple. Also, move TOAST tables into special pg_toast namespace.
* I attach a version of my toast-slicing patch, against current CVSBruce Momjian2002-03-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (current as of a few hours ago.) This patch: 1. Adds PG_GETARG_xxx_P_SLICE() macros and associated support routines. 2. Adds routines in src/backend/access/tuptoaster.c for fetching only necessary chunks of a toasted value. (Modelled on latest changes to assume chunks are returned in order). 3. Amends text_substr and bytea_substr to use new methods. It now handles multibyte cases -and should still lead to a performance improvement in the multibyte case where the substring is near the beginning of the string. 4. Added new command: ALTER TABLE tabname ALTER COLUMN colname SET STORAGE {PLAIN | EXTERNAL | EXTENDED | MAIN} to parser and documented in alter-table.sgml. (NB I used ColId as the item type for the storage mode string, rather than a new production - I hope this makes sense!). All this does is sets attstorage for the specified column. 4. AlterTableAlterColumnStatistics is now AlterTableAlterColumnFlags and handles both statistics and storage (it uses the subtype code to distinguish). The previous version of my patch also re-arranged other code in backend/commands/command.c but I have dropped that from this patch.(I plan to return to it separately). 5. Documented new macros (and also the PG_GETARG_xxx_P_COPY macros) in xfunc.sgml. ref/alter_table.sgml also contains documentation for ALTER COLUMN SET STORAGE. John Gray
* Fix arg coerect match text type, per Tom.Bruce Momjian2001-11-19
|
* Make text octet_length() return non-compressed length to be consistentBruce Momjian2001-11-19
| | | | with other data types, per disucssion. Encoding issue still open.
* Grammatical and spelling fixes.Tom Lane2001-11-19
|
* Optimization for bpcharlen, textlen, varcharlen in case of single byteTatsuo Ishii2001-11-18
| | | | encodings.
* pgindent run on all C files. Java run to follow. initdb/regressionBruce Momjian2001-10-25
| | | | tests pass.
* > Here's a revised patch. Changes:Bruce Momjian2001-09-14
| | | | | | | | | | | | | | | | | | | | | | | | > > 1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape') > Note that I ended up leaving \0 as \000 so that there are no ambiguities > when decoding something like, for example, \0123. > > 2. Fixed bug in byteain which allowed input values which were not valid > octals (e.g. \789), to be parsed as if they were octals. > > Joe > Here's rev 2 of the bytea string support patch. Changes: 1. Added missing declaration for MatchBytea function 2. Added PQescapeBytea to fe-exec.c 3. Applies cleanly on cvs tip from this afternoon I'm hoping that someone can review/approve/apply this before beta starts, so I guess I'd vote (not that it counts for much) to delay beta a few days :-) Joe Conway
* Implement following item in TODO:Tatsuo Ishii2001-09-11
| | | | * Reject character sequences those are not valid in their charset