aboutsummaryrefslogtreecommitdiff
path: root/src/backend/commands/copy.c
Commit message (Collapse)AuthorAge
...
* Swap the order of testing for control characters and for column delimiter inTom Lane2007-12-27
| | | | | | | | CopyAttributeOutText(), so that control characters are converted to the C-style escape sequences even if they happen to be equal to the column delimiter (as is true by default for tab, for example). Oversight in my previous patch to restore pre-8.3 behavior of COPY OUT escaping. Per report from Tomas Szepe.
* Revert COPY OUT to follow the pre-8.3 handling of ASCII control characters,Tom Lane2007-12-03
| | | | | | | | | | namely that \r, \n, \t, \b, \f, \v are dumped as those two-character representations rather than a backslash and the literal control character. I had made it do the other to save some code, but this was ill-advised, because dump files in which these characters appear literally are prone to newline mangling. Fortunately, doing it the old way should only cost a few more lines of code, and not slow down the copy loop materially. Per bug #3795 from Lou Duchez.
* Avoid incrementing the CommandCounter when CommandCounterIncrement is calledTom Lane2007-11-30
| | | | | | | | | | | | | | | | | | | | but no database changes have been made since the last CommandCounterIncrement. This should result in a significant improvement in the number of "commands" that can typically be performed within a transaction before hitting the 2^32 CommandId size limit. In particular this buys back (and more) the possible adverse consequences of my previous patch to fix plan caching behavior. The implementation requires tracking whether the current CommandCounter value has been "used" to mark any tuples. CommandCounter values stored into snapshots are presumed not to be used for this purpose. This requires some small executor changes, since the executor used to conflate the curcid of the snapshot it was using with the command ID to mark output tuples with. Separating these concepts allows some small simplifications in executor APIs. Something for the TODO list: look into having CommandCounterIncrement not do AcceptInvalidationMessages. It seems fairly bogus to be doing it there, but exactly where to do it instead isn't clear, and I'm disinclined to mess with asynchronous behavior during late beta.
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Perform post-escaping encoding validity checks on SQL literals and COPY inputAndrew Dunstan2007-09-12
| | | | so that invalidly encoded data cannot enter the database by these means.
* Don't take ProcArrayLock while exiting a transaction that has no XID; there isTom Lane2007-09-07
| | | | | | | | | | no need for serialization against snapshot-taking because the xact doesn't affect anyone else's snapshot anyway. Per discussion. Also, move various info about the interlocking of transactions and snapshots out of code comments and into a hopefully-more-cohesive discussion in access/transam/README. Also, remove a couple of now-obsolete comments about having to force some WAL to be written to persuade RecordTransactionCommit to do its thing.
* Minor code cleanup: calling FreeFile() before ereport(ERROR) is notNeil Conway2007-06-20
| | | | | necessary, since files opened via AllocateFile() are closed automatically as part of error recovery.
* Marginal hacking to improve the speed of COPY OUT. I had found in a bit ofTom Lane2007-06-17
| | | | | | | | | | | | | profiling that CopyAttributeOutText was taking an unreasonable fraction of the backend run time (like 66%!) on the following trivial test case: $ time psql -c "copy (select repeat('xyzzy',50) from generate_series(1,10000000)) to stdout" regression >/dev/null The time is all being spent on scanning the string for characters to be escaped, which most of the time there aren't any of. Some tweaking to take as many tests as possible out of the inner loop reduced the runtime of this example by more than 10%. In a real-world case it wouldn't be as useful a speedup, but it still seems worth adding a few lines here.
* Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve theTom Lane2007-04-27
| | | | | | | | | | | | | | types of unspecified parameters when submitted via extended query protocol. This worked in 8.2 but I had broken it during plancache changes. DECLARE CURSOR is now treated almost exactly like a plain SELECT through parse analysis, rewrite, and planning; only just before sending to the executor do we divert it away to ProcessUtility. This requires a special-case check in a number of places, but practically all of them were already special-casing SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge the two by treating IntoClause as a form of utility statement? Not going to worry about that now, though.) That approach doesn't work for EXPLAIN, however, so for that I punted and used a klugy solution of running parse analysis an extra time if under extended query protocol.
* Update docs/error message for CSV quote/escape --- must be ASCII.Bruce Momjian2007-04-18
| | | | Backpatch doc change to 8.2.X.
* Update error message for COPY with a multi-byte delimiter.Bruce Momjian2007-04-18
|
* Expose more cursor-related functionality in SPI: specifically, allowTom Lane2007-04-16
| | | | | | | | | | | access to the planner's cursor-related planning options, and provide new FETCH/MOVE routines that allow access to the full power of those commands. Small refactoring of planner(), pg_plan_query(), and pg_plan_queries() APIs to make it convenient to pass the planning options down from SPI. This is the core-code portion of Pavel Stehule's patch for scrollable cursor support in plpgsql; I'll review and apply the plpgsql changes separately.
* Teach CLUSTER to skip writing WAL if not needed (ie, not using archiving)Tom Lane2007-03-29
| | | | | --- Simon. Also, code review and cleanup for the previous COPY-no-WAL patches --- Tom.
* First phase of plan-invalidation project: create a plan cache managementTom Lane2007-03-13
| | | | | | | | | | | | | | | | module and teach PREPARE and protocol-level prepared statements to use it. In service of this, rearrange utility-statement processing so that parse analysis does not assume table schemas can't change before execution for utility statements (necessary because we don't attempt to re-acquire locks for utility statements when reusing a stored plan). This requires some refactoring of the ProcessUtility API, but it ends up cleaner anyway, for instance we can get rid of the QueryContext global. Still to do: fix up SPI and related code to use the plan cache; I'm tempted to try to make SQL functions use it too. Also, there are at least some aspects of system state that we want to ensure remain the same during a replan as in the original processing; search_path certainly ought to behave that way for instance, and perhaps there are others.
* Add resetStringInfo(), which clears the content of a StringInfo, andNeil Conway2007-03-03
| | | | | | fixup various places in the tree that were clearing a StringInfo by hand. Making this function a part of the API simplifies client code slightly, and avoids needlessly peeking inside the StringInfo interface.
* Remove the Query structure from the executor's API. This allows us to stopTom Lane2007-02-20
| | | | | | | | | | | | | | | storing mostly-redundant Query trees in prepared statements, portals, etc. To replace Query, a new node type called PlannedStmt is inserted by the planner at the top of a completed plan tree; this carries just the fields of Query that are still needed at runtime. The statement lists kept in portals etc. now consist of intermixed PlannedStmt and bare utility-statement nodes --- no Query. This incidentally allows us to remove some fields from Query and Plan nodes that shouldn't have been there in the first place. Still to do: simplify the execution-time range table; at the moment the range table passed to the executor still contains Query trees for subqueries. initdb forced due to change of stored rules.
* Prevent WAL logging when COPY is done in the same transation thatBruce Momjian2007-01-25
| | | | | | created it. Simon Riggs
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Message style improvementsPeter Eisentraut2006-10-06
|
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Attibution addition: Add Karel Zak also for COPY SELECT.Bruce Momjian2006-08-31
|
* Correct attibution:Bruce Momjian2006-08-31
| | | | COPY SELECT work done by Zoltan Boszormenyi
* Extend COPY to support COPY (SELECT ...) TO ...Tom Lane2006-08-30
| | | | Bernd Helmle
* Remove 576 references of include files that were not needed.Bruce Momjian2006-07-14
|
* Allow include files to compile own their own.Bruce Momjian2006-07-13
| | | | | | | Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.
* Further hacking on performance of COPY OUT. It seems that fwrite()'sTom Lane2006-05-26
| | | | | | per-call overhead is quite significant, at least on Linux: whatever it's doing is more than just shoving the bytes into a buffer. Buffering the data so we can call fwrite() just once per row seems to be a win.
* Reduce per-character overhead in COPY OUT by combining calls toTom Lane2006-05-25
| | | | CopySendData.
* Change the backend to reject strings containing invalidly-encoded multibyteTom Lane2006-05-21
| | | | | | | | | | | | | | | | | | | | characters in all cases. Formerly we mostly just threw warnings for invalid input, and failed to detect it at all if no encoding conversion was required. The tighter check is needed to defend against SQL-injection attacks as per CVE-2006-2313 (further details will be published after release). Embedded zero (null) bytes will be rejected as well. The checks are applied during input to the backend (receipt from client or COPY IN), so it no longer seems necessary to check in textin() and related routines; any string arriving at those functions will already have been validated. Conversion failure reporting (for characters with no equivalent in the destination encoding) has been cleaned up and made consistent while at it. Also, fix a few longstanding errors in little-used encoding conversion routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic, mic_to_euc_tw were all broken to varying extents. Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki for identifying the security issues.
* Fix a bunch of problems with domains by making them use special input functionsTom Lane2006-04-05
| | | | | | | | | | | that apply the necessary domain constraint checks immediately. This fixes cases where domain constraints went unchecked for statement parameters, PL function local variables and results, etc. We can also eliminate existing special cases for domains in places that had gotten it right, eg COPY. Also, allow domains over domains (base of a domain is another domain type). This almost worked before, but was disallowed because the original patch hadn't gotten it quite right.
* Modify all callers of datatype input and receive functions so that if theseTom Lane2006-04-04
| | | | | | | | | | | | | | | functions are not strict, they will be called (passing a NULL first parameter) during any attempt to input a NULL value of their datatype. Currently, all our input functions are strict and so this commit does not change any behavior. However, this will make it possible to build domain input functions that centralize checking of domain constraints, thereby closing numerous holes in our domain support, as per previous discussion. While at it, I took the opportunity to introduce convenience functions InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O functions. This eliminates a lot of grotty-looking casts, but the main motivation is to make it easier to grep for these places if we ever need to touch them again.
* Add error location info to ResTarget parse nodes. Allows error cursor to be ↵Tom Lane2006-03-23
| | | | | | supplied for various mistakes involving INSERT and UPDATE target columns.
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Make the COPY command return a command tag that includes the number ofTom Lane2006-03-03
| | | | | rows copied. Backend side of Volkan Yazici's recent patch, with corrections and documentation.
* Prevent COPY from using newline or carriage return as delimiter or null.Bruce Momjian2006-02-03
| | | | | | Disallow backslash as the delimiter in non-CVS mode. David Fetter
* Add regression tests for CSV and \., and add automatic quoting of aBruce Momjian2005-12-28
| | | | | single column dump that has a \. value, so the load works properly. I also added documentation describing this issue.
* Our code had:Bruce Momjian2005-12-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if (c == '\\' && cstate->line_buf.len == 0) The problem with that is the because of the input and _output_ buffering, cstate->line_buf.len could be zero even if we are not on the first character of a line. In fact, for a typical line, it is zero for all characters on the line. The proper solution is to introduce a boolean, first_char_in_line, that we set as we enter the loop and clear once we process a character. I have restructured the line-reading code in copy.c by: o merging the CSV/non-CSV functions into a single function o used macros to centralize and clarify the buffering code o updated comments o renamed client_encoding_only to encoding_embeds_ascii o added a high-bit test to the encoding_embeds_ascii test for performance o in CSV mode, allow a backslash followed by a non-period to continue being processed as a data value There should be no performance impact from this patch because it is functionally equivalent. If you apply the patch you will see copy.c is much clearer in this area now and might suggest additional optimizations. I have also attached a 8.1-only patch to fix the CSV \. handling bug with no code restructuring.
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Rename the members of CommandDest enum so they don't collide with other uses ofAlvaro Herrera2005-11-03
| | | | | those names. (Debug and None were pretty bad names anyway.) I hope I catched all uses of the names in comments too.
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* COPY's test for read-only transaction was backward; it prohibited COPY TOTom Lane2005-10-03
| | | | where it should prohibit COPY FROM. Found by Alon Goldshuv.
* Clean up possibly-uninitialized-variable warnings reported by gcc 4.x.Tom Lane2005-09-24
|
* Suppress signed-vs-unsigned-char warnings.Tom Lane2005-09-24
|
* Fix unportable uses of <ctype.h> functions. Per Sergey Koposov.Tom Lane2005-09-01
|
* COPY performance improvements. Avoid calling CopyGetData for each inputTom Lane2005-08-06
| | | | | | | | | | character, tighten the inner loops of CopyReadLine and CopyReadAttribute, arrange to parse out all the attributes of a line in just one call instead of one CopyReadAttribute call per attribute, be smarter about which client encodings require slow pg_encoding_mblen() loops. Also, clean up the mishmash of static variables and overly-long parameter lists in favor of passing around a single CopyState struct containing all the state data. Original patch by Alon Goldshuv, reworked by Tom Lane.
* Change typreceive function API so that receive functions get the sameTom Lane2005-07-10
| | | | | | | optional arguments as text input functions, ie, typioparam OID and atttypmod. Make all the datatypes that use typmod enforce it the same way in typreceive as they do in typinput. This fixes a problem with failure to enforce length restrictions during COPY FROM BINARY.
* Replace pg_shadow and pg_group by new role-capable catalogs pg_authidTom Lane2005-06-28
| | | | | | | | and pg_auth_members. There are still many loose ends to finish in this patch (no documentation, no regression tests, no pg_dump support for instance). But I'm going to commit it now anyway so that Alvaro can make some progress on shared dependencies. The catalog changes should be pretty much done.
* Add support for \x hex escapes in COPY.Bruce Momjian2005-06-02
| | | | Sergey Ten
* Add COPY WITH CVS HEADER to allow a heading line as the first line inBruce Momjian2005-05-07
| | | | | | COPY. Andrew Dunstan
* For some reason access/tupmacs.h has been #including utils/memutils.h,Tom Lane2005-05-06
| | | | | | | which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.
* Convert some mulit-line comments in copy.c to single line, as appropriate.Bruce Momjian2005-05-06
|