aboutsummaryrefslogtreecommitdiff
path: root/src/backend/commands/copy.c
Commit message (Collapse)AuthorAge
* Use PRI?64 instead of "ll?" in format strings (continued).Peter Eisentraut2025-03-29
| | | | | | | Continuation of work started in commit 15a79c73, after initial trial. Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Introduce CompactAttribute array in TupleDesc, take 2David Rowley2024-12-20
| | | | | | | | | | | | | | | | | | | | | | | | The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
* file_fdw: Add REJECT_LIMIT option to file_fdw.Fujii Masao2024-11-20
| | | | | | | | | | | | | | | | | | | | Commit 4ac2a9bece introduced the REJECT_LIMIT option for the COPY command. This commit extends the support for this option to file_fdw. As well as REJECT_LIMIT option for COPY, this option limits the maximum number of erroneous rows that can be skipped. If the number of data type conversion errors exceeds this limit, accessing the file_fdw foreign table will fail with an error, even when on_error = 'ignore' is specified. Since the CREATE/ALTER FOREIGN TABLE commands require foreign table options to be single-quoted, this commit updates defGetCopyRejectLimitOption() to handle also string value for them, in addition to int64 value for COPY command option. Author: Atsushi Torikoshi Reviewed-by: Fujii Masao, Yugo Nagata, Kirill Reshke Discussion: https://postgr.es/m/bab68a9fc502b12693f0755b6f35f327@oss.nttdata.com
* Fix validation of COPY FORCE_NOT_NULL/FORCE_NULL for the all-column casesMichael Paquier2024-10-17
| | | | | | | | | | | | | | | | | | This commit adds missing checks for COPY FORCE_NOT_NULL and FORCE_NULL when applied to all columns via "*". These options now correctly require CSV mode and are disallowed in COPY TO, making their behavior consistent with FORCE_QUOTE. Some regression tests are added to verify the correct behavior for the all-columns case, including FORCE_QUOTE, which was not tested. Backpatch down to 17, where support for the all-column grammar with FORCE_NOT_NULL and FORCE_NULL has been added. Author: Joel Jacobson Reviewed-by: Zhang Mingli Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com Backpatch-through: 17
* Move check for binary mode and on_error option to the appropriate location.Fujii Masao2024-10-08
| | | | | | | | | | | | | | | Commit 9e2d870119 placed the check for binary mode and on_error before default values were inserted, which was not ideal. This commit moves the check to a more appropriate position after default values are set. Additionally, the comment incorrectly mentioned two checks before inserting defaults, when there are actually three. This commit corrects that comment. Author: Atsushi Torikoshi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/8830518a-28ac-43a2-8a11-1676d9a3cdf8@oss.nttdata.com
* Add REJECT_LIMIT option to the COPY command.Fujii Masao2024-10-08
| | | | | | | | | | | | | | | | Previously, when ON_ERROR was set to 'ignore', the COPY command would skip all rows with data type conversion errors, with no way to limit the number of skipped rows before failing. This commit introduces the REJECT_LIMIT option, allowing users to specify the maximum number of erroneous rows that can be skipped. If more rows encounter data type conversion errors than allowed by REJECT_LIMIT, the COPY command will fail with an error, even when ON_ERROR = 'ignore'. Author: Atsushi Torikoshi Reviewed-by: Junwang Zhao, Kirill Reshke, jian he, Fujii Masao Discussion: https://postgr.es/m/63f99327aa6b404cc951217fa3e61fe4@oss.nttdata.com
* Add log_verbosity = 'silent' support to COPY command.Fujii Masao2024-10-03
| | | | | | | | | | | | | | | | | | | | | Previously, when the on_error option was set to ignore, the COPY command would always log NOTICE messages for input rows discarded due to data type incompatibility. Users had no way to suppress these messages. This commit introduces a new log_verbosity setting, 'silent', which prevents the COPY command from emitting NOTICE messages when on_error = 'ignore' is used, even if rows are discarded. This feature is particularly useful when processing malformed files frequently, where a flood of NOTICE messages can be undesirable. For example, when frequently loading malformed files via the COPY command or querying foreign tables using file_fdw (with an upcoming patch to add on_error support for file_fdw), users may prefer to suppress these messages to reduce log noise and improve clarity. Author: Atsushi Torikoshi Reviewed-by: Masahiko Sawada, Fujii Masao Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
* Refactor error messages to reduce duplicationAlvaro Herrera2024-08-08
| | | | | | | | | | | | | | | I also took the liberty of changing errmsg("COPY DEFAULT only available using COPY FROM") to errmsg("COPY %s cannot be used with %s", "DEFAULT", "COPY TO") because the original wording is unlike all other messages that indicate option incompatibility. This message was added by commit 9f8377f7a279 (16-era), in whose development thread there was no discussion on this point. Backpatch to 17.
* Disallow specifying ON_ERROR option without value.Masahiko Sawada2024-04-17
| | | | | | | | | | | | | | The ON_ERROR option of the COPY command previously allowed omitting its value, which was inconsistent with the syntax synopsis in the documentation and the behavior of other non-boolean COPY options. This change enforces providing a value for the ON_ERROR option, ensuring consistency across other non-boolean options and aligning with the documented syntax. Author: Atsushi Torikoshi Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/a9770bf57646d90dedc3d54cf32634b2%40oss.nttdata.com
* Add new COPY option LOG_VERBOSITY.Masahiko Sawada2024-04-01
| | | | | | | | | | | | | | | | This commit adds a new COPY option LOG_VERBOSITY, which controls the amount of messages emitted during processing. Valid values are 'default' and 'verbose'. This is currently used in COPY FROM when ON_ERROR option is set to ignore. If 'verbose' is specified, a NOTICE message is emitted for each discarded row, providing additional information such as line number, column name, and the malformed value. This helps users to identify problematic rows that failed to load. Author: Bharath Rupireddy Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
* Add RETURNING support to MERGE.Dean Rasheed2024-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | This allows a RETURNING clause to be appended to a MERGE query, to return values based on each row inserted, updated, or deleted. As with plain INSERT, UPDATE, and DELETE commands, the returned values are based on the new contents of the target table for INSERT and UPDATE actions, and on its old contents for DELETE actions. Values from the source relation may also be returned. As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be used as the source relation for other operations such as WITH queries and COPY commands. Additionally, a special function merge_action() is provided, which returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action executed for each row. The merge_action() function can be used anywhere in the RETURNING list, including in arbitrary expressions and subqueries, but it is an error to use it anywhere outside of a MERGE query's RETURNING list. Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera, Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut, and Wolfgang Walther. Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
* Remove unused #include's from backend .c filesPeter Eisentraut2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
* Rename COPY option from SAVE_ERROR_TO to ON_ERRORAlexander Korotkov2024-01-19
| | | | | | | | | The option names now are "stop" (default) and "ignore". The future options could be "file 'filename.log'" and "table 'tablename'". Discussion: https://postgr.es/m/20240117.164859.2242646601795501168.horikyota.ntt%40gmail.com Author: Jian He Reviewed-by: Atsushi Torikoshi
* Add new COPY option SAVE_ERROR_TOAlexander Korotkov2024-01-16
| | | | | | | | | | | | | | | | | | | | | Currently, when source data contains unexpected data regarding data type or range, the entire COPY fails. However, in some cases, such data can be ignored and just copying normal data is preferable. This commit adds a new option SAVE_ERROR_TO, which specifies where to save the error information. When this option is specified, COPY skips soft errors and continues copying. Currently, SAVE_ERROR_TO only supports "none". This indicates error information is not saved and COPY just skips the unexpected data and continues running. Later works are expected to add more choices, such as 'log' and 'table'. Author: Damir Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He Discussion: https://postgr.es/m/87k31ftoe0.fsf_-_%40commandprompt.com Reviewed-by: Pavel Stehule, Andres Freund, Tom Lane, Daniel Gustafsson, Reviewed-by: Alena Rybakina, Andy Fan, Andrei Lepikhov, Masahiko Sawada Reviewed-by: Vignesh C, Atsushi Torikoshi
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Add error about the use of FREEZE in COPY TOBruce Momjian2023-11-13
| | | | | | | | | | Also clarify some other error wording. Reported-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20220802.133046.1941977979333284049.horikyota.ntt@gmail.com Backpatch-through: master
* Provide FORCE_NULL * and FORCE_NOT_NULL * options for COPY FROMAndrew Dunstan2023-09-30
| | | | | | | | | | | | These options already exist, but you need to specify a column list for them, which can be cumbersome. We already have the possibility of all columns for FORCE QUOTE, so this is simply extending that facility to FORCE_NULL and FORCE_NOT_NULL. Author: Zhang Mingli Reviewed-By: Richard Guo, Kyatoro Horiguchi, Michael Paquier. Discussion: https://postgr.es/m/CACJufxEnVqzOFtqhexF2+AwOKFrV8zHOY3y=p+gPK6eB14pn_w@mail.gmail.com
* Improve several permission-related error messages.Peter Eisentraut2023-03-17
| | | | | | | | | Mainly move some detail from errmsg to errdetail, remove explicit mention of superuser where appropriate, since that is implied in most permission checks, and make messages more uniform. Author: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/20230316234701.GA903298@nathanxps13
* Add a DEFAULT option to COPY FROMAndrew Dunstan2023-03-13
| | | | | | | | | | | | | | This allows for a string which if an input field matches causes the column's default value to be inserted. The advantage of this is that the default can be inserted in some rows and not others, for which non-default data is available. The file_fdw extension is also modified to take allow use of this option. Israel Barth Rubio Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
* Ensure COPY TO on an RLS-enabled table copies no more than it should.Tom Lane2023-03-10
| | | | | | | | | | | | | | The COPY documentation is quite clear that "COPY relation TO" copies rows from only the named table, not any inheritance children it may have. However, if you enabled row-level security on the table then this stopped being true, because the code forgot to apply the ONLY modifier in the "SELECT ... FROM relation" query that it constructs in order to allow RLS predicates to be attached. Fix that. Report and patch by Antonin Houska (comment adjustments and test case by me). Back-patch to all supported branches. Discussion: https://postgr.es/m/3472.1675251957@antos
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Rework query relation permission checkingAlvaro Herrera2022-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
* Add support for COPY TO callback functionsMichael Paquier2022-10-11
| | | | | | | | | | | | | | | | | This is useful as a way for extensions to process COPY TO rows in the way they see fit (say auditing, analytics, backend, etc.) without the need to invoke an external process running as the OS user running the backend through PROGRAM that requires superuser rights. COPY FROM already provides a similar callback for logical replication. For COPY TO, the callback is triggered when we are ready to send a row in CopySendEndOfRow(), which is the same code path as when sending a row to a frontend or a pipe/file. A small test module, test_copy_callbacks, is added to provide some coverage for this facility. Author: Bilva Sanaba, Nathan Bossart Discussion: https://postgr.es/m/253C21D1-FCEB-41D9-A2AF-E6517015B7D7@amazon.com
* Reject MERGE in CTEs and COPYAlvaro Herrera2022-08-12
| | | | | | | | | | | | The grammar added for MERGE inadvertently made it accepted syntax in places that were not prepared to deal with it -- namely COPY and inside CTEs, but invoking these things with MERGE currently causes assertion failures or weird misbehavior in non-assertion builds. Protect those places by checking for it explicitly until somebody decides to implement it. Reported-by: Alexey Borzov <borz_off@cs.msu.su> Discussion: https://postgr.es/m/17579-82482cd7b267b862@postgresql.org
* Improve two comments related to a boolean DefElem's valueMichael Paquier2022-07-11
| | | | | | | | | | | | The original comments mentioned a "parameter" as something not defined in a fast-exit path to assume a true status. This is rather confusing as the parameter DefElem is defined, and the intention is to check if its value is defined. This improves both comments to mention the value assigned to the DefElem's value instead, so as future patches are able to catch the tweak if this code pattern gets copied around more. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv0yWynWTmp4o34s0d98xVubys9fy=p0YXsZ5_sUcNnMw@mail.gmail.com
* Fix two issues with HEADER MATCH in COPYMichael Paquier2022-06-23
| | | | | | | | | | | | | | | | | | | | 072132f0 used the attnum offset to access the raw_fields array when checking that the attribute names of the header and of the relation match, leading to incorrect results or even crashes if the attribute numbers of a relation are changed, like on a dropped attribute. This fixes the logic to use the correct attribute names for the header matching requirements. Also, this commit disallows HEADER MATCH in COPY TO as there is no validation that can be done in this case. The tests are expanded for HEADER MATCH with COPY FROM and dropped columns, with cases where a relation has a dropped and re-added column, as well as a reduced set of columns. Author: Julien Rouhaud Reviewed-by: Peter Eisentraut, Michael Paquier Discussion: https://postgr.es/m/20220607154744.vvmitnqhyxrne5ms@jrouhaud
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* Add header matching mode to COPY FROMPeter Eisentraut2022-03-30
| | | | | | | | | | | | | | | | COPY FROM supports the HEADER option to silently discard the header line from a CSV or text file. It is possible to load by mistake a file that matches the expected format, for example, if two text columns have been swapped, resulting in garbage in the database. This adds a new option value HEADER MATCH that checks the column names in the header line against the actual column names and errors out if they do not match. Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr> Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
* Use has_privs_for_roles for predefined role checksJoe Conway2022-03-28
| | | | | | | | | | | | | | Generally if a role is granted membership to another role with NOINHERIT they must use SET ROLE to access the privileges of that role, however with predefined roles the membership and privilege is conflated. Fix that by replacing is_member_of_role with has_privs_for_role for predefined roles. Patch does not remove is_member_of_role from acl.h, but it does add a warning not to use that function for privilege checking. Not backpatched based on hackers list discussion. Author: Joshua Brindle Reviewed-by: Stephen Frost, Nathan Bossart, Joe Conway Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com
* Add HEADER support to COPY text formatPeter Eisentraut2022-01-28
| | | | | | | | | | The COPY CSV format supports the HEADER option to output a header line. This patch adds the same option to the default text format. On input, the HEADER option causes the first line to be skipped, same as with CSV. Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr> Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Remove Value node structPeter Eisentraut2021-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Value node struct is a weird construct. It is its own node type, but most of the time, it actually has a node type of Integer, Float, String, or BitString. As a consequence, the struct name and the node type don't match most of the time, and so it has to be treated specially a lot. There doesn't seem to be any value in the special construct. There is very little code that wants to accept all Value variants but nothing else (and even if it did, this doesn't provide any convenient way to check it), and most code wants either just one particular node type (usually String), or it accepts a broader set of node types besides just Value. This change removes the Value struct and node type and replaces them by separate Integer, Float, String, and BitString node types that are proper node types and structs of their own and behave mostly like normal node types. Also, this removes the T_Null node tag, which was previously also a possible variant of Value but wasn't actually used outside of the Value contained in A_Const. Replace that by an isnull field in A_Const. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
* Improve reporting of "conflicting or redundant options" errors.Dean Rasheed2021-07-15
| | | | | | | | | | | | | | | | | | | | | When reporting "conflicting or redundant options" errors, try to ensure that errposition() is used, to help the user identify the offending option. Formerly, errposition() was invoked in less than 60% of cases. This patch raises that to over 90%, but there remain a few places where the ParseState is not readily available. Using errdetail() might improve the error in such cases, but that is left as a task for the future. Additionally, since this error is thrown from over 100 places in the codebase, introduce a dedicated function to throw it, reducing code duplication. Extracted from a slightly larger patch by Vignesh C. Reviewed by Bharath Rupireddy, Alvaro Herrera, Dilip Kumar, Hou Zhijie, Peter Smith, Daniel Gustafsson, Julien Rouhaud and me. Discussion: https://postgr.es/m/CALDaNm33FFSS5tVyvmkoK2cCMuDVxcui=gFrjti9ROfynqSAGA@mail.gmail.com
* Rename Default Roles to Predefined RolesStephen Frost2021-04-01
| | | | | | | | | | | | | The term 'default roles' wasn't quite apt as these roles aren't able to be modified or removed after installation, so rename them to be 'Predefined Roles' instead, adding an entry into the newly added Obsolete Appendix to help users of current releases find the new documentation. Bruce Momjian and Stephen Frost Discussion: https://postgr.es/m/157742545062.1149.11052653770497832538%40wrigleys.postgresql.org and https://www.postgresql.org/message-id/20201120211304.GG16415@tamriel.snowman.net
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Split copy.c into four files.Heikki Linnakangas2020-11-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Copy.c has grown really large. Split it into more manageable parts: - copy.c now contains only a few functions that are common to COPY FROM and COPY TO. - copyto.c contains code for COPY TO. - copyfrom.c contains code for initializing COPY FROM, and inserting the tuples to the correct table. - copyfromparse.c contains code for reading from the client/file/program, and parsing the input text/CSV/binary format into tuples. All of these parts are fairly complicated, and fairly independent of each other. There is a patch being discussed to implement parallel COPY FROM, which will add a lot of new code to the COPY FROM path, and another patch which would allow INSERTs to use the same multi-insert machinery as COPY FROM, both of which will require refactoring that code. With those two patches, there's going to be a lot of code churn in copy.c anyway, so now seems like a good time to do this refactoring. The CopyStateData struct is also split. All the formatting options, like FORMAT, QUOTE, ESCAPE, are put in a new CopyFormatOption struct, which is used by both COPY FROM and TO. Other state data are kept in separate CopyFromStateData and CopyToStateData structs. Reviewed-by: Soumyadeep Chakraborty, Erik Rijkers, Vignesh C, Andres Freund Discussion: https://www.postgresql.org/message-id/8e15b560-f387-7acc-ac90-763986617bfb%40iki.fi
* Fix some grammar and typos in comments and docsMichael Paquier2020-11-02
| | | | | | | | The documentation fixes are backpatched down to where they apply. Author: Justin Pryzby Discussion: https://postgr.es/m/20201031020801.GD3080@telsasoft.com Backpatch-through: 9.6
* Remove PartitionRoutingInfo struct.Heikki Linnakangas2020-10-19
| | | | | | | | | | The extra indirection neeeded to access its members via its enclosing ResultRelInfo seems pointless. Move all the fields from PartitionRoutingInfo to ResultRelInfo. Author: Amit Langote Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com
* Revise child-to-root tuple conversion map management.Heikki Linnakangas2020-10-19
| | | | | | | | | | | | | | | | | | | | | | | Store the tuple conversion map to convert a tuple from a child table's format to the root format in a new ri_ChildToRootMap field in ResultRelInfo. It is initialized if transition tuple capture for FOR STATEMENT triggers or INSERT tuple routing on a partitioned table is needed. Previously, ModifyTable kept the maps in the per-subplan ModifyTableState->mt_per_subplan_tupconv_maps array, or when tuple routing was used, in ResultRelInfo->ri_Partitioninfo->pi_PartitionToRootMap. The new field replaces both of those. Now that the child-to-root tuple conversion map is always available in ResultRelInfo (when needed), remove the TransitionCaptureState.tcs_map field. The callers of Exec*Trigger() functions no longer need to set or save it, which is much less confusing and bug-prone. Also, as a future optimization, this will allow us to delay creating the map for a given result relation until the relation is actually processed during execution. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqHtCWLdK-LO%3DNEsvOdHx%2B7yv4mE_zYK0i3BH7dXb-wxog%40mail.gmail.com
* Remove es_result_relation_info from EState.Heikki Linnakangas2020-10-14
| | | | | | | | | | | | | | Maintaining 'es_result_relation_info' correctly at all times has become cumbersome, especially with partitioning where each partition gets its own result relation info. Having to set and reset it across arbitrary operations has caused bugs in the past. This changes all the places that used 'es_result_relation_info', to receive the currently active ResultRelInfo via function parameters instead. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com
* Create ResultRelInfos later in InitPlan, index them by RT index.Heikki Linnakangas2020-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of allocating all the ResultRelInfos upfront in one big array, allocate them in ExecInitModifyTable(). es_result_relations is now an array of ResultRelInfo pointers, rather than an array of structs, and it is indexed by the RT index. This simplifies things: we get rid of the separate concept of a "result rel index", and don't need to set it in setrefs.c anymore. This also allows follow-up optimizations (not included in this commit yet) to skip initializing ResultRelInfos for target relations that were not needed at runtime, and removal of the es_result_relation_info pointer. The EState arrays of regular result rels and root result rels are merged into one array. Similarly, the resultRelations and rootResultRelations lists in PlannedStmt are merged into one. It's not actually clear to me why they were kept separate in the first place, but now that the es_result_relations array is indexed by RT index, it certainly seems pointless. The PlannedStmt->resultRelations list is now only needed for ExecRelationIsTargetRelation(). One visible effect of this change is that ExecRelationIsTargetRelation() will now return 'true' also for the partition root, if a partitioned table is updated. That seems like a good thing, although the function isn't used in core code, and I don't see any reason for an FDW to call it on a partition root. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com
* Fix handling of redundant options with COPY for "freeze" and "header"Michael Paquier2020-10-05
| | | | | | | | | | | | | | | | | | | | | | The handling of those options was inconsistent, as the processing used directly the value assigned to the option to check if it was redundant, leading to patterns like this one to succeed (note that false is specified first): COPY hoge to '/path/to/file/' (header off, header on); And the opposite would fail correctly (note that true is first here): COPY hoge to '/path/to/file/' (header on, header off); While on it, add some tests to check for all redundant patterns with the options of COPY. I have gone through the code and did not notice similar mistakes for other commands. "header" got it wrong since b63990c, and "freeze" was wrong from the start as of 8de72b6. No backpatch is done per the lack of complaints. Reported-by: Rémi Lapeyre Discussion: https://postgr.es/m/20200929072433.GA15570@paquier.xyz Discussion: https://postgr.es/m/0B55BD07-83E4-439F-AACC-FA2D7CF50532@lenstra.fr
* Don't fetch partition check expression during InitResultRelInfo.Tom Lane2020-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since there is only one place that actually needs the partition check expression, namely ExecPartitionCheck, it's better to fetch it from the relcache there. In this way we will never fetch it at all if the query never has use for it, and we still fetch it just once when we do need it. The reason for taking an interest in this is that if the relcache doesn't already have the check expression cached, fetching it requires obtaining AccessShareLock on the partition root. That means that operations that look like they should only touch the partition itself will also take a lock on the root. In particular we observed that TRUNCATE on a partition may take a lock on the partition's root, contributing to a deadlock situation in parallel pg_restore. As written, this patch does have a small cost, which is that we are microscopically reducing efficiency for the case where a partition has an empty check expression. ExecPartitionCheck will be called, and will go through the motions of setting up and checking an empty qual, where before it would not have been called at all. We could avoid that by adding a separate boolean flag to track whether there is a partition expression to test. However, this case only arises for a default partition with no siblings, which surely is not an interesting case in practice. Hence adding complexity for it does not seem like a good trade-off. Amit Langote, per a suggestion by me Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
* Improve performance of binary COPY FROM through better buffering.Tom Lane2020-07-25
| | | | | | | | | | | | | | | | | | At least on Linux and macOS, fread() turns out to have far higher per-call overhead than one could wish. Reading 64KB of data at a time and then parceling it out with our own memcpy logic makes binary COPY from a file significantly faster --- around 30% in simple testing for cases with narrow text columns (on Linux ... even more on macOS). In binary COPY from frontend, there's no per-call fread(), and this patch introduces an extra layer of memcpy'ing, but it still manages to eke out a small win. Apparently, the control-logic overhead in CopyGetData() is enough to be worth avoiding for small fetches. Bharath Rupireddy and Amit Langote, reviewed by Vignesh C, cosmetic tweaks by me Discussion: https://postgr.es/m/CALj2ACU5Bz06HWLwqSzNMN=Gupoj6Rcn_QVC+k070V4em9wu=A@mail.gmail.com
* Add comment to explain an unused function parameterDavid Rowley2020-07-14
| | | | | | | | | | | Removing the unused 'miinfo' parameter has been raised a couple of times now. It was decided in the 2nd discussion below that we're going to leave it alone. It seems like it might be useful to add a comment to mention this fact so that nobody wastes any time in the future proposing its removal again. Discussion: https://postgr.es/m/CAApHDvpCf-qR5HC1rXskUM4ToV+3YDb4-n1meY=vpAHsRS_1PA@mail.gmail.com Discussion: https://postgr.es/m/CAE9k0P%3DFvcDswnSVtRpSyZMpcAWC%3DGp%3DifZ0HdfPaRQ%3D__LBtw%40mail.gmail.com
* Avoid useless buffer allocations during binary COPY FROM.Tom Lane2020-07-11
| | | | | | | | | | | The raw_buf and line_buf buffers aren't used when reading binary format, so skip allocating them. raw_buf is 64K so that seems like a worthwhile savings. An unused line_buf only wastes 1K, but as long as we're checking it's free to avoid allocating that too. Bharath Rupireddy, tweaked a bit by me Discussion: https://postgr.es/m/CALj2ACXcCKaGPY0whowqrJ4OPJvDnTssgpGCzvuFQu5z0CXb-g@mail.gmail.com
* Mop up some no-longer-necessary hacks around printf %.*s format.Tom Lane2020-06-29
| | | | | | | | | | | | | | | | | | Commit 54cd4f045 added some kluges to work around an old glibc bug, namely that %.*s could misbehave if glibc thought any characters in the supplied string were incorrectly encoded. Now that we use our own snprintf.c implementation, we need not worry about that bug (even if it still exists in the wild). Revert a couple of particularly ugly hacks, and remove or improve assorted comments. Note that there can still be encoding-related hazards here: blindly clipping at a fixed length risks producing wrongly-encoded output if the clip splits a multibyte character. However, code that's doing correct multibyte-aware clipping doesn't really need a comment about that, while code that isn't needs an explanation why not, rather than a red-herring comment about an obsolete bug. Discussion: https://postgr.es/m/279428.1593373684@sss.pgh.pa.us
* Removal unused function parameter in CopyReadBinaryAttribute.Amit Kapila2020-06-20
| | | | | | | | | | | | The function parameter column_no is not used in CopyReadBinaryAttribute, this can be removed. Commit 0e319c7ad7 removed the usage of column_no parameter in function CopyReadBinaryAttribute but forgot to remove the parameter. Reported-by: Vignesh C Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm1TYSNTfqx_jfz9_mwEZ2Er=dZnu++duXpC1uQo1cG=WA@mail.gmail.com
* Make COPY TO keep locks until the transaction end.Amit Kapila2020-05-15
| | | | | | | | | | | | | | | | | | COPY TO released the ACCESS SHARE lock immediately when it was done rather than holding on to it until the end of the transaction. This breaks the case where a REPEATABLE READ transaction could see an empty table if it repeats a COPY statement and somebody truncated the table in the meantime. Before 4dded12faad the lock was also released after COPY FROM, but the commit failed to notice the irregularity in COPY TO. This is old behavior but doesn't seem important enough to backpatch. Author: Laurenz Albe, based on suggestion by Robert Haas and Tom Lane Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/7bcfc39d4176faf85ab317d0c26786953646a411.camel@cybertec.at