aboutsummaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw/postgres_fdw.c
Commit message (Collapse)AuthorAge
* Sample postgres_fdw tables remotely during ANALYZETomas Vondra2022-12-30
| | | | | | | | | | | | | | | | | When collecting ANALYZE sample on foreign tables, postgres_fdw fetched all rows and performed the sampling locally. For large tables this means transferring and immediately discarding large amounts of data. This commit allows the sampling to be performed on the remote server, transferring only the much smaller sample. The sampling is performed using the built-in TABLESAMPLE methods (system, bernoulli) or random() function, depending on the remote server version. Remote sampling can be enabled by analyze_sampling on the foreign server and/or foreign table, with supported values 'off', 'auto', 'system', 'bernoulli' and 'random'. The default value is 'auto' which uses either 'bernoulli' (TABLESAMPLE method) or 'random' (for remote servers without TABLESAMPLE support).
* Allow batching of inserts during cross-partition updates.Etsuro Fujita2022-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 927f453a9 disallowed batching added by commit b663a4136 to be used for the inserts performed as part of cross-partition updates of partitioned tables, mainly because the previous code in nodeModifyTable.c couldn't handle pending inserts into foreign-table partitions that are also UPDATE target partitions. But we don't have such a limitation anymore (cf. commit ffbb7e65a), so let's allow for this by removing from execPartition.c the restriction added by commit 927f453a9 that batching is only allowed if the query command type is CMD_INSERT. In postgres_fdw, since commit 86dc90056 changed it to effectively disable cross-partition updates in the case where a foreign-table partition chosen to insert rows into is also an UPDATE target partition, allow batching in the case where a foreign-table partition chosen to do so is *not* also an UPDATE target partition. This is enabled by the "batch_size" option added by commit b663a4136, which is disabled by default. This patch also adjusts the test case added by commit 927f453a9 to confirm that the inserts performed as part of a cross-partition update of a partitioned table indeed uses batching. Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu, Bharath Rupireddy, Hou Zhijie, Vignesh C, and me. Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com
* C comment: fix wordingBruce Momjian2022-12-16
| | | | Backpatch-through: master
* Rework query relation permission checkingAlvaro Herrera2022-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
* Stop accessing checkAsUser via RTE in some casesAlvaro Herrera2022-11-30
| | | | | | | | | | | | | | | | | | | | A future commit will move the checkAsUser field from RangeTblEntry to a new node that, unlike RTEs, will only be created for tables mentioned in the query but not for the inheritance child relations added to the query by the planner. So, checkAsUser value for a given child relation will have to be obtained by referring to that for its ancestor mentioned in the query. In preparation, it seems better to expand the use of RelOptInfo.userid during planning in place of rte->checkAsUser so that there will be fewer places to adjust for the above change. Given that the child-to-ancestor mapping is not available during the execution of a given "child" ForeignScan node, add a checkAsUser field to ForeignScan to carry the child relation's RelOptInfo.userid. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGFCs2uq7VRKi7g+FFKbP6Ea_2_HkgZb2HPhUfaAKT3ng@mail.gmail.com
* Fix confusion about havingQual vs hasHavingQual in planner.Tom Lane2022-10-18
| | | | | | | | | | | | | | | | | | | | | | | Preprocessing of the HAVING clause will reduce havingQual to NIL if the clause is constant-TRUE. This is one case where that convention is rather unfortunate, because "HAVING TRUE" is not at all the same as not having any HAVING clause at all. (Per the SQL spec, it still forces the query to be grouped.) The planner deals with this by having a boolean hasHavingQual that records whether havingQual was originally nonempty; places that just want to check whether HAVING was specified are supposed to consult that. I found three places that got that wrong. Fortunately, these could only affect cost estimates not correctness. It'd be hard even to demonstrate the errors; for example, the one in allpaths.c would only matter in a query that has HAVING TRUE but no GROUP BY and no aggregates, which would require a completely variable-free SELECT list, making the case probably of only academic interest. Hence, while these are worth fixing before someone copies the incorrect coding somewhere more critical, they don't seem worth back-patching. I didn't bother trying to devise regression tests, either. Discussion: https://postgr.es/m/2503888.1666042643@sss.pgh.pa.us
* Allow batch insertion during COPY into a foreign table.Etsuro Fujita2022-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3d956d956 allowed the COPY, but it's done by inserting individual rows to the foreign table, so it can be inefficient due to the overhead caused by each round-trip to the foreign server. To improve performance of the COPY in such a case, this patch allows batch insertion, by extending the multi-insert machinery in CopyFrom() to the foreign-table case so that we insert multiple rows to the foreign table at once using the FDW callback routine added by commit b663a4136. This patch also allows this for postgres_fdw. It is enabled by the "batch_size" option added by commit b663a4136, which is disabled by default. When doing batch insertion, we update progress of the COPY command after performing the FDW callback routine, to count rows not suppressed by the FDW as well as a BEFORE ROW INSERT trigger. For consistency, this patch changes the timing of updating it for plain tables: previously, we updated it immediately after adding each row to the multi-insert buffer, but we do so only after writing the rows stored in the buffer out to the table using table_multi_insert(), which I think would be consistent even with non-batching mode, because in that mode we update it after writing each row out to the table using table_tuple_insert(). Andrey Lepikhov, heavily revised by me, with review from Ian Barwick, Andrey Lepikhov, and Zhihong Yu. Discussion: https://postgr.es/m/bc489202-9855-7550-d64c-ad2d83c24867%40postgrespro.ru
* Fix final warnings produced by -Wshadow=compatible-localDavid Rowley2022-10-07
| | | | | | | | I thought I had these in d8df67bb1, but per report from Andres Freund, I missed some. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20221005214052.c4tkudawyp5wxt3c@awork3.anarazel.de
* Harmonize parameter names in contrib code.Peter Geoghegan2022-09-22
| | | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in contrib code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
* postgres_fdw: Avoid 'variable not found in subplan target list' error.Etsuro Fujita2022-09-14
| | | | | | | | | | | | | | | | | | | | | | The tlist of the EvalPlanQual outer plan for a ForeignScan node is adjusted to produce a tuple whose descriptor matches the scan tuple slot for the ForeignScan node. But in the case where the outer plan contains an extra Sort node, if the new tlist contained columns required only for evaluating PlaceHolderVars or columns required only for evaluating local conditions, this would cause setrefs.c to fail with the error. The cause of this is that when creating the outer plan by injecting the Sort node into an alternative local join plan that could emit such extra columns as well, we fail to arrange for the outer plan to propagate them up through the Sort node, causing setrefs.c to fail to match up them in the new tlist to what is available from the outer plan. Repair. Per report from Alexander Pyhalov. Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane. Backpatch to all supported versions. Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
* postgres_fdw: Disable batch insertion when there are WCO constraints.Etsuro Fujita2022-08-05
| | | | | | | | | | | | | | | | | | | | | | When inserting a view referencing a foreign table that has WITH CHECK OPTION constraints, in single-insert mode postgres_fdw retrieves the data that was actually inserted on the remote side so that the WITH CHECK OPTION constraints are enforced with the data locally, but in batch-insert mode it cannot currently retrieve the data (except for the row first inserted through the view), resulting in enforcing the WITH CHECK OPTION constraints with the data passed from the core (except for the first-inserted row), which led to incorrect results when inserting into a view referencing a foreign table in which a remote BEFORE ROW INSERT trigger changes the rows inserted through the view so that they violate the view's WITH CHECK OPTION constraint. Also, the query inserting into the view caused an assertion failure in assert-enabled builds. Fix these by disabling batch insertion when inserting into such a view. Back-patch to v14 where batch insertion was added. Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
* postgres_fdw: Fix bug in checking of return value of PQsendQuery().Fujii Masao2022-07-22
| | | | | | | | | | | | | | | | | | When postgres_fdw begins an asynchronous data fetch, it submits FETCH query by using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw should report an error. But, previously, postgres_fdw reported an error only when the return value is less than 0, though PQsendQuery() never return the values other than 0 and 1. Therefore postgres_fdw could not handle the failure to send FETCH query in an asynchronous data fetch. This commit fixes postgres_fdw so that it reports an error when PQsendQuery() returns 0. Back-patch to v14 where asynchronous execution was supported in postgres_fdw. Author: Fujii Masao Reviewed-by: Japin Li, Tom Lane Discussion: https://postgr.es/m/b187a7cf-d4e3-5a32-4d01-8383677797f3@oss.nttdata.com
* postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.Tom Lane2022-07-17
| | | | | | | | | | | | | The motivation for this is to ensure successful transmission of the values of constants of regconfig and other reg* types. The remote will be reading them with search_path = 'pg_catalog', so schema qualification is necessary when referencing objects in other schemas. Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported versions. (There's some other stuff to do here, but it's less back-patchable.) Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
* Attempt to fix compiler warning on old compilerPeter Eisentraut2022-07-16
| | | | | A couple more like b449afb582bb9015bfbb85abc10ce122aef9ec70, per complaints from lapwing.
* Replace many MemSet calls with struct initializationPeter Eisentraut2022-07-16
| | | | | | | | | | | | | | This replaces all MemSet() calls with struct initialization where that is easily and obviously possible. (For example, some cases have to worry about padding bits, so I left those.) (The same could be done with appropriate memset() calls, but this patch is part of an effort to phase out MemSet(), so it doesn't touch memset() calls.) Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/9847b13c-b785-f4e2-75c3-12ec77a3b05c@enterprisedb.com
* Remove redundant null pointer checks before PQclear and PQconninfoFreePeter Eisentraut2022-07-03
| | | | | | | | These functions already had the free()-like behavior of handling null pointers as a no-op. But it wasn't documented, so add it explicitly to the documentation, too. Discussion: https://www.postgresql.org/message-id/flat/dac5d2d0-98f5-94d9-8e69-46da2413593d%40enterprisedb.com
* Pre-beta mechanical code beautification.Tom Lane2022-05-12
| | | | | Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.
* postgres_fdw: Disable batch insert when BEFORE ROW INSERT triggers exist.Etsuro Fujita2022-04-21
| | | | | | | | | | | Previously, we allowed this, but such triggers might query the table to insert into and act differently if the tuples that have already been processed and prepared for insertion are not there, so disable it in such cases. Back-patch to v14 where batch insert was added. Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
* Fix postgres_fdw to check shippability of sort clauses properly.Tom Lane2022-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit 7012b132d, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
* postgres_fdw: Fix handling of a pending asynchronous request in ↵Etsuro Fujita2022-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgresReScanForeignScan(). Commit 27e1f1456 failed to process a pending asynchronous request made for a given ForeignScan node in postgresReScanForeignScan() (if any) in cases where we would only reset the next_tuple counter in that function, contradicting the assumption that there should be no pending asynchronous requests that have been made for async-capable subplans for the parent Append node after ReScan. This led to an assert failure in an assert-enabled build. I think this would also lead to mis-rewinding the cursor in that function in the case where we have already fetched one batch for the ForeignScan node and the asynchronous request has been made for the second batch, because even in that case we would just reset the counter when called from that function, so we would fail to execute MOVE BACKWARD ALL. To fix, modify that function to process the asynchronous request before restarting the scan. While at it, add a comment to a function to match other places. Per bug #17344 from Alexander Lakhin. Back-patch to v14 where the aforesaid commit came in. Patch by me. Test case by Alexander Lakhin, adjusted by me. Reviewed and tested by Alexander Lakhin and Dmitry Dolgov. Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
* Add Boolean nodePeter Eisentraut2022-01-17
| | | | | | | | | | Before, SQL-level boolean constants were represented by a string with a cast, and internal Boolean values in DDL commands were usually represented by Integer nodes. This takes the place of both of these uses, making the intent clearer and having some amount of type safety. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* Remove PF_USED_FOR_ASSERTS_ONLY from variables in general useDaniel Gustafsson2021-11-30
| | | | | | | | | | | | | | | | fsstate in process_pending_requests (in postgres_fdw.c) was added in 8998e3cafa2 as an assertion-only variable, 1ec7fca8592 stated using the variable outside of assertions. rd_index in get_index_column_opclass (in lsyscache.c) was introduced in 2a6368343ff, and then promptly used in the fix commit 7e041603904 shortly thereafter. This removes the PG_USED_FOR_ASSERTS_ONLY variable decoration from the above mentioned variables. Reviewed-by: Greg Nancarrow <gregn4422@gmail.com> Discussion: https://postgr.es/m/F959106C-0F21-43A5-B2AE-D007D51ACBEE@yesql.se
* Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.Tom Lane2021-11-28
| | | | | | | | | | | | | | | | | | | Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128** is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
* Fix null-pointer crash in postgres_fdw's conversion_error_callback.Tom Lane2021-10-06
| | | | | | | | | | | | | | | | Commit c7b7311f6 adjusted conversion_error_callback to always use information from the query's rangetable, to avoid doing catalog lookups in an already-failed transaction. However, as a result of the utterly inadequate documentation for make_tuple_from_result_row, I failed to realize that fsstate could be NULL in some contexts. That led to a crash if we got a conversion error in such a context. Fix by falling back to the previous coding when fsstate is NULL. Improve the commentary, too. Per report from Andrey Borodin. Back-patch to 9.6, like the previous patch. Discussion: https://postgr.es/m/08916396-55E4-4D68-AB3A-BD6066F9E5C0@yandex-team.ru
* Remove Value node structPeter Eisentraut2021-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | The Value node struct is a weird construct. It is its own node type, but most of the time, it actually has a node type of Integer, Float, String, or BitString. As a consequence, the struct name and the node type don't match most of the time, and so it has to be treated specially a lot. There doesn't seem to be any value in the special construct. There is very little code that wants to accept all Value variants but nothing else (and even if it did, this doesn't provide any convenient way to check it), and most code wants either just one particular node type (usually String), or it accepts a broader set of node types besides just Value. This change removes the Value struct and node type and replaces them by separate Integer, Float, String, and BitString node types that are proper node types and structs of their own and behave mostly like normal node types. Also, this removes the T_Null node tag, which was previously also a possible variant of Value but wasn't actually used outside of the Value contained in A_Const. Replace that by an isnull field in A_Const. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
* Refactor postgresImportForeignSchema to avoid code duplication.Tom Lane2021-09-01
| | | | | | | | | Avoid repeating fragments of the query we're building, along the same lines as recent cleanup in pg_dump. I got annoyed about this because aa769f80e broke my pending patch to change postgres_fdw's collation handling, due to each of us having incompletely done this same refactoring. Let's finish that job in hopes of having a more stable base.
* postgres_fdw: Fix issues with generated columns in foreign tables.Etsuro Fujita2021-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw imported generated columns from the remote tables as plain columns, and caused failures like "ERROR: cannot insert a non-DEFAULT value into column "foo"" when inserting into the foreign tables, as it tried to insert values into the generated columns. To fix, we do the following under the assumption that generated columns in a postgres_fdw foreign table are defined so that they represent generated columns in the underlying remote table: * Send DEFAULT for the generated columns to the foreign server on insert or update, not generated column values computed on the local server. * Add to postgresImportForeignSchema() an option "import_generated" to include column generated expressions in the definitions of foreign tables imported from a foreign server. The option is true by default. The assumption seems reasonable, because that would make a query of the postgres_fdw foreign table return values for the generated columns that are consistent with the generated expression. While here, fix another issue in postgresImportForeignSchema(): it tried to include column generated expressions as column default expressions in the foreign table definitions when the import_default option was enabled. Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated columns were added. Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
* Fix oversight in commit 1ec7fca8592178281cd5cdada0f27a340fb813fc.Etsuro Fujita2021-08-02
| | | | | | | | | | | | | | | | | I failed to account for the possibility that when ExecAppendAsyncEventWait() notifies multiple async-capable nodes using postgres_fdw, a preceding node might invoke process_pending_request() to process a pending asynchronous request made by a succeeding node. In that case the succeeding node should produce a tuple to return to the parent Append node from tuples fetched by process_pending_request() when notified. Repair. Per buildfarm via Michael Paquier. Back-patch to v14, like the previous commit. Thanks to Tom Lane for testing. Discussion: https://postgr.es/m/YQP0UPT8KmPiHTMs%40paquier.xyz
* postgres_fdw: Fix handling of pending asynchronous requests.Etsuro Fujita2021-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A pending asynchronous request is handled by process_pending_request(), which previously not only processed an in-progress remote query but performed ExecForeignScan() to produce a tuple to return to the local server asynchronously from the result of the remote query. But that led to a server crash when executing a query or led to an "InstrStartNode called twice in a row" or "InstrEndLoop called on running node" failure when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it contained multiple async-capable nodes accessing the same initplan/subplan that contained multiple async-capable nodes scanning the same foreign tables as for the parent async-capable nodes, as reported by Andrey Lepikhov. The reason is that the second step in process_pending_request() invoked when executing the initplan/subplan for one of the parent async-capable nodes caused recursive execution of the initplan/subplan for another of the parent async-capable nodes. To fix, split process_pending_request() into the two steps and postpone the second step until ForeignAsyncConfigureWait() is called for each of the pending asynchronous requests. Also, in ExecAppendAsyncEventWait() we assumed that FDWs would register at least one wait event in a WaitEventSet created there when they were called from ForeignAsyncConfigureWait() in that function, but allow FDWs to register zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait() to just return in that case. Oversight in commit 27e1f1456. Back-patch to v14 where that commit went in. Andrey Lepikhov and Etsuro Fujita Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru
* Use the "pg_temp" schema alias in EXPLAIN and related output.Tom Lane2021-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch causes EXPLAIN output to refer to objects that are in the current session's temp schema with the "pg_temp" schema alias rather than that schema's actual name. This is useful for our own testing purposes since it will stabilize EXPLAIN VERBOSE output for such cases, allowing us to use that in regression tests. It should be less confusing for end users too. Since ruleutils.c needs to change behavior for this, the change also leaks into a few other users of ruleutils.c, for example pg_get_viewdef(). AFAICS that won't cause any problems. We did find that aggressively trying to change this behavior across-the-board would cause issues, but as long as "pg_temp" only appears within generated SQL text, I think it'll be fine. Along the way, make get_namespace_name_or_temp conform to the same API as get_namespace_name, ie that it returns a palloc'd string or NULL. The current behavior hasn't caused any bugs since no callers attempt to pfree the result, but if it gets more widespread usage that could become a problem. Amul Sul, reviewed and extended by me Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
* Fix crash in postgres_fdw for provably-empty remote UPDATE/DELETE.Tom Lane2021-07-07
| | | | | | | | | | | | | In 86dc90056, I'd written find_modifytable_subplan with the assumption that if the immediate child of a ModifyTable is a Result, it must be a projecting Result with a subplan. However, if the UPDATE or DELETE has a provably-constant-false WHERE clause, that's not so: we'll generate a dummy subplan with a childless Result. Add the missing null-check so we don't crash on such cases. Per report from Alexander Pyhalov. Discussion: https://postgr.es/m/b9a6f53549456b2f3e2fd150dcd79d72@postgrespro.ru
* postgres_fdw: Tighten up allowed values for batch_size, fetch_size options.Fujii Masao2021-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously the values such as '100$%$#$#', '9,223,372,' were accepted and treated as valid integers for postgres_fdw options batch_size and fetch_size. Whereas this is not the case with fdw_startup_cost and fdw_tuple_cost options for which an error is thrown. This was because endptr was not used while converting strings to integers using strtol. This commit changes the logic so that it uses parse_int function instead of strtol as it serves the purpose by returning false in case if it is unable to convert the string to integer. Note that this function also rounds off the values such as '100.456' to 100 and '100.567' or '100.678' to 101. While on this, use parse_real for fdw_startup_cost and fdw_tuple_cost options. Since parse_int and parse_real are being used for reloptions and GUCs, it is more appropriate to use in postgres_fdw rather than using strtol and strtod directly. Back-patch to v14. Author: Bharath Rupireddy Reviewed-by: Ashutosh Bapat, Tom Lane, Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA@mail.gmail.com
* Avoid doing catalog lookups in postgres_fdw's conversion_error_callback.Tom Lane2021-07-06
| | | | | | | | | | | | | | | | | | | | As in 50371df26, this is a bad idea since the callback can't really know what error is being thrown and thus whether or not it is safe to attempt catalog accesses. Rather than pushing said accesses into the mainline code where they'd usually be a waste of cycles, we can look at the query's rangetable instead. This change does mean that we'll be printing query aliases (if any were used) rather than the table or column's true name. But that doesn't seem like a bad thing: it's certainly a more useful definition in self-join cases, for instance. In any case, it seems unlikely that any applications would be depending on this detail, so it seems safe to change. Patch by me. Original complaint by Andres Freund; Bharath Rupireddy noted the connection to conversion_error_callback. Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
* Adjust batch size in postgres_fdw to not use too many parametersTomas Vondra2021-06-08
| | | | | | | | | | | | | | | | The FE/BE protocol identifies parameters with an Int16 index, which limits the maximum number of parameters per query to 65535. With batching added to postges_fdw this limit is much easier to hit, as the whole batch is essentially a single query, making this error much easier to hit. The failures are a bit unpredictable, because it also depends on the number of columns in the query. So instead of just failing, this patch tweaks the batch_size to not exceed the maximum number of parameters. Reported-by: Hou Zhijie <houzj.fnst@cn.fujitsu.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB571603973C0AC2874AD6BF2594299%40OS0PR01MB5716.jpnprd01.prod.outlook.com
* Fix postgres_fdw failure with whole-row Vars of type RECORD.Tom Lane2021-06-04
| | | | | | | | | | | | | | | | | | | | | Commit 86dc90056 expects that FDWs can cope with whole-row Vars for their tables, even if the Vars are marked with vartype RECORDOID. Previously, whole-row Vars generated by the planner had vartype equal to the relevant table's rowtype OID. (The point behind this change is to enable sharing of resjunk columns across inheritance child tables.) It turns out that postgres_fdw fails to cope with this, though through bad fortune none of its test cases exposed that. Things mostly work, but when we try to read back a value of such a Var, the expected rowtype is not available to record_in(). Fortunately, it's not difficult to hack up the tupdesc that controls this process to substitute the foreign table's rowtype for RECORDOID. Thus we can solve the runtime problem while still sharing the resjunk column with other tables. Per report from Alexander Pyhalov. Discussion: https://postgr.es/m/7817fb9ebd6661cdf9b67dec6e129a78@postgrespro.ru
* Prevent asynchronous execution of direct foreign-table modifications.Etsuro Fujita2021-05-13
| | | | | | | | | | | | | | | | Commits 27e1f1456 and 86dc90056, which were independently discussed, cause a crash when executing an inherited foreign UPDATE/DELETE query with asynchronous execution enabled, where children of an Append node that is the direct/indirect child of the ModifyTable node are rewritten so as to modify foreign tables directly by postgresPlanDirectModify(); as in that case the direct modifications are executed asynchronously, which is not currently supported by asynchronous execution. Fix by disabling asynchronous execution of the direct modifications in that function. Author: Etsuro Fujita Reviewed-by: Amit Langote Discussion: https://postgr.es/m/CAPmGK158e9sJOfuWxfn%2B0ynrspXQU3JhNjSCbaoeSzMvnga%2Bbw%40mail.gmail.com
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Fix EXPLAIN ANALYZE for async-capable nodes.Etsuro Fujita2021-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | EXPLAIN ANALYZE for an async-capable ForeignScan node associated with postgres_fdw is done just by using instrumentation for ExecProcNode() called from the node's callbacks, causing the following problems: 1) If the remote table to scan is empty, the node is incorrectly considered as "never executed" by the command even if the node is executed, as ExecProcNode() isn't called from the node's callbacks at all in that case. 2) The command fails to collect timings for things other than ExecProcNode() done in the node, such as creating a cursor for the node's remote query. To fix these problems, add instrumentation for async-capable nodes, and modify postgres_fdw accordingly. My oversight in commit 27e1f1456. While at it, update a comment for the AsyncRequest struct in execnodes.h and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix typos in comments in nodeAppend.c. Per report from Andrey Lepikhov, though I didn't use his patch. Reviewed-by: Andrey Lepikhov Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
* Copy the INSERT query in postgres_fdwTomas Vondra2021-05-07
| | | | | | | | | | | When executing the INSERT with batching, we may need to rebuild the query when the batch size changes, in which case we pfree the current string. We must not release the original string, stored in fdw_private, because that may be needed in EXPLAIN ANALYZE. So make copy of the SQL, but only for INSERT queries. Reported-by: Pavel Stehule Discussion: https://postgr.es/m/CAFj8pRCL_Rjw-MCR6J7VX9OF7MR6PA5K8qUbrMvprW_e-aHkfQ%40mail.gmail.com
* Don't pass "ONLY" options specified in TRUNCATE to foreign data wrapper.Fujii Masao2021-04-27
| | | | | | | | | | | | | | | | | | | | | | Commit 8ff1c94649 allowed TRUNCATE command to truncate foreign tables. Previously the information about "ONLY" options specified in TRUNCATE command were passed to the foreign data wrapper. Then postgres_fdw constructed the TRUNCATE command to issue the remote server and included "ONLY" options in it based on the passed information. On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE have no effect when accessing or modifying the remote table, i.e., are not passed to the foreign data wrapper. So it's inconsistent to make only TRUNCATE command pass the "ONLY" options to the foreign data wrapper. Therefore this commit changes the TRUNCATE command so that it doesn't pass the "ONLY" options to the foreign data wrapper, for the consistency with other statements. Also this commit changes postgres_fdw so that it always doesn't include "ONLY" options in the TRUNCATE command that it constructs. Author: Fujii Masao Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com
* Minor code cleanup in asynchronous execution support.Etsuro Fujita2021-04-23
| | | | | | | | | | | | | | | This is cleanup for commit 27e1f1456: * ExecAppendAsyncEventWait(), which was modified a bit further by commit a8af856d3, duplicated the same nevents calculation. Simplify the code a little bit to avoid the duplication. Update comments there. * Add an assertion to ExecAppendAsyncRequest(). * Update a comment about merging the async_capable options from input relations in merge_fdw_options(), per complaint from Kyotaro Horiguchi. * Add a comment for fetch_more_data_begin(). Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK1637W30Wx3MnrReewhafn6F_0J76mrJGoFXFnpPq4QfvA%40mail.gmail.com
* Allow TRUNCATE command to truncate foreign tables.Fujii Masao2021-04-08
| | | | | | | | | | | | | | | | | | | | | | | This commit introduces new foreign data wrapper API for TRUNCATE. It extends TRUNCATE command so that it accepts foreign tables as the targets to truncate and invokes that API. Also it extends postgres_fdw so that it can issue TRUNCATE command to foreign servers, by adding new routine for that TRUNCATE API. The information about options specified in TRUNCATE command, e.g., ONLY, CACADE, etc is passed to FDW via API. The list of foreign tables to truncate is also passed to FDW. FDW truncates the foreign data sources that the passed foreign tables specify, based on those information. For example, postgres_fdw constructs TRUNCATE command using them and issues it to the foreign server. For performance, TRUNCATE command invokes the FDW routine for TRUNCATE once per foreign server that foreign tables to truncate belong to. Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
* postgres_fdw: Allow partitions specified in LIMIT TO to be imported.Fujii Masao2021-04-07
| | | | | | | | | | | | | | | | | | | | Commit f49bcd4ef3 disallowed postgres_fdw to import table partitions. Because all data can be accessed through the partitioned table which is the root of the partitioning hierarchy, importing only partitioned table should allow access to all the data without creating extra objects. This is a reasonable default when importing a whole schema. But there may be the case where users want to explicitly import one of a partitioned tables' partitions. For that use case, this commit allows postgres_fdw to import tables or foreign tables which are partitions of some other table only when they are explicitly specified in LIMIT TO clause. It doesn't change the behavior that any partitions not specified in LIMIT TO are automatically excluded in IMPORT FOREIGN SCHEMA command. Author: Matthias van de Meent Reviewed-by: Bernd Helmle, Amit Langote, Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/CAEze2Whwg4i=mzApMe+PXxCEfgoZmHGqdqQFW7J4bmj_5p6t1A@mail.gmail.com
* Silence compiler warning in non-assert builds.Tom Lane2021-03-31
| | | | Per buildfarm.
* Rework planning and execution of UPDATE and DELETE.Tom Lane2021-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
* Add support for asynchronous execution.Etsuro Fujita2021-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
* Allow estimate_num_groups() to pass back further details about the estimationDavid Rowley2021-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we add a new output parameter to estimate_num_groups() to allow it to inform the caller of additional, possibly useful information about the estimation. The new output parameter is a struct that currently contains just a single field with a set of flags. This was done rather than having the flags as an output parameter to allow future fields to be added without having to change the signature of the function at a later date when we want to pass back further information that might not be suitable to store in the flags field. It seems reasonable that one day in the future that the planner would want to know more about the estimation. For example, how many individual sets of statistics was the estimation generated from? The planner may want to take that into account if we ever want to consider risks as well as costs when generating plans. For now, there's only 1 flag we set in the flags field. This is to indicate if the estimation fell back on using the hard-coded constants in any part of the estimation. Callers may like to change their behavior if this is set, and this gives them the ability to do so. Callers may pass the flag pointer as NULL if they have no interest in obtaining any additional information about the estimate. We're not adding any actual usages of these flags here. Some follow-up commits will make use of this feature. Additionally, we're also not making any changes to add support for clauselist_selectivity() and clauselist_selectivity_ext(). However, if this is required in the future then the same struct being added here should be fine to use as a new output argument for those functions too. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvqQqpk=1W-G_ds7A9CsXX3BggWj_7okinzkLVhDubQzjA@mail.gmail.com
* Update obsolete comment.Etsuro Fujita2021-03-30
| | | | | | | Back-patch to all supported branches. Author: Etsuro Fujita Discussion: https://postgr.es/m/CAPmGK17DwzaSf%2BB71dhL2apXdtG-OmD6u2AL9Cq2ZmAR0%2BzapQ%40mail.gmail.com
* Fix tuple routing to initialize batching only for insertsTomas Vondra2021-02-18
| | | | | | | | | | | | | | | | | | A cross-partition update on a partitioned table is implemented as a delete followed by an insert. With foreign partitions, this was however causing issues, because the FDW and core may disagree on when to enable batching. postgres_fdw was only allowing batching for plain inserts (CMD_INSERT) while core was trying to batch the insert component of the cross-partition update. Fix by restricting core to apply batching only to plain CMD_INSERT queries. It's possible to allow batching for cross-partition updates, but that will require more extensive changes, so better to leave that for a separate patch. Author: Amit Langote Reviewed-by: Tomas Vondra, Takayuki Tsunakawa Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development