aboutsummaryrefslogtreecommitdiff
path: root/src/backend/replication/logical/worker.c
Commit message (Collapse)AuthorAge
* Detect and report update_deleted conflicts.Amit Kapila37 hours
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enhancement builds upon the infrastructure introduced in commit 228c370868, which enables the preservation of deleted tuples and their origin information on the subscriber. This capability is crucial for handling concurrent transactions replicated from remote nodes. The update introduces support for detecting update_deleted conflicts during the application of update operations on the subscriber. When an update operation fails to locate the target row-typically because it has been concurrently deleted-we perform an additional table scan. This scan uses the SnapshotAny mechanism and we do this additional scan only when the retain_dead_tuples option is enabled for the relevant subscription. The goal of this scan is to locate the most recently deleted tuple-matching the old column values from the remote update-that has not yet been removed by VACUUM and is still visible according to our slot (i.e., its deletion is not older than conflict-detection-slot's xmin). If such a tuple is found, the system reports an update_deleted conflict, including the origin and transaction details responsible for the deletion. This provides a groundwork for more robust and accurate conflict resolution process, preventing unexpected behavior by correctly identifying cases where a remote update clashes with a deletion from another origin. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Preserve conflict-relevant data during logical replication.Amit Kapila14 days
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Logical replication requires reliable conflict detection to maintain data consistency across nodes. To achieve this, we must prevent premature removal of tuples deleted by other origins and their associated commit_ts data by VACUUM, which could otherwise lead to incorrect conflict reporting and resolution. This patch introduces a mechanism to retain deleted tuples on the subscriber during the application of concurrent transactions from remote nodes. Retaining these tuples allows us to correctly ignore concurrent updates to the same tuple. Without this, an UPDATE might be misinterpreted as an INSERT during resolutions due to the absence of the original tuple. Additionally, we ensure that origin metadata is not prematurely removed by vacuum freeze, which is essential for detecting update_origin_differs and delete_origin_differs conflicts. To support this, a new replication slot named pg_conflict_detection is created and maintained by the launcher on the subscriber. Each apply worker tracks its own non-removable transaction ID, which the launcher aggregates to determine the appropriate xmin for the slot, thereby retaining necessary tuples. Conflict information retention (deleted tuples and commit_ts) can be enabled per subscription via the retain_conflict_info option. This is disabled by default to avoid unnecessary overhead for configurations that do not require conflict resolution or logging. During upgrades, if any subscription on the old cluster has retain_conflict_info enabled, a conflict detection slot will be created to protect relevant tuples from deletion when the new cluster starts. This is a foundational work to correctly detect update_deleted conflict which will be done in a follow-up patch. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Standardize LSN formatting by zero paddingÁlvaro Herrera2025-07-07
| | | | | | | | | | | | | | | | This commit standardizes the output format for LSNs to ensure consistent representation across various tools and messages. Previously, LSNs were inconsistently printed as `%X/%X` in some contexts, while others used zero-padding. This often led to confusion when comparing. To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring the lower 32-bit part is always zero-padded to eight hexadecimal digits. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
* Fix missing comment update in 1462aad2e4.Amit Kapila2025-06-24
| | | | | | | | | Remove the part of comment that says we don't allow toggling two_phase option as that is supported in commit 1462aad2e4. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Author: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB1496656725F3951AEE8749EBDF579A@OSCPR01MB14966.jpnprd01.prod.outlook.com
* Ensure we have a snapshot when updating various system catalogs.Nathan Bossart2025-05-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, fixing older versions doesn't seem worth the trouble of significantly modifying the patch. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not back-patched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13
* Fix an oversight in 3f28b2fcac.Amit Kapila2025-04-23
| | | | | | | | | | | | | | | | | | | | | | | | | Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be advanced in case of an ERROR in the apply worker, so that it can request the same data again after restart. However, it is possible that an ERROR was caught and handled by a (say PL/pgSQL) function, and the apply worker continues to apply further changes, in which case, we shouldn't reset the replication origin. Ensure to reset the origin only when the apply worker exits after an ERROR. Commit 3f28b2fcac added new function geterrlevel, which we removed in HEAD as part of this commit, but kept it in backbranches to avoid breaking any applications. A separate case can be made to have such a function even for HEAD. Reported-by: Shawn McCoy <shawn.the.mccoy@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
* Fix uninitialized index information access during apply.Amit Kapila2025-04-08
| | | | | | | | | | | | | | | | | | | | | The issue happens when building conflict information during apply of INSERT or UPDATE operations that violate unique constraints on leaf partitions. The problem was introduced in commit 9ff68679b5, which removed the redundant calls to ExecOpenIndices/ExecCloseIndices. The previous code was relying on the redundant ExecOpenIndices call in apply_handle_tuple_routing() to build the index information required for unique key conflict detection. The fix is to delay building the index information until a conflict is detected instead of relying on ExecOpenIndices to do the same. The additional benefit of this approach is that it avoids building index information when there is no conflict. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by:Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/TYAPR01MB57244ADA33DDA57119B9D26494A62@TYAPR01MB5724.jpnprd01.prod.outlook.com
* Detect and Log multiple_unique_conflicts type conflict.Amit Kapila2025-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new conflict type, multiple_unique_conflicts, to handle cases where an incoming row during logical replication violates multiple UNIQUE constraints. Previously, the apply worker detected and reported only the first encountered key conflict (insert_exists/update_exists), causing repeated failures as each constraint violation needs to be handled one by one making the process slow and error-prone. With this patch, the apply worker checks all unique constraints upfront once the first key conflict is detected and reports multiple_unique_conflicts if multiple violations exist. This allows users to resolve all conflicts at once by deleting all conflicting tuples rather than dealing with them individually or skipping the transaction. In the future, this will also allow us to specify different resolution handlers for such a conflict type. Add the stats for this conflict type in pg_stat_subscription_stats. Author: Nisha Moond <nisha.moond412@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
* Fix crash in brininsertcleanup during logical replication.Tom Lane2025-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Logical replication crashes if the subscriber's partitioned table has a BRIN index. There are two independently blamable causes, and this patch fixes both: 1. brininsertcleanup fails if called twice for the same IndexInfo, because it half-destroys its BrinInsertState but leaves it still linked from ii_AmCache. brininsert would also fail in that state, so it's pretty hard to see any advantage to this coding. Fully remove the BrinInsertState, instead, so that a new brininsert call would create a new cache. 2. A logical replication subscriber sometimes does ExecOpenIndices twice on the same ResultRelInfo, followed by doing ExecCloseIndices twice; the second call reaches the brininsertcleanup bug. Quite aside from tickling unexpected cases in aminsertcleanup methods, this seems very wasteful, because the IndexInfos built in the first ExecOpenIndices call are just lost during the second call, and have to be rebuilt at possibly-nontrivial cost. We should establish a coding rule that you don't do that. The problematic coding is that when the target table is partitioned, apply_handle_tuple_routing calls ExecFindPartition which does ExecOpenIndices (and expects that ExecCleanupTupleRouting will close the indexes again). Using the ResultRelInfo made by ExecFindPartition, it calls apply_handle_delete_internal or apply_handle_insert_internal, both of which think they need to do ExecOpenIndices/ExecCloseIndices for themselves. They do in the main non-partitioned code paths, but not here. The simplest fix is to pull their ExecOpenIndices/ExecCloseIndices calls out and put them in the call sites for the non-partitioned cases. (We could have refactored apply_handle_update_internal similarly, but I did not do so today because there's no bug there: the partitioned code path doesn't call it.) Also, remove the always-duplicative open/close calls within apply_handle_tuple_routing itself. Since brininsertcleanup and indeed the whole aminsertcleanup mechanism are new in v17, there's no observable bug in older branches. A case could be made for trying to avoid these duplicative open/close calls in the older branches, but for now it seems not worth the trouble and risk of new bugs. Bug: #18815 Reported-by: Sergey Belyashov <sergey.belyashov@gmail.com> Discussion: https://postgr.es/m/18815-2a0407cc7f40b327@postgresql.org Backpatch-through: 17
* Track unpruned relids to avoid processing pruned relationsAmit Langote2025-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit introduces changes to track unpruned relations explicitly, making it possible for top-level plan nodes, such as ModifyTable and LockRows, to avoid processing partitions pruned during initial pruning. Scan-level nodes, such as Append and MergeAppend, already avoid the unnecessary processing by accessing partition pruning results directly via part_prune_index. In contrast, top-level nodes cannot access pruning results directly and need to determine which partitions remain unpruned. To address this, this commit introduces a new bitmapset field, es_unpruned_relids, which the executor uses to track the set of unpruned relations. This field is referenced during plan initialization to skip initializing certain nodes for pruned partitions. It is initialized with PlannedStmt.unprunableRelids, a new field that the planner populates with RT indexes of relations that cannot be pruned during runtime pruning. These include relations not subject to partition pruning and those required for execution regardless of pruning. PlannedStmt.unprunableRelids is computed during set_plan_refs() by removing the RT indexes of runtime-prunable relations, identified from PartitionPruneInfos, from the full set of relation RT indexes. ExecDoInitialPruning() then updates es_unpruned_relids by adding partitions that survive initial pruning. To support this, PartitionedRelPruneInfo and PartitionedRelPruningData now include a leafpart_rti_map[] array that maps partition indexes to their corresponding RT indexes. The former is used in set_plan_refs() when constructing unprunableRelids, while the latter is used in ExecDoInitialPruning() to convert partition indexes returned by get_matching_partitions() into RT indexes, which are then added to es_unpruned_relids. These changes make it possible for ModifyTable and LockRows nodes to process only relations that remain unpruned after initial pruning. ExecInitModifyTable() trims lists, such as resultRelations, withCheckOptionLists, returningLists, and updateColnosLists, to consider only unpruned partitions. It also creates ResultRelInfo structs only for these partitions. Similarly, child RowMarks for pruned relations are skipped. By avoiding unnecessary initialization of structures for pruned partitions, these changes improve the performance of updates and deletes on partitioned tables during initial runtime pruning. Due to ExecInitModifyTable() changes as described above, EXPLAIN on a plan for UPDATE and DELETE that uses runtime initial pruning no longer lists partitions pruned during initial pruning. Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
* Fix grammatical typos around possessive "its"John Naylor2025-01-29
| | | | | | | | Some places spelled it "it's", which is short for "it is". In passing, fix a couple other nearby grammatical errors. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Rewrite maybe_reread_subscription() commentÁlvaro Herrera2024-12-13
| | | | One sentence was gramatically wrong, but also too terse. Expand on it.
* Simplify IsIndexUsableForReplicaIdentityFull()Peter Eisentraut2024-12-04
| | | | | | | | | | Take Relation as argument instead of IndexInfo. Building the IndexInfo is an unnecessary intermediate step here. A future patch wants to get some information that is in the relcache but not in IndexInfo, so this will also help there. Discussion: https://www.postgresql.org/message-id/333d3886-b737-45c3-93f4-594c96bb405d@eisentraut.org
* Improve assertion in FindReplTupleInLocalRel().Amit Kapila2024-09-11
| | | | | | | | | | | | | | | | The first part of the assertion verifying that the passed index must be PK or RI was incorrectly passing index relation instead of heap relation in GetRelationIdentityOrPK(). The assertion was not failing because the second part of the assertion which needs to be performed only when remote relation has REPLICA_IDENTITY_FULL set was also incorrect. The change is not backpatched because the current coding doesn't lead to any failure. Reported-by: Dilip Kumar Author: Amit Kapila Reviewed-by: Vignesh C Discussion: https://postgr.es/m/CAFiTN-tmguaT1DXbCC+ZomZg-oZLmU6BPhr0po7akQSG6vNJrg@mail.gmail.com
* Rename the conflict types for the origin differ cases.Amit Kapila2024-08-29
| | | | | | | | | | | The conflict types 'update_differ' and 'delete_differ' indicate that a row to be modified was previously altered by another origin. Rename those to 'update_origin_differs' and 'delete_origin_differs' to clarify their meaning. Author: Hou Zhijie Reviewed-by: Shveta Malik, Peter Smith Discussion: https://postgr.es/m/CAA4eK1+HEKwG_UYt4Zvwh5o_HoCKCjEGesRjJX38xAH3OxuuYA@mail.gmail.com
* Don't advance origin during apply failure.Amit Kapila2024-08-21
| | | | | | | | | | | | | | We advance origin progress during abort on successful streaming and application of ROLLBACK in parallel streaming mode. But the origin shouldn't be advanced during an error or unsuccessful apply due to shutdown. Otherwise, it will result in a transaction loss as such a transaction won't be sent again by the server. Reported-by: Hou Zhijie Author: Hayato Kuroda and Shveta Malik Reviewed-by: Amit Kapila Backpatch-through: 16 Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
* Log the conflicts while applying changes in logical replication.Amit Kapila2024-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides the additional logging information in the following conflict scenarios while applying changes: insert_exists: Inserting a row that violates a NOT DEFERRABLE unique constraint. update_differ: Updating a row that was previously modified by another origin. update_exists: The updated row value violates a NOT DEFERRABLE unique constraint. update_missing: The tuple to be updated is missing. delete_differ: Deleting a row that was previously modified by another origin. delete_missing: The tuple to be deleted is missing. For insert_exists and update_exists conflicts, the log can include the origin and commit timestamp details of the conflicting key with track_commit_timestamp enabled. update_differ and delete_differ conflicts can only be detected when track_commit_timestamp is enabled on the subscriber. We do not offer additional logging for exclusion constraint violations because these constraints can specify rules that are more complex than simple equality checks. Resolving such conflicts won't be straightforward. This area can be further enhanced if required. Author: Hou Zhijie Reviewed-by: Shveta Malik, Amit Kapila, Nisha Moond, Hayato Kuroda, Dilip Kumar Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Change the misleading local end_lsn for prepared transactions.Amit Kapila2024-08-09
| | | | | | | | | | | | | | | | | | | The apply worker was using XactLastCommitEnd as local end_lsn for applying prepare and rollback_prepare. The XactLastCommitEnd value is the end lsn of the last commit applied before the prepare transaction which makes no sense. This LSN is used to decide whether we can send the acknowledgment of the corresponding remote LSN to the server. It is okay not to set the local_end LSN with the actual WAL position for the prepare because we always flush the prepare record. So, we can send the acknowledgment of the remote_end LSN as soon as prepare is finished. The current code is misleading but as such doesn't create any problem, so decided not to backpatch. Author: Hayato Kuroda Reviewed-by: Shveta Malik, Amit Kapila Discussion: https://postgr.es/m/TYAPR01MB5692FA4926754B91E9D7B5F0F5AA2@TYAPR01MB5692.jpnprd01.prod.outlook.com
* Avoid duplicate table scans for cross-partition updates during logical ↵Amit Kapila2024-08-01
| | | | | | | | | | | | | | replication. When performing a cross-partition update in the apply worker, it needlessly scans the old partition twice, resulting in noticeable overhead. This commit optimizes it by removing the redundant table scan. Author: Hou Zhijie Reviewed-by: Hayato Kuroda, Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB571623E39984D94CBB5341D994AB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Allow altering of two_phase option of a SUBSCRIPTION.Amit Kapila2024-07-24
| | | | | | | | | | | | | | | | | | | | The two_phase option is controlled by both the publisher (as a slot option) and the subscriber (as a subscription option), so the slot option must also be modified. Changing the 'two_phase' option for a subscription from 'true' to 'false' is permitted only when there are no pending prepared transactions corresponding to that subscription. Otherwise, the changes of already prepared transactions can be replicated again along with their corresponding commit leading to duplicate data or errors. To avoid data loss, the 'two_phase' option for a subscription can only be changed from 'false' to 'true' once the initial data synchronization is completed. Therefore this is performed later by the logical replication worker. Author: Hayato Kuroda, Ajin Cherian, Amit Kapila Reviewed-by: Peter Smith, Hou Zhijie, Amit Kapila, Vitaly Davydov, Vignesh C Discussion: https://postgr.es/m/8fab8-65d74c80-1-2f28e880@39088166
* Improve logical replication connection-failure messages.Tom Lane2024-07-11
| | | | | | | | | | These messages mostly said "could not connect to the publisher: %s" which is lacking context. Add some verbiage to indicate which subscription or worker process is failing. Nisha Moond Discussion: https://postgr.es/m/CABdArM7q1=zqL++cYd0hVMg3u_tc0S=0Of=Um-KvDhLony0cSg@mail.gmail.com
* Convert some extern variables to staticPeter Eisentraut2024-07-02
| | | | | | | | These probably should have been static all along, it was only forgotten out of sloppiness. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
* Remove unused #include's from backend .c filesPeter Eisentraut2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
* Enhance libpqrcv APIs to support slot synchronization.Amit Kapila2024-02-05
| | | | | | | | | | | | | | | | | This patch provides support for regular (non-replication) connections in libpqrcv_connect(). This can be used to execute SQL statements on the primary server without starting a walsender. A new API libpqrcv_get_dbname_from_conninfo() is also added to extract the database name from the given connection-info. Note that this patch doesn't change any existing functionality but later patches implementing the slot synchronization will use this functionality to connect to the primary server to fetch required slot information. Author: Shveta Malik, Hou Zhijie, Ajin Cherian Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda Hayato, Amit Kapila Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
* Add a failover option to subscriptions.Amit Kapila2024-01-30
| | | | | | | | | | | | | | | | | | | | | This commit introduces a new subscription option named 'failover', which provides users with the ability to set the failover property of the replication slot on the publisher when creating or altering a subscription. This uses the replication commands introduced by commit 7329240437 to enable the failover option for a logical replication slot. If the failover option is set to true, the associated replication slots (i.e. the main slot and the table sync slots) in the upstream database are enabled to be synchronized to the standbys. Note that the capability to sync the replication slots will be added in subsequent commits. Thanks to Masahiko Sawada for the design inputs. Author: Shveta Malik, Hou Zhijie, Ajin Cherian Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda Hayato, Amit Kapila Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
* Fix some typosMichael Paquier2024-01-22
| | | | | Author: Yongtao Huang Discussion: https://postgr.es/m/CAOe1Go1F99o5JsphtXdDC5bxm7AzetU8q3AxLh4AAVGKu1AzEQ@mail.gmail.com
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Retire MemoryContextResetAndDeleteChildren() macro.Nathan Bossart2023-11-15
| | | | | | | | | | | | | | | | | As of commit eaa5808e8e, MemoryContextResetAndDeleteChildren() is just a backwards compatibility macro for MemoryContextReset(). Now that some time has passed, this macro seems more likely to create confusion. This commit removes the macro and replaces all remaining uses with calls to MemoryContextReset(). Any third-party code that use this macro will need to be adjusted to call MemoryContextReset() instead. Since the two have behaved the same way since v9.5, such adjustments won't produce any behavior changes for all currently-supported versions of PostgreSQL. Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13
* Make use of initReadOnlyStringInfo() in more placesDavid Rowley2023-11-07
| | | | | | | | | | | | | | | | | | | | | | f0efa5aec introduced the concept of "read-only" StringInfos which makes use of an existing, possibly not NUL terminated, buffer. Here we adjust two places that make use of StringInfos to receive data to avoid using appendBinaryStringInfo() in cases where a NUL termination character is not required. This saves a possible palloc() and saves having to needlessly memcpy() from one buffer to another. Here we adjust two places which were using appendBinaryStringInfo(). Neither of these cases seem particularly performance-critical. In the case of XLogWalRcvProcessMsg(), the appendBinaryStringInfo() was only appending 24 bytes. The change made here does mean that we can get rid of the incoming_message global variable and make that local instead. The apply_spooled_messages() case applies in logical decoding when applying (possibly large) changes which have been serialized to a file. Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAApHDvoxYUDHwqPf-ShvchsERf1RzmkGoLwg63JNvHCkDCuyKQ@mail.gmail.com
* Add trailing commas to enum definitionsPeter Eisentraut2023-10-26
| | | | | | | | | | | | | | | | | | | | Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
* Introduce the concept of read-only StringInfosDavid Rowley2023-10-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were various places in our codebase which conjured up a StringInfo by manually assigning the StringInfo fields and setting the data field to point to some existing buffer. There wasn't much consistency here as to what fields like maxlen got set to and in one location we didn't correctly ensure that the buffer was correctly NUL terminated at len bytes, as per what was documented as required in stringinfo.h Here we introduce 2 new functions to initialize StringInfos. One allows callers to initialize a StringInfo passing along a buffer that is already allocated by palloc. Here the StringInfo code uses this buffer directly rather than doing any memcpying into a new allocation. Having this as a function allows us to verify the buffer is correctly NUL terminated. StringInfos initialized this way can be appended to and reset just like any other normal StringInfo. The other new initialization function also accepts an existing buffer, but the given buffer does not need to be a pointer to a palloc'd chunk. This buffer could be a pointer pointing partway into some palloc'd chunk or may not even be palloc'd at all. StringInfos initialized this way are deemed as "read-only". This means that it's not possible to append to them or reset them. For the latter of the two new initialization functions mentioned above, we relax the requirement that the data buffer must be NUL terminated. Relaxing this requirement is convenient in a few places as it can save us from having to allocate an entire new buffer just to add the NUL terminator or save us from having to temporarily add a NUL only to have to put the original char back again later. Incompatibility note: Here we also forego adding the NUL in a few places where it does not seem to be required. These locations are passing the given StringInfo into a type's receive function. It does not seem like any of our built-in receive functions require this, but perhaps there's some UDT out there in the wild which does require this. It is likely worthy of a mention in the release notes that a UDT's receive function mustn't rely on the input StringInfo being NUL terminated. Author: David Rowley Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvorfO3iBZ%3DxpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ%40mail.gmail.com
* Restart the apply worker if the privileges have been revoked.Amit Kapila2023-10-17
| | | | | | | | | | | Restart the apply worker if the subscription owner's superuser privileges have been revoked. This is required so that the subscription connection string gets revalidated and use the password option to connect to the publisher for non-superusers, if required. Author: Vignesh C Reviewed-by: Amit Kapila Discussion: http://postgr.es/m/CALDaNm2Dxmhq08nr4P6G+24QvdBo_GAVyZ_Q1TcGYK+8NHs9xw@mail.gmail.com
* Simplify the logical worker type checks by using the switch on worker type.Amit Kapila2023-08-22
| | | | | | | | | | The current code uses if/else statements at various places to take worker specific actions. Change those to use the switch on worker type added by commit 2a8b40e368. This makes code easier to read and understand. Author: Peter Smith Reviewed-by: Amit Kapila, Hou Zhijie Discussion: http://postgr.es/m/CAHut+PttPSuP0yoZ=9zLDXKqTJ=d0bhxwKaEaNcaym1XqcvDEg@mail.gmail.com
* Simplify some of the logical replication worker-type checks.Amit Kapila2023-08-04
| | | | | | Author: Peter Smith Reviewed-by: Hou Zhijie Discussion: http://postgr.es/m/CAHut+Pv-xkEpuPzbEJ=ZSi7Hp2RoGJf=VA-uDRxLi1KHSneFjg@mail.gmail.com
* Refactor to split Apply and Tablesync Workers code.Amit Kapila2023-08-03
| | | | | | | | | | | | | | | | | | | | | Both apply and tablesync workers were using ApplyWorkerMain() as entry point. As the name implies, ApplyWorkerMain() should be considered as the main function for apply workers. Tablesync worker's path was hidden and does not have enough in common to share the same main function with apply worker. Also, most of the code shared by both worker types is already combined in LogicalRepApplyLoop(). There is no need to combine the rest in ApplyWorkerMain() anymore. This patch introduces TablesyncWorkerMain() as a new entry point for tablesync workers. This aims to increase code readability and would help with future improvements like the reuse of tablesync workers in the initial synchronization. Author: Melih Mutlu based on suggestions by Melanie Plageman Reviewed-by: Peter Smith, Kuroda Hayato, Amit Kapila Discussion: http://postgr.es/m/CAGPVpCTq=rUDd4JUdaRc1XUWf4BrH2gdSNf3rtOMUGj9rPpfzQ@mail.gmail.com
* Remove unnecessary checks for indexes for REPLICA IDENTITY FULL tables.Masahiko Sawada2023-07-25
| | | | | | | | | | | | | | | | | | | | | | Previously, when selecting an usable index for update/delete for the REPLICA IDENTITY FULL table, in IsIndexOnlyExpression(), we used to check if all index fields are not expressions. However, it was not necessary, because it is enough to check if only the leftmost index field is not an expression (and references the remote table column) and this check has already been done by RemoteRelContainsLeftMostColumnOnIdx(). This commit removes IsIndexOnlyExpression() and RemoteRelContainsLeftMostColumnOnIdx() and all checks for usable indexes for REPLICA IDENTITY FULL tables are now performed by IsIndexUsableForReplicaIdentityFull(). Backpatch this to remain the code consistent. Reported-by: Peter Smith Reviewed-by: Amit Kapila, Önder Kalacı Discussion: https://postgr.es/m/CAHut%2BPsGRE5WSsY0jcLHJEoA17MrbP9yy8FxdjC_ZOAACxbt%2BQ%40mail.gmail.com Backpatch-through: 16
* Fix the display of UNKNOWN message type in apply worker.Amit Kapila2023-07-25
| | | | | | | | | | | | | | We include the message type while displaying an error context in the apply worker. Now, while retrieving the message type string if the message type is unknown we throw an error that will hide the original error. So, instead, we need to simply return the string indicating an unknown message type. Reported-by: Ashutosh Bapat Author: Euler Taveira, Amit Kapila Reviewed-by: Ashutosh Bapat Backpatch-through: 15 Discussion: https://postgr.es/m/CAExHW5suAEDW-mBZt_qu4RVxWZ1vL54-L+ci2zreYWebpzxYsA@mail.gmail.com
* Fix untranslatable log message assemblyPeter Eisentraut2023-07-13
| | | | | | | | | We can't inject the name of the logical replication worker into a log message like that. But for these messages we don't really need the precision of knowing what kind of worker it was, so just write "logical replication worker" and keep the message in one piece. Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPt1xwATviPGjjtJy5L631SGf3qjV9XUCmxLu16cHamfgg%40mail.gmail.com
* Reload configuration more frequently in apply worker.Amit Kapila2023-06-07
| | | | | | | | | | | | | | | | The apply worker was not reloading the configuration while processing messages if there is a continuous flow of messages from upstream. It was also not reloading the configuration if there is a change in the configuration after it has waited for the message and before receiving the new replication message. This can lead to failure in tests because we expect that after reload, the behavior of apply worker to respect the changed GUCs. We found this while analyzing a rare buildfarm failure. Author: Hou Zhijie Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB5716AF9079CC0755CD015322947E9@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Pre-beta mechanical code beautification.Tom Lane2023-05-19
| | | | | | | | | | | | | | | Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
* Fix misbehavior of EvalPlanQual checks with multiple result relations.Tom Lane2023-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea of EvalPlanQual is that we replace the query's scan of the result relation with a single injected tuple, and see if we get a tuple out, thereby implying that the injected tuple still passes the query quals. (In join cases, other relations in the query are still scanned normally.) This logic was not updated when commit 86dc90056 made it possible for a single DML query plan to have multiple result relations, when the query target relation has inheritance or partition children. We replaced the output for the current result relation successfully, but other result relations were still scanned normally; thus, if any other result relation contained a tuple satisfying the quals, we'd think the EPQ check passed, even if it did not pass for the injected tuple itself. This would lead to update or delete actions getting performed when they should have been skipped due to a conflicting concurrent update in READ COMMITTED isolation mode. Fix by blocking all sibling result relations from emitting tuples during an EvalPlanQual recheck. In the back branches, the fix is complicated a bit by the need to not change the size of struct EPQState (else we'd have ABI-breaking changes in offsets in struct ModifyTableState). Like the back-patches of 3f7836ff6 and 4b3e37993, add a separately palloc'd struct to avoid that. The logic is the same as in HEAD otherwise. This is only a live bug back to v14 where 86dc90056 came in. However, I chose to back-patch the test cases further, on the grounds that this whole area is none too well tested. I skipped doing so in v11 though because none of the test applied cleanly, and it didn't quite seem worth extra work for a branch with only six months to live. Per report from Ante Krešić (via Aleksander Alekseev) Discussion: https://postgr.es/m/CAJ7c6TMBTN3rcz4=AjYhLPD_w3FFT0Wq_C15jxCDn8U4tZnH1g@mail.gmail.com
* Fix assertion failure in apply worker.Amit Kapila2023-05-03
| | | | | | | | | | | | | | During exit, the logical replication apply worker tries to release session level locks, if any. However, if the apply worker exits due to an error before its connection is initialized, trying to release locks can lead to assertion failure. The locks will be acquired once the worker is initialized, so we don't need to release them till the worker initialization is complete. Reported-by: Alexander Lakhin Author: Hou Zhijie based on inputs from Sawada Masahiko and Amit Kapila Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/2185d65f-5aae-3efa-c48f-fb42b173ef5c@gmail.com
* Use elog to report unexpected action in handle_streamed_transaction().Masahiko Sawada2023-04-24
| | | | | | | | An oversight in commit 216a784829c. Author: Masahiko Sawada Reviewed-by: Kyotaro Horiguchi, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDDbM8_HJt-nMCvcjTK8K9hPzXWqJj7pyaUvR4mm_NrSg@mail.gmail.com
* Restart the apply worker if the 'password_required' option is changed.Amit Kapila2023-04-20
| | | | | | | | | | | | The apply worker is restarted if any subscription option that affects the remote connection was changed. In commit c3afe8cf5a, we added the option 'password_required' which can affect the remote connection, so we should restart the worker if it was changed. Author: Amit Kapila Reviewed-by: Robert Haas Discussion: https://postgr.es/m/CAA4eK1+z9UDFEynXLsWeMMuUZc1iQkRwj2HNDtxUHTPo-u1F4A@mail.gmail.com Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53@enterprisedb.com
* Add a run_as_owner option to subscriptions.Robert Haas2023-04-04
| | | | | | | | | | | | | | | | | | This option is normally false, but can be set to true to obtain the legacy behavior where the subscription runs with the permissions of the subscription owner rather than the permissions of the table owner. The advantages of this mode are (1) it doesn't require that the subscription owner have permission to SET ROLE to each table owner and (2) since no role switching occurs, the SECURITY_RESTRICTED_OPERATION restrictions do not apply. On the downside, it allows any table owner to easily usurp the privileges of the subscription owner - basically, to take over their account. Because that's generally quite undesirable, we don't make this mode the default, but we do make it available, just in case the new behavior causes too many problems for someone. Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
* Perform logical replication actions as the table owner.Robert Haas2023-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up until now, logical replication actions have been performed as the subscription owner, who will generally be a superuser. Commit cec57b1a0fbcd3833086ba686897c5883e0a2afc documented hazards associated with that situation, namely, that any user who owns a table on the subscriber side could assume the privileges of the subscription owner by attaching a trigger, expression index, or some other kind of executable code to it. As a remedy, it suggested not creating configurations where users who are not fully trusted own tables on the subscriber. Although that will work, it basically precludes using logical replication in the way that people typically want to use it, namely, to replicate a database from one node to another without necessarily having any restrictions on which database users can own tables. So, instead, change logical replication to execute INSERT, UPDATE, DELETE, and TRUNCATE operations as the table owner when they are replicated. Since this involves switching the active user frequently within a session that is authenticated as the subscription user, also impose SECURITY_RESTRICTED_OPERATION restrictions on logical replication code. As an exception, if the table owner can SET ROLE to the subscription owner, these restrictions have no security value, so don't impose them in that case. Subscription owners are now required to have the ability to SET ROLE to every role that owns a table that the subscription is replicating. If they don't, replication will fail. Superusers, who normally own subscriptions, satisfy this property by default. Non-superusers users who own subscriptions will need to be granted the roles that own relevant tables. Patch by me, reviewed (but not necessarily in its entirety) by Jelte Fennema, Jeff Davis, and Noah Misch. Discussion: http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
* Fix possible logical replication crash.Robert Haas2023-04-03
| | | | | | | | | | | Commit c3afe8cf5a1e465bd71e48e4bc717f5bfdc7a7d6 added a new password_required option but forgot that you need database access to check whether an arbitrary role ID is a superuser. Report and patch by Hou Zhijie. I added a comment. Thanks to Alexander Lakhin for devising a way to reproduce the crash. Discussion: http://postgr.es/m/OS0PR01MB5716BFD7EC44284C89F40808948F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Add new predefined role pg_create_subscription.Robert Haas2023-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This role can be granted to non-superusers to allow them to issue CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE permissions on the database in which the subscription is to be created. Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP, now require only that the role performing the operation own the subscription, or inherit the privileges of the owner. However, to use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO, you also need CREATE permission on the database. This is similar to what we do for schemas. To change the owner of a schema, you must also have permission to SET ROLE to the new owner, similar to what we do for other object types. Non-superusers are required to specify a password for authentication and the remote side must use the password, similar to what is required for postgres_fdw and dblink. A superuser who wants a non-superuser to own a subscription that does not rely on password authentication may set the new password_required=false property on that subscription. A non-superuser may not set password_required=false and may not modify a subscription that already has password_required=false. This new password_required subscription property works much like the eponymous postgres_fdw property. In both cases, the actual semantics are that a password is not required if either (1) the property is set to false or (2) the relevant user is the superuser. Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger, and Stephen Frost (but some of those people did not fully endorse all of the decisions that the patch makes). Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
* Allow the use of indexes other than PK and REPLICA IDENTITY on the subscriber.Amit Kapila2023-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using REPLICA IDENTITY FULL on the publisher can lead to a full table scan per tuple change on the subscription when REPLICA IDENTITY or PK index is not available. This makes REPLICA IDENTITY FULL impractical to use apart from some small number of use cases. This patch allows using indexes other than PRIMARY KEY or REPLICA IDENTITY on the subscriber during apply of update/delete. The index that can be used must be a btree index, not a partial index, and it must have at least one column reference (i.e. cannot consist of only expressions). We can uplift these restrictions in the future. There is no smart mechanism to pick the index. If there is more than one index that satisfies these requirements, we just pick the first one. We discussed using some of the optimizer's low-level APIs for this but ruled it out as that can be a maintenance burden in the long run. This patch improves the performance in the vast majority of cases and the improvement is proportional to the amount of data in the table. However, there could be some regression in a small number of cases where the indexes have a lot of duplicate and dead rows. It was discussed that those are mostly impractical cases but we can provide a table or subscription level option to disable this feature if required. Author: Onder Kalaci, Amit Kapila Reviewed-by: Peter Smith, Shi yu, Hou Zhijie, Vignesh C, Kuroda Hayato, Amit Kapila Discussion: https://postgr.es/m/CACawEhVLqmAAyPXdHEPv1ssU2c=dqOniiGz7G73HfyS7+nGV4w@mail.gmail.com