author: Amit Kapila <akapila@postgresql.org> 2025-02-19 09:29:50 +0530
committer: Amit Kapila <akapila@postgresql.org> 2025-02-19 09:29:50 +0530
commit: ac0e33136abc4668c9b08e1ba7db69afe1e0e2c3
tree: 6bcb9cfe3249e89dada703f6df423bee4cc5b07e /doc/src
parent: b464e51ab32fbf09cf5d9c911a8e26f491ad1f44
Invalidate inactive replication slots.
This commit introduces the idle_replication_slot_timeout GUC, which allows inactive slots to be invalidated at the time of checkpoint. Because checkpoints happen at checkpoint_timeout intervals, there can be some lag between when the idle_replication_slot_timeout was exceeded and when the slot invalidation is triggered at the next checkpoint. To avoid such lags, users can force a checkpoint to promptly invalidate inactive slots.

Note that the idle timeout invalidation mechanism is not applicable for slots that do not reserve WAL or for slots on the standby server that are synced from the primary server (i.e., standby slots having 'synced' field 'true'). Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes.

Slots can become inactive for a long period if a subscriber is down due to a system error or is inaccessible because of network issues. If such a situation persists, it might be more practical to recreate the subscriber rather than attempt to recover the node and wait for it to catch up, which could be time-consuming. In addition, external tools that create replication slots (e.g., for migrations or upgrades) may fail to remove them if an error occurs, leaving behind unused slots that take up space and resources. Manually cleaning them up can be tedious and error-prone, and without intervention, these lingering slots can cause unnecessary WAL retention and system bloat.

As the duration of idle_replication_slot_timeout is in minutes, any test using it would be time-consuming. We are planning to commit a follow-up patch for tests by using the injection point framework.
Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
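
The eligibility rules described in the commit message can be summarized as a small model. The following Python sketch is illustrative only: it is not the server implementation, which runs in C during a checkpoint. The names `Slot`, `reserves_wal`, and `eligible_for_idle_timeout` are hypothetical, and the strict "longer than" comparison is taken from the documentation wording rather than from the server code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Slot:
    """Hypothetical stand-in for one row of pg_replication_slots."""
    name: str
    reserves_wal: bool                  # assumption: True when the slot holds back WAL
    synced: bool                        # mirrors pg_replication_slots.synced
    inactive_since: Optional[datetime]  # None while the slot is in use


def eligible_for_idle_timeout(slot: Slot, timeout_min: int, now: datetime) -> bool:
    """Return True if a checkpoint at `now` would consider the slot idle too long."""
    if timeout_min == 0:
        # Zero (the default) disables the idle timeout mechanism.
        return False
    if not slot.reserves_wal:
        # Slots that do not reserve WAL are not subject to the timeout.
        return False
    if slot.synced:
        # Slots synced from the primary onto a standby are exempt.
        return False
    if slot.inactive_since is None:
        # The slot is currently active, so it has no idle duration.
        return False
    # Idle duration is measured from inactive_since; the timeout is in minutes.
    return now - slot.inactive_since > timedelta(minutes=timeout_min)
```

Because the check only runs at checkpoint time, a slot that satisfies this predicate stays valid until the next checkpoint, which is why forcing a checkpoint shortens the lag.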
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/config.sgml | 40
-rw-r--r-- doc/src/sgml/logical-replication.sgml | 5
-rw-r--r-- doc/src/sgml/system-views.sgml | 7
3 files changed, 52 insertions, 0 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 336630ce417..9eedcf6f0f4 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4429,6 +4429,46 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
</listitem>
</varlistentry>
+ <varlistentry id="guc-idle-replication-slot-timeout" xreflabel="idle_replication_slot_timeout">
+ <term><varname>idle_replication_slot_timeout</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>idle_replication_slot_timeout</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Invalidate replication slots that have remained idle longer than this
+ duration. If this value is specified without units, it is taken as
+ minutes. A value of zero (the default) disables the idle timeout
+ invalidation mechanism. This parameter can only be set in the
+ <filename>postgresql.conf</filename> file or on the server command
+ line.
+ </para>
+
+ <para>
+ Slot invalidation due to idle timeout occurs during checkpoint.
+ Because checkpoints happen at <varname>checkpoint_timeout</varname>
+ intervals, there can be some lag between when the
+ <varname>idle_replication_slot_timeout</varname> was exceeded and when
+ the slot invalidation is triggered at the next checkpoint.
+ To avoid such lags, users can force a checkpoint to promptly invalidate
+ inactive slots. The duration of slot inactivity is calculated using the
+ slot's <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>inactive_since</structfield>
+ value.
+ </para>
+
+ <para>
+ Note that the idle timeout invalidation mechanism is not applicable
+ for slots that do not reserve WAL or for slots on the standby server
+ that are being synced from the primary server (i.e., standby slots
+ having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value <literal>true</literal>). Synced slots are always considered to
+ be inactive because they don't perform logical decoding to produce
+ changes.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
<term><varname>wal_sender_timeout</varname> (<type>integer</type>)
<indexterm>
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 613abcd28b7..3d18e507bbc 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -2391,6 +2391,11 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER
</para>
<para>
+ Logical replication slots are also affected by
+ <link linkend="guc-idle-replication-slot-timeout"><varname>idle_replication_slot_timeout</varname></link>.
+ </para>
+
+ <para>
<link linkend="guc-max-wal-senders"><varname>max_wal_senders</varname></link>
should be set to at least the same as
<varname>max_replication_slots</varname>, plus the number of physical
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index ad2903d5ac7..3f5a306247e 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2619,6 +2619,13 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
perform logical decoding. It is set only for logical slots.
</para>
</listitem>
+ <listitem>
+ <para>
+ <literal>idle_timeout</literal> means that the slot has remained
+ idle longer than the configured
+ <xref linkend="guc-idle-replication-slot-timeout"/> duration.
+ </para>
+ </listitem>
</itemizedlist>
</para></entry>
</row>