<!-- doc/src/sgml/replication-origins.sgml -->
<chapter id="replication-origins">
<title>Replication Progress Tracking</title>
<indexterm zone="replication-origins">
<primary>Replication Progress Tracking</primary>
</indexterm>
<indexterm zone="replication-origins">
<primary>Replication Origins</primary>
</indexterm>
<para>
Replication origins are intended to make it easier to implement
logical replication solutions on top
of <xref linkend="logicaldecoding">. They provide a solution to two
common problems:
<itemizedlist>
<listitem><para>How to safely keep track of replication progress</para></listitem>
<listitem><para>How to change replication behavior based on the
origin of a row, e.g. to avoid loops in bi-directional replication
setups</para></listitem>
</itemizedlist>
</para>
<para>
Replication origins consist of a name and an <type>oid</type>. The name,
which is what should be used to refer to the origin across systems, is
free-form <type>text</type>. It should be used in a way that makes conflicts
between replication origins created by different replication solutions
unlikely; e.g. by prefixing the replication solution's name to it.
The <type>oid</type> is used only to avoid having to store the longer name
in situations where space efficiency is important. It should never be shared
between systems.
</para>
<para>
Replication origins can be created using
<link linkend="pg-replication-origin-create"><function>pg_replication_origin_create()</function></link>,
dropped using
<link linkend="pg-replication-origin-drop"><function>pg_replication_origin_drop()</function></link>,
and inspected in the
<link linkend="catalog-pg-replication-origin"><structname>pg_replication_origin</structname></link>
catalog.
</para>
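<para>
As a minimal sketch of those steps, an origin might be created, inspected
and dropped as shown here. The origin name <literal>myrepl_node_a</literal>
is a hypothetical example that follows the prefixing convention suggested
above; it is not a name defined by <productname>PostgreSQL</productname>.
<programlisting>
-- Create an origin; the function returns the internal oid assigned to it.
SELECT pg_replication_origin_create('myrepl_node_a');

-- List the origins known to this cluster.
SELECT roident, roname FROM pg_replication_origin;

-- Drop the origin, which also discards its recorded replay progress.
SELECT pg_replication_origin_drop('myrepl_node_a');
</programlisting>
</para>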
<para>
When replicating from one system to another (regardless of whether
those two are in the same cluster, or even the same database), one
nontrivial part of building a replication solution is keeping track of
replay progress in a safe manner. When the applying process, or the whole
cluster, dies, it must be possible to find out up to where data has
successfully been replicated. Naive solutions, such as updating a row in
a table for every replayed transaction, suffer from problems such as runtime
overhead and table bloat.
</para>
<para>
Using the replication origin infrastructure, a session can be
marked as replaying from a remote node (using the
<link linkend="pg-replication-origin-session-setup"><function>pg_replication_origin_session_setup()</function></link>
function). Additionally, the <acronym>LSN</acronym> and commit
timestamp of every source transaction can be configured on a
per-transaction basis using
<link linkend="pg-replication-origin-xact-setup"><function>pg_replication_origin_xact_setup()</function></link>.
If that is done, replication progress will persist in a crash-safe
manner. Replay progress for all replication origins can be seen in the
<link linkend="catalog-pg-replication-origin-status">
<structname>pg_replication_origin_status</structname>
</link> view. The progress of an individual origin, needed e.g. when resuming
replication, can be obtained using
<link linkend="pg-replication-origin-progress"><function>pg_replication_origin_progress()</function></link>
for any origin or
<link linkend="pg-replication-origin-session-progress"><function>pg_replication_origin_session_progress()</function></link>
for the origin configured in the current session.
</para>
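<para>
The following sketch shows how an apply process might combine these
functions. The origin name <literal>myrepl_node_a</literal>, the table
<literal>replicated_table</literal>, and the <acronym>LSN</acronym> and
timestamp values are hypothetical placeholders; a real apply process would
take the <acronym>LSN</acronym> and commit timestamp from the remote
transaction it is replaying.
<programlisting>
-- Assumes the origin already exists (see pg_replication_origin_create()).
-- Mark this session as replaying changes from that origin.
SELECT pg_replication_origin_session_setup('myrepl_node_a');

BEGIN;
-- Record the remote commit LSN and commit timestamp (placeholder values).
SELECT pg_replication_origin_xact_setup('0/16B3748', '2015-04-30 12:00:00+00');
-- Replay the changes of the remote transaction (hypothetical table).
INSERT INTO replicated_table (id, payload) VALUES (1, 'replayed row');
COMMIT;

-- Replay progress of all origins.
SELECT local_id, external_id, remote_lsn, local_lsn
  FROM pg_replication_origin_status;

-- Progress of one named origin, and of the origin set up in this session.
-- Passing true asks that the corresponding local transaction be flushed first.
SELECT pg_replication_origin_progress('myrepl_node_a', true);
SELECT pg_replication_origin_session_progress(true);

-- Detach the session from the origin once replay is finished.
SELECT pg_replication_origin_session_reset();
</programlisting>
After a crash, the value returned by
<function>pg_replication_origin_progress()</function> is where such an apply
process would resume requesting changes from the remote node.
</para>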
<para>
In replication topologies more complex than replication from exactly one
system to one other, another problem is that it can be hard to avoid
replicating replayed rows again. That can lead both to cycles in the
replication and to inefficiencies. Replication origins provide an optional
mechanism to recognize and prevent that. When configured using the functions
referenced in the previous paragraph, every change and transaction passed to
output plugin callbacks (see <xref linkend="logicaldecoding-output-plugin">)
generated by the session is tagged with the replication origin of the
generating session. This allows them to be treated differently in the output
plugin, e.g. ignoring all but locally originating rows. Additionally,
the <link linkend="logicaldecoding-output-plugin-filter-origin">
<function>filter_by_origin_cb</function></link> callback can be used
to filter the logical decoding change stream based on the
source. While less flexible, filtering via that callback is
considerably more efficient.
</para>
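<para>
As an illustration of origin-based filtering, the sketch below uses the
<filename>test_decoding</filename> example output plugin, whose
<literal>only-local</literal> option skips changes that carry a replication
origin. It assumes <varname>wal_level</varname> is set to
<literal>logical</literal> and that the plugin is installed; the slot name
<literal>example_slot</literal> is hypothetical, and other output plugins
may expose filtering differently or not at all.
<programlisting>
-- Create a logical replication slot that decodes with test_decoding.
SELECT * FROM pg_create_logical_replication_slot('example_slot', 'test_decoding');

-- ... changes are made here, some by ordinary sessions and some by
-- sessions that called pg_replication_origin_session_setup() ...

-- Return only changes that did not come from a session with an origin set up.
SELECT data
  FROM pg_logical_slot_get_changes('example_slot', NULL, NULL, 'only-local', '1');

-- Remove the slot so that it does not hold back WAL removal.
SELECT pg_drop_replication_slot('example_slot');
</programlisting>
</para>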
</chapter>