doc/src/sgml/custom-scan.sgml



<!-- doc/src/sgml/custom-scan.sgml -->

<chapter id="custom-scan">
 <title>Writing A Custom Scan Provider</title>

 <indexterm zone="custom-scan">
  <primary>custom scan provider</primary>
  <secondary>handler for</secondary>
 </indexterm>

 <para>
  <productname>PostgreSQL</> supports a set of experimental facilities which
  are intended to allow extension modules to add new scan types to the system.
  Unlike a <link linkend="fdwhandler">foreign data wrapper</>, which is only
  responsible for knowing how to scan its own foreign tables, a custom scan
  provider can provide an alternative method of scanning any relation in the
  system.  Typically, the motivation for writing a custom scan provider will
  be to allow the use of some optimization not supported by the core
  system, such as caching or some form of hardware acceleration.  This chapter
  outlines how to write a new custom scan provider.
 </para>

 <para>
  Implementing a new type of custom scan is a three-step process.  First,
  during planning, it is necessary to generate access paths representing a
  scan using the proposed strategy.  Second, if one of those access paths
  is selected by the planner as the optimal strategy for scanning a
  particular relation, the access path must be converted to a plan.
  Finally, it must be possible to execute the plan and generate the same
  results that would have been generated for any other access path targeting
  the same relation.
 </para>

 <sect1 id="custom-scan-path">
  <title>Implementing Custom Paths</title>

  <para>
    A custom scan provider will typically add paths by setting the following
    hook, which is called after the core code has generated what it believes
    to be the complete and correct set of access paths for the relation.
<programlisting>
typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                            RelOptInfo *rel,
                                            Index rti,
                                            RangeTblEntry *rte);
extern PGDLLIMPORT set_rel_pathlist_hook_type set_rel_pathlist_hook;
</programlisting>
  </para>
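
  <para>
    Because other modules may have set the same hook already, a provider
    usually saves the previous value and calls it from its own function.
    The following is a minimal, self-contained sketch of that pattern, not
    code taken from the server.  The opaque type declarations and the
    definition of <varname>set_rel_pathlist_hook</> are stand-ins so that the
    fragment compiles on its own; <function>my_set_rel_pathlist</>,
    <varname>prev_set_rel_pathlist_hook</>, and
    <varname>my_paths_considered</> are hypothetical names.
<programlisting><![CDATA[
#include <stddef.h>

/* Stand-ins for the planner types; a real module would instead include
 * the PostgreSQL headers that declare them. */
typedef struct PlannerInfo PlannerInfo;
typedef struct RelOptInfo RelOptInfo;
typedef struct RangeTblEntry RangeTblEntry;
typedef unsigned int Index;

typedef void (*set_rel_pathlist_hook_type) (PlannerInfo *root,
                                            RelOptInfo *rel,
                                            Index rti,
                                            RangeTblEntry *rte);

/* Defined by the server in a real build; defined here so the sketch links. */
set_rel_pathlist_hook_type set_rel_pathlist_hook = NULL;

/* Hypothetical provider state. */
static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;
int my_paths_considered = 0;

static void
my_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                    Index rti, RangeTblEntry *rte)
{
    /* Let any previously installed hook run first. */
    if (prev_set_rel_pathlist_hook)
        prev_set_rel_pathlist_hook(root, rel, rti, rte);

    /* A real provider would build a CustomPath here and pass it to add_path. */
    my_paths_considered++;
}

/* Module load entry point: remember the old hook, then install ours. */
void
_PG_init(void)
{
    prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
    set_rel_pathlist_hook = my_set_rel_pathlist;
}
]]></programlisting>
    Calling the previous hook first keeps several providers composable; whether
    to call it before or after generating paths is a design choice of this
    sketch.
  </para>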

  <para>
    Although this hook function can be used to examine, modify, or remove
    paths generated by the core system, a custom scan provider will typically
    confine itself to generating <structname>CustomPath</> objects and adding
    them to <literal>rel</> using <function>add_path</>.  The custom scan
    provider is responsible for initializing the <structname>CustomPath</>
    object, which is declared like this:
<programlisting>
typedef struct CustomPath
{
    Path      path;
    uint32    flags;
    List     *custom_private;
    const CustomPathMethods *methods;
} CustomPath;
</programlisting>
  </para>

  <para>
    <structfield>path</> must be initialized as for any other path, including
    the row-count estimate, start and total cost, and sort ordering provided
    by this path.  <structfield>flags</> is a bitmask, which should include
    <literal>CUSTOMPATH_SUPPORT_BACKWARD_SCAN</> if the custom path can support
    a backward scan and <literal>CUSTOMPATH_SUPPORT_MARK_RESTORE</> if it
    can support mark and restore.  Both capabilities are optional.
    <structfield>custom_private</> can be used to store the custom path's
    private data.  Private data should be stored in a form that can be handled
    by <literal>nodeToString</>, so that debugging routines which attempt to
    print the custom path will work as designed.  <structfield>methods</> must
    point to a (usually statically allocated) object implementing the required
    custom path methods, of which there are currently only two, as further
    detailed below.
  </para>
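
  <para>
    As an illustration of how the <structfield>flags</> bitmask might be
    assembled and tested, consider the sketch below.  The numeric values given
    to the two macros are assumptions made so that the fragment compiles on
    its own; a real module takes the macros from the PostgreSQL headers.
    <function>my_path_flags</> and <function>my_flags_support</> are
    hypothetical helpers.
<programlisting><![CDATA[
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values for this sketch; a real module takes these macros
 * from the PostgreSQL headers instead of defining them. */
#define CUSTOMPATH_SUPPORT_BACKWARD_SCAN 0x0001
#define CUSTOMPATH_SUPPORT_MARK_RESTORE  0x0002

/* Hypothetical helper: build the flags word from two capabilities. */
uint32_t
my_path_flags(bool can_scan_backward, bool can_mark_restore)
{
    uint32_t flags = 0;

    if (can_scan_backward)
        flags |= CUSTOMPATH_SUPPORT_BACKWARD_SCAN;
    if (can_mark_restore)
        flags |= CUSTOMPATH_SUPPORT_MARK_RESTORE;
    return flags;
}

/* Hypothetical helper: test whether one capability bit is set. */
bool
my_flags_support(uint32_t flags, uint32_t capability)
{
    return (flags & capability) != 0;
}
]]></programlisting>
    A path that supports neither capability simply leaves
    <structfield>flags</> at zero.
  </para>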

  <para>
   A custom scan provider can also add join paths; in this case, the scan
   must produce the same output as would normally be produced by the join
   it replaces.  To do this, the custom scan provider should set the following
   hook.  This hook may be invoked repeatedly for the same pair of relations,
   with different combinations of inner and outer relations; it is the
   responsibility of the hook to minimize duplicated work.
<programlisting>
typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
                                             RelOptInfo *joinrel,
                                             RelOptInfo *outerrel,
                                             RelOptInfo *innerrel,
                                             List *restrictlist,
                                             JoinType jointype,
                                             SpecialJoinInfo *sjinfo,
                                             SemiAntiJoinFactors *semifactors,
                                             Relids param_source_rels,
                                             Relids extra_lateral_rels);
extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
</programlisting>
  </para>

  <sect2 id="custom-scan-path-callbacks">
  <title>Custom Path Callbacks</title>

  <para>
<programlisting>
Plan *(*PlanCustomPath) (PlannerInfo *root,
                         RelOptInfo *rel,
                         CustomPath *best_path,
                         List *tlist,
                         List *clauses);
</programlisting>
    Convert a custom path to a finished plan.  The return value will generally
    be a <literal>CustomScan</> object, which the callback must allocate and
    initialize.  See <xref linkend="custom-scan-plan"> for more details.
   </para>

   <para>
<programlisting>
void (*TextOutCustomPath) (StringInfo str,
                           const CustomPath *node);
</programlisting>
    Generate additional output when <function>nodeToString</> is invoked on
    this custom path.  This callback is optional. Since
    <function>nodeToString</> will automatically dump all fields in the
    structure that it can see, including <structfield>custom_private</>, this
    is only useful if the <structname>CustomPath</> is actually embedded in a
    larger struct containing additional fields.
   </para>
  </sect2>
 </sect1>

 <sect1 id="custom-scan-plan">
  <title>Implementing Custom Plans</title>

  <para>
    A custom scan is represented in a finished plan tree using the following
    structure:
<programlisting>
typedef struct CustomScan
{
    Scan      scan;
    uint32    flags;
    List     *custom_exprs;
    List     *custom_ps_tlist;
    List     *custom_private;
    List     *custom_relids;
    const CustomScanMethods *methods;
} CustomScan;
</programlisting>
  </para>

  <para>
    <structfield>scan</> must be initialized as for any other scan, including
    estimated costs, target lists, qualifications, and so on.
    <structfield>flags</> is a bitmask with the same meaning as in
    <structname>CustomPath</>.  <structfield>custom_exprs</> should be used to
    store expression trees that will need to be fixed up by
    <filename>setrefs.c</> and <filename>subselect.c</>, while
    <structfield>custom_private</> should be used to store other private data
    that is only used by the custom scan provider itself.  Plan trees must be
    able to be duplicated using <function>copyObject</>, so all the data stored
    within these two fields must consist of nodes that that function can
    handle.  <structfield>custom_relids</> is set by the core code to the set
    of relations which this scan node must handle; except when this scan is
    replacing a join, it will have only one member.
    <structfield>methods</> must point to a (usually statically allocated)
    object implementing the required custom scan methods, which are further
    detailed below.
  </para>

  <para>
   When a <structname>CustomScan</> scans a single relation,
   <structfield>scan.scanrelid</> should be the range table index of the table
   to be scanned, and <structfield>custom_ps_tlist</> should be
   <literal>NULL</>.  When it replaces a join, <structfield>scan.scanrelid</>
   should be zero, and <structfield>custom_ps_tlist</> should be a list of
   <structname>TargetEntry</> nodes.  This is necessary because, when a join
   is replaced, the target list cannot be constructed from the table
   definition.  At execution time, this list will be used to initialize the
   tuple descriptor of the <structname>TupleTableSlot</>.  It will also be
   used by <command>EXPLAIN</>, when deparsing.
  </para>

  <sect2 id="custom-scan-plan-callbacks">
   <title>Custom Scan Callbacks</title>
   <para>
<programlisting>
Node *(*CreateCustomScanState) (CustomScan *cscan);
</programlisting>
    Allocate a <structname>CustomScanState</> for this
    <structname>CustomScan</>.  The actual allocation will often be larger than
    required for an ordinary <structname>CustomScanState</>, because many
    scan types will wish to embed that as the first field of a larger structure.
    The value returned must have the node tag and <structfield>methods</>
    set appropriately, but the other fields need not be initialized at this
    stage; after <function>ExecInitCustomScan</> performs basic initialization,
    the <function>BeginCustomScan</> callback will be invoked to give the
    custom scan state a chance to do whatever else is needed.
   </para>
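
   <para>
    The embedding pattern described above can be sketched as follows.  This
    is a simplified model rather than server code: the struct definitions are
    reduced stand-ins for the real ones, <literal>T_CustomScanState</> is
    given an arbitrary value, <function>calloc</> stands in for the server's
    memory allocator, and <structname>MyScanState</>,
    <function>my_create_scan_state</>, and <function>my_begin_scan</> are
    hypothetical names.
<programlisting><![CDATA[
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-ins; the real definitions come from PostgreSQL headers. */
typedef struct Node { int type; } Node;
typedef struct ScanState { Node node; } ScanState;
typedef struct CustomExecMethods { const char *name; } CustomExecMethods;
typedef struct CustomScan CustomScan;
typedef struct EState EState;

typedef struct CustomScanState
{
    ScanState ss;
    uint32_t  flags;
    const CustomExecMethods *methods;
} CustomScanState;

#define T_CustomScanState 1     /* arbitrary tag value for this sketch */

/* Hypothetical provider state: CustomScanState first, private fields after. */
typedef struct MyScanState
{
    CustomScanState css;        /* must be the first member */
    long rows_returned;         /* private; set up by my_begin_scan */
} MyScanState;

static const CustomExecMethods my_exec_methods = { "my_scan" };

/* CreateCustomScanState analogue: allocate the larger struct, set the
 * node tag and methods, and leave everything else zeroed. */
Node *
my_create_scan_state(CustomScan *cscan)
{
    MyScanState *state = calloc(1, sizeof(MyScanState));

    (void) cscan;               /* a real callback could consult the plan */
    if (state == NULL)
        return NULL;
    state->css.ss.node.type = T_CustomScanState;
    state->css.methods = &my_exec_methods;
    return (Node *) state;
}

/* BeginCustomScan analogue: cast back to the larger struct, which is valid
 * because css is its first member, and initialize the private fields. */
void
my_begin_scan(CustomScanState *node, EState *estate, int eflags)
{
    MyScanState *state = (MyScanState *) node;

    (void) estate;
    (void) eflags;
    state->rows_returned = 0;
}
]]></programlisting>
    Keeping the standard struct as the first member is what allows the core
    code and the provider to share one pointer while each sees only the part
    it understands.
   </para>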

   <para>
<programlisting>
void (*TextOutCustomScan) (StringInfo str,
                           const CustomScan *node);
</programlisting>
    Generate additional output when <function>nodeToString</> is invoked on
    this custom plan.  This callback is optional.  Since a
    <structname>CustomScan</> must be copyable by <function>copyObject</>,
    custom scan providers cannot substitute a larger structure that embeds a
    <structname>CustomScan</> for the structure itself, as would be possible
    for a <structname>CustomPath</> or <structname>CustomScanState</>.
    Therefore, providing this callback is unlikely to be useful.
   </para>
  </sect2>
 </sect1>

 <sect1 id="custom-scan-scan">
  <title>Implementing Custom Scans</title>

  <para>
   When a <structname>CustomScan</> is executed, its execution state is
   represented by a <structname>CustomScanState</>, which is declared as
   follows.
<programlisting>
typedef struct CustomScanState
{
    ScanState ss;
    uint32    flags;
    const CustomExecMethods *methods;
} CustomScanState;
</programlisting>
  </para>

  <para>
   <structfield>ss</> must be initialized as for any other scanstate;
   <structfield>flags</> is a bitmask with the same meaning as in
   <structname>CustomPath</> and <structname>CustomScan</>.
   <structfield>methods</> must point to a (usually statically allocated)
   object implementing the required custom scan state methods, which are
   further detailed below.  Typically, a <structname>CustomScanState</>, which
   need not support <function>copyObject</>, will actually be a larger
   structure embedding the above as its first member.
  </para>

  <sect2 id="custom-scan-scan-callbacks">
   <title>Custom Execution-Time Callbacks</title>

   <para>
<programlisting>
void (*BeginCustomScan) (CustomScanState *node,
                         EState *estate,
                         int eflags);
</programlisting>
    Complete initialization of the supplied <structname>CustomScanState</>.
    Some initialization is performed by <function>ExecInitCustomScan</>, but
    any private fields should be initialized here.
   </para>

   <para>
<programlisting>
TupleTableSlot *(*ExecCustomScan) (CustomScanState *node);
</programlisting>
    Fetch the next scan tuple.  If any tuples remain, it should fill
    <literal>ps_ResultTupleSlot</> with the next tuple in the current scan
    direction, and then return the tuple slot.  If not,
    <literal>NULL</> or an empty slot should be returned.
   </para>

   <para>
<programlisting>
void (*EndCustomScan) (CustomScanState *node);
</programlisting>
    Clean up any private data associated with the <structname>CustomScanState</>.
    This method is required, but may not need to do anything if the associated
    data does not exist or will be cleaned up automatically.
   </para>

   <para>
<programlisting>
void (*ReScanCustomScan) (CustomScanState *node);
</programlisting>
    Rewind the current scan to the beginning and prepare to rescan the
    relation.
   </para>

   <para>
<programlisting>
void (*MarkPosCustomScan) (CustomScanState *node);
</programlisting>
    Save the current scan position so that it can subsequently be restored
    by the <function>RestrPosCustomScan</> callback.  This callback is optional,
    and need only be supplied if the
    <literal>CUSTOMPATH_SUPPORT_MARK_RESTORE</> flag is set.
   </para>

   <para>
<programlisting>
void (*RestrPosCustomScan) (CustomScanState *node);
</programlisting>
    Restore the previous scan position as saved by the
    <function>MarkPosCustomScan</> callback.  This callback is optional,
    and need only be supplied if the
    <literal>CUSTOMPATH_SUPPORT_MARK_RESTORE</> flag is set.
   </para>
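
   <para>
    How the fetch, rescan, mark, and restore callbacks relate to one another
    can be illustrated with a toy cursor over an in-memory array.  This is
    only a model of the position bookkeeping: it uses plain C types instead of
    <structname>CustomScanState</> and <structname>TupleTableSlot</>, and
    <structname>MyCursor</> and the <function>my_*</> functions are
    hypothetical.
<programlisting><![CDATA[
#include <stddef.h>

/* Hypothetical scan state over an in-memory array of values. */
typedef struct MyCursor
{
    const int *rows;        /* data being scanned */
    size_t     nrows;       /* number of entries in rows */
    size_t     pos;         /* index of the next row to return */
    size_t     marked;      /* position saved by my_mark_pos */
} MyCursor;

/* ExecCustomScan analogue: return the next row, or NULL when exhausted. */
const int *
my_exec(MyCursor *cur)
{
    if (cur->pos >= cur->nrows)
        return NULL;
    return &cur->rows[cur->pos++];
}

/* ReScanCustomScan analogue: rewind to the beginning. */
void
my_rescan(MyCursor *cur)
{
    cur->pos = 0;
}

/* MarkPosCustomScan analogue: remember the current position. */
void
my_mark_pos(MyCursor *cur)
{
    cur->marked = cur->pos;
}

/* RestrPosCustomScan analogue: return to the remembered position. */
void
my_restr_pos(MyCursor *cur)
{
    cur->pos = cur->marked;
}
]]></programlisting>
    After a restore, the next fetch returns the row that followed the mark,
    while a rescan always starts again from the first row.
   </para>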

   <para>
<programlisting>
void (*ExplainCustomScan) (CustomScanState *node,
                           List *ancestors,
                           ExplainState *es);
</programlisting>
    Output additional information for <command>EXPLAIN</> of a plan involving
    a custom scan node.  This callback is optional.  Common data stored in the
    <structname>ScanState</>, such as the target list and scan relation, will
    be shown even without this callback, but the callback allows the display
    of additional, private state.
   </para>
  </sect2>
 </sect1>
</chapter>