aboutsummaryrefslogtreecommitdiff
path: root/doc/src/sgml/extend.sgml
blob: 438a8ec8795a9d30014243e657b0b80c2e3fb397 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.18 2002/11/03 01:31:32 momjian Exp $
-->

 <chapter id="extend">
  <title>Extending <acronym>SQL</acronym>: An Overview</title>

   <indexterm zone="extend">
    <primary>extending SQL</primary>
   </indexterm>

  <para>
   In  the  sections  that follow, we will discuss how you
   can extend the <productname>PostgreSQL</productname> 
   <acronym>SQL</acronym> query language by adding:

   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>
      functions
     </para>
    </listitem>
    <listitem>
     <para>
      data types
     </para>
    </listitem>
    <listitem>
     <para>
      operators
     </para>
    </listitem>
    <listitem>
     <para>
      aggregates
     </para>
    </listitem>
   </itemizedlist>
  </para>

  <sect1 id="extend-how">
   <title>How Extensibility Works</title>

   <para>
    <productname>PostgreSQL</productname> is extensible because its operation  is  
    catalog-driven.   If  you  are familiar with standard 
    relational systems, you know that  they  store  information
    about  databases,  tables,  columns,  etc., in what are
    commonly known as system catalogs.  (Some systems  call
    this  the data dictionary).  The catalogs appear to the
    user as tables like any other, but  the  <acronym>DBMS</acronym>  stores
    its  internal  bookkeeping in them.  One key difference
    between <productname>PostgreSQL</productname> and  standard  relational  systems  is
    that <productname>PostgreSQL</productname> stores much more information in its 
    catalogs -- not only information about tables and  columns,
    but also information about its types, functions, access
    methods, and so on.  These tables can be  modified  by
    the  user, and since <productname>PostgreSQL</productname> bases its internal operation 
    on these tables, this means that <productname>PostgreSQL</productname> can  be
    extended   by   users.    By  comparison,  conventional
    database systems can only be extended by changing hardcoded  
    procedures within the <acronym>DBMS</acronym> or by loading modules
    specially written by the <acronym>DBMS</acronym> vendor.
   </para>

   <para>
    <productname>PostgreSQL</productname> is also unlike most  other  data  managers  in
    that  the server can incorporate user-written code into
    itself through dynamic loading.  That is, the user  can
    specify  an  object code file (e.g., a shared library) that implements a new type or  function 
    and <productname>PostgreSQL</productname> will load it as required.  Code written 
    in <acronym>SQL</acronym> is even more trivial to add to the  server.
    This ability to modify its operation <quote>on the fly</quote> makes
    <productname>PostgreSQL</productname> uniquely suited for rapid prototyping  of  new
    applications and storage structures.
   </para>
  </sect1>

  <sect1 id="type-system">
   <title>The <productname>PostgreSQL</productname> Type System</title>

   <indexterm zone="type-system">
    <primary>extending SQL</primary>
    <secondary>types</secondary>
   </indexterm>

   <indexterm zone="type-system">
    <primary>data types</primary>
   </indexterm>

   <para>
    The <productname>PostgreSQL</productname> type system
    can be broken down in several ways.
    Types are divided into base types and composite  types.
    Base  types  are those, like <type>int4</type>, that are implemented
    in a language such as C.  They generally correspond  to
    what are often known as <firstterm>abstract data types</firstterm>; <productname>PostgreSQL</productname>
    can only operate on such types through methods provided
    by  the  user and only understands the behavior of such
    types to the extent that the user describes them.  
    Composite  types  are  created whenever the user creates a
    table.
   </para>

   <para>
    <productname>PostgreSQL</productname>  stores  these  types
    in only one way (within the
    file that stores all rows of a table)  but  the
    user can <quote>look inside</quote> at the attributes of these types
    from the query language and optimize their retrieval by
    (for example) defining indexes on the attributes.
    <productname>PostgreSQL</productname>  base  types are further
    divided into built-in
    types and user-defined  types.   Built-in  types  (like
    <type>int4</type>)  are  those  that  are  compiled
    into the system.
    User-defined types are those created by the user in the
    manner to be described later.
   </para>
  </sect1>

  <sect1 id="pg-system-catalogs">
   <title>About the <productname>PostgreSQL</productname> System Catalogs</title>

   <indexterm zone="pg-system-catalogs">
    <primary>catalogs</primary>
   </indexterm>

   <para>
    Having  introduced the basic extensibility concepts, we
    can now take a look at how the  catalogs  are  actually
    laid  out.  You can skip this section for now, but some
    later sections will  be  incomprehensible  without  the
    information  given  here,  so  mark this page for later
    reference.
    All system catalogs have names  that  begin  with
    <literal>pg_</literal>.
    The  following  tables contain information that may be
    useful to the end user.  (There are many  other  system
    catalogs,  but there should rarely be a reason to query
    them directly.)

    <table tocentry="1">
     <title>PostgreSQL System Catalogs</title>
     <titleabbrev>Catalogs</titleabbrev>
     <tgroup cols="2">
      <thead>
       <row>
	<entry>Catalog Name</entry>
	<entry>Description</entry>
       </row>
      </thead>
      <tbody>
       <row>
	<entry><structname>pg_database</></entry>
	<entry> databases</entry>
       </row>
       <row>
	<entry><structname>pg_class</></entry>
	<entry> tables</entry>
       </row>
       <row>
	<entry><structname>pg_attribute</></entry>
	<entry> table columns</entry>
       </row>
       <row>
	<entry><structname>pg_index</></entry>
	<entry> indexes</entry>
       </row>
       <row>
	<entry><structname>pg_proc</></entry>
	<entry> procedures/functions </entry>
       </row>
       <row>
	<entry><structname>pg_type</></entry>
	<entry> data types (both base and complex)</entry>
       </row>
       <row>
	<entry><structname>pg_operator</></entry>
	<entry> operators</entry>
       </row>
       <row>
	<entry><structname>pg_aggregate</></entry>
	<entry> aggregate functions</entry>
       </row>
       <row>
	<entry><structname>pg_am</></entry>
	<entry> access methods</entry>
       </row>
       <row>
	<entry><structname>pg_amop</></entry>
	<entry> access method operators</entry>
       </row>
       <row>
	<entry><structname>pg_amproc</></entry>
	<entry> access method support functions</entry>
       </row>
       <row>
	<entry><structname>pg_opclass</></entry>
	<entry> access method operator classes</entry>
       </row>
      </tbody>
     </tgroup>
    </table>
   </para>

   <para>
    <figure float="1" id="EXTEND-CATALOGS">
     <title>The major <productname>PostgreSQL</productname> system catalogs</title>
     <mediaobject>
      <imageobject>
       <imagedata fileref="catalogs" align="center">
      </imageobject>
     </mediaobject>
    </figure>

    The <citetitle>Developer's Guide</citetitle> gives a more detailed  explanation
    of  these catalogs and their columns.  However,
    <xref linkend="EXTEND-CATALOGS">
    shows the major entities and their  relationships
    in  the system catalogs.  (Columns that do not refer
    to other entities are not shown unless they are part of
    a primary key.)
    This diagram is more or less incomprehensible until you
    actually start looking at the contents of the  catalogs
    and  see  how  they relate to each other.  For now, the
    main things to take away from this diagram are as  follows:
     
    <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
       In  several of the sections that follow, we will
       present various join queries on the system 
       catalogs  that display information we need to extend
       the system.  Looking at this diagram should make
       some  of  these  join  queries  (which are often
       three- or four-way joins)  more  understandable,
       because  you  will  be  able  to  see  that  the
       columns used in the queries form foreign keys
       in other tables.
      </para>
     </listitem>
     <listitem>
      <para>
       Many  different  features  (tables, columns,
       functions,  types,  access  methods,  etc.)  are
       tightly  integrated  in  this  schema.  A simple
       create command may modify many  of  these  catalogs.
      </para>
     </listitem>
     <listitem>
      <para>
       Types and procedures
       are central to the schema.

       <note>
	<para>
	 We  use  the words <firstterm>procedure</firstterm>
	 and <firstterm>function</firstterm> more or less interchangeably.
	</para>
       </note>

       Nearly  every catalog contains some reference to
       rows in one or both of these tables.   For
       example,  <productname>PostgreSQL</productname>  frequently  uses type 
       signatures (e.g.,  of  functions  and  operators)  to
       identify unique rows of other catalogs.
      </para>
     </listitem>
     <listitem>
      <para>
       There are many columns and relationships that
       have obvious meanings, but there are many  
       (particularly  those  that  have  to  do with access
       methods) that do not.
      </para>
     </listitem>
    </itemizedlist>
   </para>
  </sect1>
 </chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->