doc/src/sgml/lo.sgml


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118

<sect1 id="lo">
 <title>lo</title>
 
 <indexterm zone="lo">
  <primary>lo</primary>
 </indexterm>

 <para>
  PostgreSQL type extension for managing Large Objects
 </para>

 <sect2>
  <title>Overview</title>
  <para>
   One of the problems with the JDBC driver (and this affects the ODBC driver
   also), is that the specification assumes that references to BLOBS (Binary
   Large OBjectS) are stored within a table, and if that entry is changed, the
   associated BLOB is deleted from the database.
  </para>
  <para>
   As PostgreSQL stands, this doesn't occur.  Large objects are treated as
   objects in their own right; a table entry can reference a large object by
   OID, but there can be multiple table entries referencing the same large
   object OID, so the system doesn't delete the large object just because you
   change or remove one such entry.
  </para>
  <para>
   Now this is fine for new PostgreSQL-specific applications, but existing ones
   using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
   that are not referenced by anything, and simply occupy disk space.
  </para>
 </sect2>

 <sect2>
  <title>The Fix</title>
  <para>
   I've fixed this by creating a new data type 'lo', some support functions, and
   a Trigger which handles the orphaning problem.  The trigger essentially just
   does a 'lo_unlink' whenever you delete or modify a value referencing a large
   object.  When you use this trigger, you are assuming that there is only one
   database reference to any large object that is referenced in a
   trigger-controlled column!
  </para>
  <para>
   The 'lo' type was created because we needed to differentiate between plain
   OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
   but (after talking to Byron), the ODBC driver needed a unique type. They had
   created an 'lo' type, but not the solution to orphaning.
  </para>
  <para>
   You don't actually have to use the 'lo' type to use the trigger, but it may be
   convenient to use it to keep track of which columns in your database represent
   large objects that you are managing with the trigger.
  </para>
 </sect2>

 <sect2>
  <title>How to Use</title>
  <para>
   The easiest way is by an example:
  </para>
  <programlisting>
   CREATE TABLE image (title TEXT, raster lo);
   CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
     FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
  </programlisting>
  <para>
   Create a trigger for each column that contains a lo type, and give the column
   name as the trigger procedure argument.  You can have more than one trigger on
   a table if you need multiple lo columns in the same table, but don't forget to
   give a different name to each trigger.
  </para>
 </sect2>

 <sect2>
  <title>Issues</title>

  <itemizedlist>
   <listitem>
    <para>
     Dropping a table will still orphan any objects it contains, as the trigger
     is not executed.
    </para>
    <para>
     Avoid this by preceding the 'drop table' with 'delete from {table}'.
    </para>
    <para>
     If you already have, or suspect you have, orphaned large objects, see
     the contrib/vacuumlo module to help you clean them up.  It's a good idea
     to run contrib/vacuumlo occasionally as a back-stop to the lo_manage
     trigger.
    </para>
   </listitem>
   <listitem>
    <para>
     Some frontends may create their own tables, and will not create the
     associated trigger(s). Also, users may not remember (or know) to create
     the triggers.
    </para>
   </listitem>
  </itemizedlist>

  <para>
   As the ODBC driver needs a permanent lo type (& JDBC could be optimised to
   use it if it's Oid is fixed), and as the above issues can only be fixed by
   some internal changes, I feel it should become a permanent built-in type.
  </para>
 </sect2>

 <sect2>
  <title>Author</title>
  <para>
   Peter Mount <email>peter@retep.org.uk</email> June 13 1998
  </para>
 </sect2>
</sect1>