Diffstat (limited to 'doc/src/sgml/storage.sgml')
-rw-r--r-- | doc/src/sgml/storage.sgml | 57 |
1 files changed, 31 insertions, 26 deletions
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 1973a5b90c3..9c3cf7589da 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/storage.sgml,v 1.16 2007/04/03 04:14:26 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/storage.sgml,v 1.17 2007/04/06 04:21:41 tgl Exp $ -->
 <chapter id="storage">
@@ -210,18 +210,27 @@ value, but in some cases more efficient approaches are possible.)
 </para>
 <para>
-<acronym>TOAST</> usurps the high-order two bits of the varlena length word,
+<acronym>TOAST</> usurps two bits of the varlena length word (the high-order
+bits on big-endian machines, the low-order bits on little-endian machines),
 thereby limiting the logical size of any value of a <acronym>TOAST</>-able
 data type to 1 GB (2<superscript>30</> - 1 bytes). When both bits are zero,
-the value is an ordinary un-<acronym>TOAST</>ed value of the data type. One
-of these bits, if set, indicates that the value has been compressed and must
-be decompressed before use. The other bit, if set, indicates that the value
-has been stored out-of-line. In this case the remainder of the value is
-actually just a pointer, and the correct data has to be found elsewhere. When
-both bits are set, the out-of-line data has been compressed too. In each case
-the length in the low-order bits of the varlena word indicates the actual size
-of the datum, not the size of the logical value that would be extracted by
-decompression or fetching of the out-of-line data.
+the value is an ordinary un-<acronym>TOAST</>ed value of the data type, and
+the remaining bits of the length word give the total datum size (including
+length word) in bytes. When the highest-order or lowest-order bit is set,
+the value has only a single-byte header instead of the normal four-byte
+header, and the remaining bits give the total datum size (including length
+byte) in bytes. As a special case, if the remaining bits are all zero
+(which would be impossible for a self-inclusive length), the value is a
+pointer to out-of-line data stored in a separate TOAST table. (The size of
+a TOAST pointer is known a priori, so it doesn't need to be represented in
+the header.) Values with single-byte headers aren't aligned on any particular
+boundary, either. Lastly, when the highest-order or lowest-order bit is
+clear but the adjacent bit is set, the content of the datum has been
+compressed and must be decompressed before use. In this case the remaining
+bits of the length word give the total size of the compressed datum, not the
+original data. Note that compression is also possible for out-of-line data
+but the varlena header does not tell whether it has occurred &mdash;
+the content of the TOAST pointer tells that, instead.
 </para>
 <para>
@@ -254,8 +263,8 @@ retrieval of the values. A pointer datum representing an out-of-line
 <acronym>TOAST</> table in which to look and the OID of the specific value
 (its <structfield>chunk_id</>). For convenience, pointer datums also store the
 logical datum size (original uncompressed data length) and actual stored size
-(different if compression was applied). Allowing for the varlena header word,
-the total size of a <acronym>TOAST</> pointer datum is therefore 20 bytes
+(different if compression was applied). Allowing for the varlena header byte,
+the total size of a <acronym>TOAST</> pointer datum is therefore 17 bytes
 regardless of the actual size of the represented value.
 </para>
@@ -280,7 +289,9 @@ The <acronym>TOAST</> code recognizes four different strategies for storing
 <listitem>
 <para>
 <literal>PLAIN</literal> prevents either compression or
- out-of-line storage. This is the only possible strategy for
+ out-of-line storage; furthermore it disables use of single-byte headers
+ for varlena types.
+ This is the only possible strategy for
 columns of non-<acronym>TOAST</>-able data types.
 </para>
 </listitem>
@@ -562,7 +573,7 @@ data. Empty in ordinary tables.</entry>
 <para>
 All table rows are structured in the same way. There is a fixed-size
- header (occupying 27 bytes on most machines), followed by an optional null
+ header (occupying 23 bytes on most machines), followed by an optional null
 bitmap, an optional object ID field, and the user data. The header is detailed
 in <xref linkend="heaptupleheaderdata-table">. The actual user data
@@ -605,22 +616,16 @@ data. Empty in ordinary tables.</entry>
 <entry>insert XID stamp</entry>
 </row>
 <row>
- <entry>t_cmin</entry>
- <entry>CommandId</entry>
- <entry>4 bytes</entry>
- <entry>insert CID stamp</entry>
- </row>
- <row>
 <entry>t_xmax</entry>
 <entry>TransactionId</entry>
 <entry>4 bytes</entry>
 <entry>delete XID stamp</entry>
 </row>
 <row>
- <entry>t_cmax</entry>
+ <entry>t_cid</entry>
 <entry>CommandId</entry>
 <entry>4 bytes</entry>
- <entry>delete CID stamp (overlays with t_xvac)</entry>
+ <entry>insert and/or delete CID stamp (overlays with t_xvac)</entry>
 </row>
 <row>
 <entry>t_xvac</entry>
@@ -635,10 +640,10 @@ data. Empty in ordinary tables.</entry>
 <entry>current TID of this or newer row version</entry>
 </row>
 <row>
- <entry>t_natts</entry>
+ <entry>t_infomask2</entry>
 <entry>int16</entry>
 <entry>2 bytes</entry>
- <entry>number of attributes</entry>
+ <entry>number of attributes, plus various flag bits</entry>
 </row>
 <row>
 <entry>t_infomask</entry>
@@ -682,7 +687,7 @@ data. Empty in ordinary tables.</entry>
 fixed width field, then all the bytes are simply placed. If it's a
 variable length field (attlen = -1) then it's a bit more complicated.
 All variable-length datatypes share the common header structure
- <type>varattrib</type>, which includes the total length of the stored
+ <type>struct varlena</type>, which includes the total length of the stored
 value and some flag bits. Depending on the flags, the data can be either
 inline or in a <acronym>TOAST</> table;
 it might be compressed, too (see <xref linkend="storage-toast">).
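
Illustration (not part of the patch): the header rules in the new TOAST paragraph can be read as a small decision procedure on the first header byte. The sketch below assumes the big-endian layout, where the two flag bits are the high-order bits of that byte. The names HeaderKind, classify_header, and short_total_size are hypothetical and are not PostgreSQL's own macros.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not PostgreSQL identifiers. */
typedef enum {
    HDR_PLAIN_4B,       /* both flag bits clear: ordinary four-byte header */
    HDR_COMPRESSED_4B,  /* first bit clear, adjacent bit set: compressed datum */
    HDR_SHORT_1B,       /* first bit set: single-byte header */
    HDR_TOAST_POINTER   /* first bit set, remaining bits all zero */
} HeaderKind;

/*
 * Classify a datum from the first byte of its header.
 * Assumption: big-endian layout, so the flag bits are 0x80 and 0x40.
 */
static HeaderKind classify_header(uint8_t first_byte)
{
    if (first_byte & 0x80) {
        /* A self-inclusive length can never be zero, so a zero length
         * is free to mark a pointer to out-of-line data. */
        return (first_byte & 0x7F) == 0 ? HDR_TOAST_POINTER : HDR_SHORT_1B;
    }
    if (first_byte & 0x40)
        return HDR_COMPRESSED_4B;
    return HDR_PLAIN_4B;
}

/* Total datum size (including the header byte) for a single-byte header. */
static unsigned short_total_size(uint8_t first_byte)
{
    return first_byte & 0x7Fu;
}
```

Under the same assumption, a little-endian variant would test the low-order bits (0x01 and 0x02) instead, as the patched paragraph describes.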