diff options
Diffstat (limited to 'doc/src/sgml/func/func-xml.sgml')
-rw-r--r-- | doc/src/sgml/func/func-xml.sgml | 1283 |
1 files changed, 1283 insertions, 0 deletions
diff --git a/doc/src/sgml/func/func-xml.sgml b/doc/src/sgml/func/func-xml.sgml new file mode 100644 index 00000000000..21f34467a4f --- /dev/null +++ b/doc/src/sgml/func/func-xml.sgml @@ -0,0 +1,1283 @@ + <sect1 id="functions-xml"> + + <title>XML Functions</title> + + <indexterm> + <primary>XML Functions</primary> + </indexterm> + + <para> + The functions and function-like expressions described in this + section operate on values of type <type>xml</type>. See <xref + linkend="datatype-xml"/> for information about the <type>xml</type> + type. The function-like expressions <function>xmlparse</function> + and <function>xmlserialize</function> for converting to and from + type <type>xml</type> are documented there, not in this section. + </para> + + <para> + Use of most of these functions + requires <productname>PostgreSQL</productname> to have been built + with <command>configure --with-libxml</command>. + </para> + + <sect2 id="functions-producing-xml"> + <title>Producing XML Content</title> + + <para> + A set of functions and function-like expressions is available for + producing XML content from SQL data. As such, they are + particularly suitable for formatting query results into XML + documents for processing in client applications. + </para> + + <sect3 id="functions-producing-xml-xmltext"> + <title><literal>xmltext</literal></title> + + <indexterm> + <primary>xmltext</primary> + </indexterm> + +<synopsis> +<function>xmltext</function> ( <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The function <function>xmltext</function> returns an XML value with a single + text node containing the input argument as its content. Predefined entities + like ampersand (<literal><![CDATA[&]]></literal>), left and right angle brackets + (<literal><![CDATA[< >]]></literal>), and quotation marks (<literal><![CDATA[""]]></literal>) + are escaped. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xmltext('< foo & bar >'); + xmltext +------------------------- + < foo & bar > +]]></screen> + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlcomment"> + <title><literal>xmlcomment</literal></title> + + <indexterm> + <primary>xmlcomment</primary> + </indexterm> + +<synopsis> +<function>xmlcomment</function> ( <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The function <function>xmlcomment</function> creates an XML value + containing an XML comment with the specified text as content. + The text cannot contain <quote><literal>--</literal></quote> or end with a + <quote><literal>-</literal></quote>, otherwise the resulting construct + would not be a valid XML comment. + If the argument is null, the result is null. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xmlcomment('hello'); + + xmlcomment +-------------- + <!--hello--> +]]></screen> + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlconcat"> + <title><literal>xmlconcat</literal></title> + + <indexterm> + <primary>xmlconcat</primary> + </indexterm> + +<synopsis> +<function>xmlconcat</function> ( <type>xml</type> <optional>, ...</optional> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The function <function>xmlconcat</function> concatenates a list + of individual XML values to create a single value containing an + XML content fragment. Null values are omitted; the result is + only null if there are no nonnull arguments. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xmlconcat('<abc/>', '<bar>foo</bar>'); + + xmlconcat +---------------------- + <abc/><bar>foo</bar> +]]></screen> + </para> + + <para> + XML declarations, if present, are combined as follows. If all + argument values have the same XML version declaration, that + version is used in the result, else no version is used. If all + argument values have the standalone declaration value + <quote>yes</quote>, then that value is used in the result. If + all argument values have a standalone declaration value and at + least one is <quote>no</quote>, then that is used in the result. + Else the result will have no standalone declaration. If the + result is determined to require a standalone declaration but no + version declaration, a version declaration with version 1.0 will + be used because XML requires an XML declaration to contain a + version declaration. Encoding declarations are ignored and + removed in all cases. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xmlconcat('<?xml version="1.1"?><foo/>', '<?xml version="1.1" standalone="no"?><bar/>'); + + xmlconcat +----------------------------------- + <?xml version="1.1"?><foo/><bar/> +]]></screen> + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlelement"> + <title><literal>xmlelement</literal></title> + + <indexterm> + <primary>xmlelement</primary> + </indexterm> + +<synopsis> +<function>xmlelement</function> ( <literal>NAME</literal> <replaceable>name</replaceable> <optional>, <literal>XMLATTRIBUTES</literal> ( <replaceable>attvalue</replaceable> <optional> <literal>AS</literal> <replaceable>attname</replaceable> </optional> <optional>, ...</optional> ) </optional> <optional>, <replaceable>content</replaceable> <optional>, ...</optional></optional> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The <function>xmlelement</function> expression produces an XML + element with the given name, attributes, and content. + The <replaceable>name</replaceable> + and <replaceable>attname</replaceable> items shown in the syntax are + simple identifiers, not values. The <replaceable>attvalue</replaceable> + and <replaceable>content</replaceable> items are expressions, which can + yield any <productname>PostgreSQL</productname> data type. The + argument(s) within <literal>XMLATTRIBUTES</literal> generate attributes + of the XML element; the <replaceable>content</replaceable> value(s) are + concatenated to form its content. + </para> + + <para> + Examples: +<screen><![CDATA[ +SELECT xmlelement(name foo); + + xmlelement +------------ + <foo/> + +SELECT xmlelement(name foo, xmlattributes('xyz' as bar)); + + xmlelement +------------------ + <foo bar="xyz"/> + +SELECT xmlelement(name foo, xmlattributes(current_date as bar), 'cont', 'ent'); + + xmlelement +------------------------------------- + <foo bar="2007-01-26">content</foo> +]]></screen> + </para> + + <para> + Element and attribute names that are not valid XML names are + escaped by replacing the offending characters by the sequence + <literal>_x<replaceable>HHHH</replaceable>_</literal>, where + <replaceable>HHHH</replaceable> is the character's Unicode + codepoint in hexadecimal notation. For example: +<screen><![CDATA[ +SELECT xmlelement(name "foo$bar", xmlattributes('xyz' as "a&b")); + + xmlelement +---------------------------------- + <foo_x0024_bar a_x0026_b="xyz"/> +]]></screen> + </para> + + <para> + An explicit attribute name need not be specified if the attribute + value is a column reference, in which case the column's name will + be used as the attribute name by default. In other cases, the + attribute must be given an explicit name. So this example is + valid: +<screen> +CREATE TABLE test (a xml, b xml); +SELECT xmlelement(name test, xmlattributes(a, b)) FROM test; +</screen> + But these are not: +<screen> +SELECT xmlelement(name test, xmlattributes('constant'), a, b) FROM test; +SELECT xmlelement(name test, xmlattributes(func(a, b))) FROM test; +</screen> + </para> + + <para> + Element content, if specified, will be formatted according to + its data type. If the content is itself of type <type>xml</type>, + complex XML documents can be constructed. For example: +<screen><![CDATA[ +SELECT xmlelement(name foo, xmlattributes('xyz' as bar), + xmlelement(name abc), + xmlcomment('test'), + xmlelement(name xyz)); + + xmlelement +---------------------------------------------- + <foo bar="xyz"><abc/><!--test--><xyz/></foo> +]]></screen> + + Content of other types will be formatted into valid XML character + data. This means in particular that the characters <, >, + and & will be converted to entities. Binary data (data type + <type>bytea</type>) will be represented in base64 or hex + encoding, depending on the setting of the configuration parameter + <xref linkend="guc-xmlbinary"/>. The particular behavior for + individual data types is expected to evolve in order to align the + PostgreSQL mappings with those specified in SQL:2006 and later, + as discussed in <xref linkend="functions-xml-limits-casts"/>. + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlforest"> + <title><literal>xmlforest</literal></title> + + <indexterm> + <primary>xmlforest</primary> + </indexterm> + +<synopsis> +<function>xmlforest</function> ( <replaceable>content</replaceable> <optional> <literal>AS</literal> <replaceable>name</replaceable> </optional> <optional>, ...</optional> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The <function>xmlforest</function> expression produces an XML + forest (sequence) of elements using the given names and content. + As for <function>xmlelement</function>, + each <replaceable>name</replaceable> must be a simple identifier, while + the <replaceable>content</replaceable> expressions can have any data + type. + </para> + + <para> + Examples: +<screen> +SELECT xmlforest('abc' AS foo, 123 AS bar); + + xmlforest +------------------------------ + <foo>abc</foo><bar>123</bar> + + +SELECT xmlforest(table_name, column_name) +FROM information_schema.columns +WHERE table_schema = 'pg_catalog'; + + xmlforest +------------------------------------&zwsp;----------------------------------- + <table_name>pg_authid</table_name>&zwsp;<column_name>rolname</column_name> + <table_name>pg_authid</table_name>&zwsp;<column_name>rolsuper</column_name> + ... +</screen> + + As seen in the second example, the element name can be omitted if + the content value is a column reference, in which case the column + name is used by default. Otherwise, a name must be specified. + </para> + + <para> + Element names that are not valid XML names are escaped as shown + for <function>xmlelement</function> above. Similarly, content + data is escaped to make valid XML content, unless it is already + of type <type>xml</type>. + </para> + + <para> + Note that XML forests are not valid XML documents if they consist + of more than one element, so it might be useful to wrap + <function>xmlforest</function> expressions in + <function>xmlelement</function>. + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlpi"> + <title><literal>xmlpi</literal></title> + + <indexterm> + <primary>xmlpi</primary> + </indexterm> + +<synopsis> +<function>xmlpi</function> ( <literal>NAME</literal> <replaceable>name</replaceable> <optional>, <replaceable>content</replaceable> </optional> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The <function>xmlpi</function> expression creates an XML + processing instruction. + As for <function>xmlelement</function>, + the <replaceable>name</replaceable> must be a simple identifier, while + the <replaceable>content</replaceable> expression can have any data type. + The <replaceable>content</replaceable>, if present, must not contain the + character sequence <literal>?></literal>. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xmlpi(name php, 'echo "hello world";'); + + xmlpi +----------------------------- + <?php echo "hello world";?> +]]></screen> + </para> + </sect3> + + <sect3 id="functions-producing-xml-xmlroot"> + <title><literal>xmlroot</literal></title> + + <indexterm> + <primary>xmlroot</primary> + </indexterm> + +<synopsis> +<function>xmlroot</function> ( <type>xml</type>, <literal>VERSION</literal> {<type>text</type>|<literal>NO VALUE</literal>} <optional>, <literal>STANDALONE</literal> {<literal>YES</literal>|<literal>NO</literal>|<literal>NO VALUE</literal>} </optional> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The <function>xmlroot</function> expression alters the properties + of the root node of an XML value. If a version is specified, + it replaces the value in the root node's version declaration; if a + standalone setting is specified, it replaces the value in the + root node's standalone declaration. + </para> + + <para> +<screen><![CDATA[ +SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>abc</content>'), + version '1.0', standalone yes); + + xmlroot +---------------------------------------- + <?xml version="1.0" standalone="yes"?> + <content>abc</content> +]]></screen> + </para> + </sect3> + + <sect3 id="functions-xml-xmlagg"> + <title><literal>xmlagg</literal></title> + + <indexterm> + <primary>xmlagg</primary> + </indexterm> + +<synopsis> +<function>xmlagg</function> ( <type>xml</type> ) <returnvalue>xml</returnvalue> +</synopsis> + + <para> + The function <function>xmlagg</function> is, unlike the other + functions described here, an aggregate function. It concatenates the + input values to the aggregate function call, + much like <function>xmlconcat</function> does, except that concatenation + occurs across rows rather than across expressions in a single row. + See <xref linkend="functions-aggregate"/> for additional information + about aggregate functions. + </para> + + <para> + Example: +<screen><![CDATA[ +CREATE TABLE test (y int, x xml); +INSERT INTO test VALUES (1, '<foo>abc</foo>'); +INSERT INTO test VALUES (2, '<bar/>'); +SELECT xmlagg(x) FROM test; + xmlagg +---------------------- + <foo>abc</foo><bar/> +]]></screen> + </para> + + <para> + To determine the order of the concatenation, an <literal>ORDER BY</literal> + clause may be added to the aggregate call as described in + <xref linkend="syntax-aggregates"/>. For example: + +<screen><![CDATA[ +SELECT xmlagg(x ORDER BY y DESC) FROM test; + xmlagg +---------------------- + <bar/><foo>abc</foo> +]]></screen> + </para> + + <para> + The following non-standard approach used to be recommended + in previous versions, and may still be useful in specific + cases: + +<screen><![CDATA[ +SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab; + xmlagg +---------------------- + <bar/><foo>abc</foo> +]]></screen> + </para> + </sect3> + </sect2> + + <sect2 id="functions-xml-predicates"> + <title>XML Predicates</title> + + <para> + The expressions described in this section check properties + of <type>xml</type> values. + </para> + + <sect3 id="functions-producing-xml-is-document"> + <title><literal>IS DOCUMENT</literal></title> + + <indexterm> + <primary>IS DOCUMENT</primary> + </indexterm> + +<synopsis> +<type>xml</type> <literal>IS DOCUMENT</literal> <returnvalue>boolean</returnvalue> +</synopsis> + + <para> + The expression <literal>IS DOCUMENT</literal> returns true if the + argument XML value is a proper XML document, false if it is not + (that is, it is a content fragment), or null if the argument is + null. See <xref linkend="datatype-xml"/> about the difference + between documents and content fragments. + </para> + </sect3> + + <sect3 id="functions-producing-xml-is-not-document"> + <title><literal>IS NOT DOCUMENT</literal></title> + + <indexterm> + <primary>IS NOT DOCUMENT</primary> + </indexterm> + +<synopsis> +<type>xml</type> <literal>IS NOT DOCUMENT</literal> <returnvalue>boolean</returnvalue> +</synopsis> + + <para> + The expression <literal>IS NOT DOCUMENT</literal> returns false if the + argument XML value is a proper XML document, true if it is not (that is, + it is a content fragment), or null if the argument is null. + </para> + </sect3> + + <sect3 id="xml-exists"> + <title><literal>XMLEXISTS</literal></title> + + <indexterm> + <primary>XMLEXISTS</primary> + </indexterm> + +<synopsis> +<function>XMLEXISTS</function> ( <type>text</type> <literal>PASSING</literal> <optional><literal>BY</literal> {<literal>REF</literal>|<literal>VALUE</literal>}</optional> <type>xml</type> <optional><literal>BY</literal> {<literal>REF</literal>|<literal>VALUE</literal>}</optional> ) <returnvalue>boolean</returnvalue> +</synopsis> + + <para> + The function <function>xmlexists</function> evaluates an XPath 1.0 + expression (the first argument), with the passed XML value as its context + item. The function returns false if the result of that evaluation + yields an empty node-set, true if it yields any other value. The + function returns null if any argument is null. A nonnull value + passed as the context item must be an XML document, not a content + fragment or any non-XML value. + </para> + + <para> + Example: + <screen><![CDATA[ +SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY VALUE '<towns><town>Toronto</town><town>Ottawa</town></towns>'); + + xmlexists +------------ + t +(1 row) +]]></screen> + </para> + + <para> + The <literal>BY REF</literal> and <literal>BY VALUE</literal> clauses + are accepted in <productname>PostgreSQL</productname>, but are ignored, + as discussed in <xref linkend="functions-xml-limits-postgresql"/>. + </para> + + <para> + In the SQL standard, the <function>xmlexists</function> function + evaluates an expression in the XML Query language, + but <productname>PostgreSQL</productname> allows only an XPath 1.0 + expression, as discussed in + <xref linkend="functions-xml-limits-xpath1"/>. + </para> + </sect3> + + <sect3 id="xml-is-well-formed"> + <title><literal>xml_is_well_formed</literal></title> + + <indexterm> + <primary>xml_is_well_formed</primary> + </indexterm> + + <indexterm> + <primary>xml_is_well_formed_document</primary> + </indexterm> + + <indexterm> + <primary>xml_is_well_formed_content</primary> + </indexterm> + +<synopsis> +<function>xml_is_well_formed</function> ( <type>text</type> ) <returnvalue>boolean</returnvalue> +<function>xml_is_well_formed_document</function> ( <type>text</type> ) <returnvalue>boolean</returnvalue> +<function>xml_is_well_formed_content</function> ( <type>text</type> ) <returnvalue>boolean</returnvalue> +</synopsis> + + <para> + These functions check whether a <type>text</type> string represents + well-formed XML, returning a Boolean result. + <function>xml_is_well_formed_document</function> checks for a well-formed + document, while <function>xml_is_well_formed_content</function> checks + for well-formed content. <function>xml_is_well_formed</function> does + the former if the <xref linkend="guc-xmloption"/> configuration + parameter is set to <literal>DOCUMENT</literal>, or the latter if it is set to + <literal>CONTENT</literal>. This means that + <function>xml_is_well_formed</function> is useful for seeing whether + a simple cast to type <type>xml</type> will succeed, whereas the other two + functions are useful for seeing whether the corresponding variants of + <function>XMLPARSE</function> will succeed. + </para> + + <para> + Examples: + +<screen><![CDATA[ +SET xmloption TO DOCUMENT; +SELECT xml_is_well_formed('<>'); + xml_is_well_formed +-------------------- + f +(1 row) + +SELECT xml_is_well_formed('<abc/>'); + xml_is_well_formed +-------------------- + t +(1 row) + +SET xmloption TO CONTENT; +SELECT xml_is_well_formed('abc'); + xml_is_well_formed +-------------------- + t +(1 row) + +SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</pg:foo>'); + xml_is_well_formed_document +----------------------------- + t +(1 row) + +SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</my:foo>'); + xml_is_well_formed_document +----------------------------- + f +(1 row) +]]></screen> + + The last example shows that the checks include whether + namespaces are correctly matched. + </para> + </sect3> + </sect2> + + <sect2 id="functions-xml-processing"> + <title>Processing XML</title> + + <para> + To process values of data type <type>xml</type>, PostgreSQL offers + the functions <function>xpath</function> and + <function>xpath_exists</function>, which evaluate XPath 1.0 + expressions, and the <function>XMLTABLE</function> + table function. + </para> + + <sect3 id="functions-xml-processing-xpath"> + <title><literal>xpath</literal></title> + + <indexterm> + <primary>XPath</primary> + </indexterm> + +<synopsis> +<function>xpath</function> ( <parameter>xpath</parameter> <type>text</type>, <parameter>xml</parameter> <type>xml</type> <optional>, <parameter>nsarray</parameter> <type>text[]</type> </optional> ) <returnvalue>xml[]</returnvalue> +</synopsis> + + <para> + The function <function>xpath</function> evaluates the XPath 1.0 + expression <parameter>xpath</parameter> (given as text) + against the XML value + <parameter>xml</parameter>. It returns an array of XML values + corresponding to the node-set produced by the XPath expression. + If the XPath expression returns a scalar value rather than a node-set, + a single-element array is returned. + </para> + + <para> + The second argument must be a well formed XML document. In particular, + it must have a single root node element. + </para> + + <para> + The optional third argument of the function is an array of namespace + mappings. This array should be a two-dimensional <type>text</type> array with + the length of the second axis being equal to 2 (i.e., it should be an + array of arrays, each of which consists of exactly 2 elements). + The first element of each array entry is the namespace name (alias), the + second the namespace URI. It is not required that aliases provided in + this array be the same as those being used in the XML document itself (in + other words, both in the XML document and in the <function>xpath</function> + function context, aliases are <emphasis>local</emphasis>). + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', + ARRAY[ARRAY['my', 'http://example.com']]); + + xpath +-------- + {test} +(1 row) +]]></screen> + </para> + + <para> + To deal with default (anonymous) namespaces, do something like this: +<screen><![CDATA[ +SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>', + ARRAY[ARRAY['mydefns', 'http://example.com']]); + + xpath +-------- + {test} +(1 row) +]]></screen> + </para> + </sect3> + + <sect3 id="functions-xml-processing-xpath-exists"> + <title><literal>xpath_exists</literal></title> + + <indexterm> + <primary>xpath_exists</primary> + </indexterm> + +<synopsis> +<function>xpath_exists</function> ( <parameter>xpath</parameter> <type>text</type>, <parameter>xml</parameter> <type>xml</type> <optional>, <parameter>nsarray</parameter> <type>text[]</type> </optional> ) <returnvalue>boolean</returnvalue> +</synopsis> + + <para> + The function <function>xpath_exists</function> is a specialized form + of the <function>xpath</function> function. Instead of returning the + individual XML values that satisfy the XPath 1.0 expression, this function + returns a Boolean indicating whether the query was satisfied or not + (specifically, whether it produced any value other than an empty node-set). + This function is equivalent to the <literal>XMLEXISTS</literal> predicate, + except that it also offers support for a namespace mapping argument. + </para> + + <para> + Example: +<screen><![CDATA[ +SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', + ARRAY[ARRAY['my', 'http://example.com']]); + + xpath_exists +-------------- + t +(1 row) +]]></screen> + </para> + </sect3> + + <sect3 id="functions-xml-processing-xmltable"> + <title><literal>xmltable</literal></title> + + <indexterm> + <primary>xmltable</primary> + </indexterm> + + <indexterm zone="functions-xml-processing-xmltable"> + <primary>table function</primary> + <secondary>XMLTABLE</secondary> + </indexterm> + +<synopsis> +<function>XMLTABLE</function> ( + <optional> <literal>XMLNAMESPACES</literal> ( <replaceable>namespace_uri</replaceable> <literal>AS</literal> <replaceable>namespace_name</replaceable> <optional>, ...</optional> ), </optional> + <replaceable>row_expression</replaceable> <literal>PASSING</literal> <optional><literal>BY</literal> {<literal>REF</literal>|<literal>VALUE</literal>}</optional> <replaceable>document_expression</replaceable> <optional><literal>BY</literal> {<literal>REF</literal>|<literal>VALUE</literal>}</optional> + <literal>COLUMNS</literal> <replaceable>name</replaceable> { <replaceable>type</replaceable> <optional><literal>PATH</literal> <replaceable>column_expression</replaceable></optional> <optional><literal>DEFAULT</literal> <replaceable>default_expression</replaceable></optional> <optional><literal>NOT NULL</literal> | <literal>NULL</literal></optional> + | <literal>FOR ORDINALITY</literal> } + <optional>, ...</optional> +) <returnvalue>setof record</returnvalue> +</synopsis> + + <para> + The <function>xmltable</function> expression produces a table based + on an XML value, an XPath filter to extract rows, and a + set of column definitions. + Although it syntactically resembles a function, it can only appear + as a table in a query's <literal>FROM</literal> clause. + </para> + + <para> + The optional <literal>XMLNAMESPACES</literal> clause gives a + comma-separated list of namespace definitions, where + each <replaceable>namespace_uri</replaceable> is a <type>text</type> + expression and each <replaceable>namespace_name</replaceable> is a simple + identifier. It specifies the XML namespaces used in the document and + their aliases. A default namespace specification is not currently + supported. + </para> + + <para> + The required <replaceable>row_expression</replaceable> argument is an + XPath 1.0 expression (given as <type>text</type>) that is evaluated, + passing the XML value <replaceable>document_expression</replaceable> as + its context item, to obtain a set of XML nodes. These nodes are what + <function>xmltable</function> transforms into output rows. No rows + will be produced if the <replaceable>document_expression</replaceable> + is null, nor if the <replaceable>row_expression</replaceable> produces + an empty node-set or any value other than a node-set. + </para> + + <para> + <replaceable>document_expression</replaceable> provides the context + item for the <replaceable>row_expression</replaceable>. It must be a + well-formed XML document; fragments/forests are not accepted. + The <literal>BY REF</literal> and <literal>BY VALUE</literal> clauses + are accepted but ignored, as discussed in + <xref linkend="functions-xml-limits-postgresql"/>. + </para> + + <para> + In the SQL standard, the <function>xmltable</function> function + evaluates expressions in the XML Query language, + but <productname>PostgreSQL</productname> allows only XPath 1.0 + expressions, as discussed in + <xref linkend="functions-xml-limits-xpath1"/>. + </para> + + <para> + The required <literal>COLUMNS</literal> clause specifies the + column(s) that will be produced in the output table. + See the syntax summary above for the format. + A name is required for each column, as is a data type + (unless <literal>FOR ORDINALITY</literal> is specified, in which case + type <type>integer</type> is implicit). The path, default and + nullability clauses are optional. + </para> + + <para> + A column marked <literal>FOR ORDINALITY</literal> will be populated + with row numbers, starting with 1, in the order of nodes retrieved from + the <replaceable>row_expression</replaceable>'s result node-set. + At most one column may be marked <literal>FOR ORDINALITY</literal>. + </para> + + <note> + <para> + XPath 1.0 does not specify an order for nodes in a node-set, so code + that relies on a particular order of the results will be + implementation-dependent. Details can be found in + <xref linkend="xml-xpath-1-specifics"/>. + </para> + </note> + + <para> + The <replaceable>column_expression</replaceable> for a column is an + XPath 1.0 expression that is evaluated for each row, with the current + node from the <replaceable>row_expression</replaceable> result as its + context item, to find the value of the column. If + no <replaceable>column_expression</replaceable> is given, then the + column name is used as an implicit path. + </para> + + <para> + If a column's XPath expression returns a non-XML value (which is limited + to string, boolean, or double in XPath 1.0) and the column has a + PostgreSQL type other than <type>xml</type>, the column will be set + as if by assigning the value's string representation to the PostgreSQL + type. (If the value is a boolean, its string representation is taken + to be <literal>1</literal> or <literal>0</literal> if the output + column's type category is numeric, otherwise <literal>true</literal> or + <literal>false</literal>.) + </para> + + <para> + If a column's XPath expression returns a non-empty set of XML nodes + and the column's PostgreSQL type is <type>xml</type>, the column will + be assigned the expression result exactly, if it is of document or + content form. + <footnote> + <para> + A result containing more than one element node at the top level, or + non-whitespace text outside of an element, is an example of content form. + An XPath result can be of neither form, for example if it returns an + attribute node selected from the element that contains it. Such a result + will be put into content form with each such disallowed node replaced by + its string value, as defined for the XPath 1.0 + <function>string</function> function. + </para> + </footnote> + </para> + + <para> + A non-XML result assigned to an <type>xml</type> output column produces + content, a single text node with the string value of the result. + An XML result assigned to a column of any other type may not have more than + one node, or an error is raised. If there is exactly one node, the column + will be set as if by assigning the node's string + value (as defined for the XPath 1.0 <function>string</function> function) + to the PostgreSQL type. + </para> + + <para> + The string value of an XML element is the concatenation, in document order, + of all text nodes contained in that element and its descendants. The string + value of an element with no descendant text nodes is an + empty string (not <literal>NULL</literal>). + Any <literal>xsi:nil</literal> attributes are ignored. + Note that the whitespace-only <literal>text()</literal> node between two non-text + elements is preserved, and that leading whitespace on a <literal>text()</literal> + node is not flattened. + The XPath 1.0 <function>string</function> function may be consulted for the + rules defining the string value of other XML node types and non-XML values. + </para> + + <para> + The conversion rules presented here are not exactly those of the SQL + standard, as discussed in <xref linkend="functions-xml-limits-casts"/>. + </para> + + <para> + If the path expression returns an empty node-set + (typically, when it does not match) + for a given row, the column will be set to <literal>NULL</literal>, unless + a <replaceable>default_expression</replaceable> is specified; then the + value resulting from evaluating that expression is used. + </para> + + <para> + A <replaceable>default_expression</replaceable>, rather than being + evaluated immediately when <function>xmltable</function> is called, + is evaluated each time a default is needed for the column. + If the expression qualifies as stable or immutable, the repeat + evaluation may be skipped. + This means that you can usefully use volatile functions like + <function>nextval</function> in + <replaceable>default_expression</replaceable>. + </para> + + <para> + Columns may be marked <literal>NOT NULL</literal>. If the + <replaceable>column_expression</replaceable> for a <literal>NOT + NULL</literal> column does not match anything and there is + no <literal>DEFAULT</literal> or + the <replaceable>default_expression</replaceable> also evaluates to null, + an error is reported. + </para> + + <para> + Examples: + <screen><![CDATA[ +CREATE TABLE xmldata AS SELECT +xml $$ +<ROWS> + <ROW id="1"> + <COUNTRY_ID>AU</COUNTRY_ID> + <COUNTRY_NAME>Australia</COUNTRY_NAME> + </ROW> + <ROW id="5"> + <COUNTRY_ID>JP</COUNTRY_ID> + <COUNTRY_NAME>Japan</COUNTRY_NAME> + <PREMIER_NAME>Shinzo Abe</PREMIER_NAME> + <SIZE unit="sq_mi">145935</SIZE> + </ROW> + <ROW id="6"> + <COUNTRY_ID>SG</COUNTRY_ID> + <COUNTRY_NAME>Singapore</COUNTRY_NAME> + <SIZE unit="sq_km">697</SIZE> + </ROW> +</ROWS> +$$ AS data; + +SELECT xmltable.* + FROM xmldata, + XMLTABLE('//ROWS/ROW' + PASSING data + COLUMNS id int PATH '@id', + ordinality FOR ORDINALITY, + "COUNTRY_NAME" text, + country_id text PATH 'COUNTRY_ID', + size_sq_km float PATH 'SIZE[@unit = "sq_km"]', + size_other text PATH + 'concat(SIZE[@unit!="sq_km"], " ", SIZE[@unit!="sq_km"]/@unit)', + premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified'); + + id | ordinality | COUNTRY_NAME | country_id | size_sq_km | size_other | premier_name +----+------------+--------------+------------+------------+--------------+--------------- + 1 | 1 | Australia | AU | | | not specified + 5 | 2 | Japan | JP | | 145935 sq_mi | Shinzo Abe + 6 | 3 | Singapore | SG | 697 | | not specified +]]></screen> + + The following example shows concatenation of multiple text() nodes, + usage of the column name as XPath filter, and the treatment of whitespace, + XML comments and processing instructions: + + <screen><![CDATA[ +CREATE TABLE xmlelements AS SELECT +xml $$ + <root> + <element> Hello<!-- xyxxz -->2a2<?aaaaa?> <!--x--> bbb<x>xxx</x>CC </element> + </root> +$$ AS data; + +SELECT xmltable.* + FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text); + element +------------------------- + Hello2a2 bbbxxxCC +]]></screen> + </para> + + <para> + The following example illustrates how + the <literal>XMLNAMESPACES</literal> clause can be used to specify + a list of namespaces + used in the XML document as well as in the XPath expressions: + + <screen><![CDATA[ +WITH xmldata(data) AS (VALUES (' +<example xmlns="http://example.com/myns" xmlns:B="http://example.com/b"> + <item foo="1" B:bar="2"/> + <item foo="3" B:bar="4"/> + <item foo="4" B:bar="5"/> +</example>'::xml) +) +SELECT xmltable.* + FROM XMLTABLE(XMLNAMESPACES('http://example.com/myns' AS x, + 'http://example.com/b' AS "B"), + '/x:example/x:item' + PASSING (SELECT data FROM xmldata) + COLUMNS foo int PATH '@foo', + bar int PATH '@B:bar'); + foo | bar +-----+----- + 1 | 2 + 3 | 4 + 4 | 5 +(3 rows) +]]></screen> + </para> + </sect3> + </sect2> + + <sect2 id="functions-xml-mapping"> + <title>Mapping Tables to XML</title> + + <indexterm zone="functions-xml-mapping"> + <primary>XML export</primary> + </indexterm> + + <para> + The following functions map the contents of relational tables to + XML values. They can be thought of as XML export functionality: +<synopsis> +<function>table_to_xml</function> ( <parameter>table</parameter> <type>regclass</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>query_to_xml</function> ( <parameter>query</parameter> <type>text</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>cursor_to_xml</function> ( <parameter>cursor</parameter> <type>refcursor</type>, <parameter>count</parameter> <type>integer</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + </para> + + <para> + <function>table_to_xml</function> maps the content of the named + table, passed as parameter <parameter>table</parameter>. The + <type>regclass</type> type accepts strings identifying tables using the + usual notation, including optional schema qualification and + double quotes (see <xref linkend="datatype-oid"/> for details). + <function>query_to_xml</function> executes the + query whose text is passed as parameter + <parameter>query</parameter> and maps the result set. + <function>cursor_to_xml</function> fetches the indicated number of + rows from the cursor specified by the parameter + <parameter>cursor</parameter>. This variant is recommended if + large tables have to be mapped, because the result value is built + up in memory by each function. + </para> + + <para> + If <parameter>tableforest</parameter> is false, then the resulting + XML document looks like this: +<screen><![CDATA[ +<tablename> + <row> + <columnname1>data</columnname1> + <columnname2>data</columnname2> + </row> + + <row> + ... + </row> + + ... +</tablename> +]]></screen> + + If <parameter>tableforest</parameter> is true, the result is an + XML content fragment that looks like this: +<screen><![CDATA[ +<tablename> + <columnname1>data</columnname1> + <columnname2>data</columnname2> +</tablename> + +<tablename> + ... +</tablename> + +... +]]></screen> + + If no table name is available, that is, when mapping a query or a + cursor, the string <literal>table</literal> is used in the first + format, <literal>row</literal> in the second format. + </para> + + <para> + The choice between these formats is up to the user. The first + format is a proper XML document, which will be important in many + applications. The second format tends to be more useful in the + <function>cursor_to_xml</function> function if the result values are to be + reassembled into one document later on. The functions for + producing XML content discussed above, in particular + <function>xmlelement</function>, can be used to alter the results + to taste. + </para> + + <para> + The data values are mapped in the same way as described for the + function <function>xmlelement</function> above. + </para> + + <para> + The parameter <parameter>nulls</parameter> determines whether null + values should be included in the output. If true, null values in + columns are represented as: +<screen><![CDATA[ +<columnname xsi:nil="true"/> +]]></screen> + where <literal>xsi</literal> is the XML namespace prefix for XML + Schema Instance. An appropriate namespace declaration will be + added to the result value. If false, columns containing null + values are simply omitted from the output. + </para> + + <para> + The parameter <parameter>targetns</parameter> specifies the + desired XML namespace of the result. If no particular namespace + is wanted, an empty string should be passed. + </para> + + <para> + The following functions return XML Schema documents describing the + mappings performed by the corresponding functions above: +<synopsis> +<function>table_to_xmlschema</function> ( <parameter>table</parameter> <type>regclass</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>query_to_xmlschema</function> ( <parameter>query</parameter> <type>text</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>cursor_to_xmlschema</function> ( <parameter>cursor</parameter> <type>refcursor</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + It is essential that the same parameters are passed in order to + obtain matching XML data mappings and XML Schema documents. + </para> + + <para> + The following functions produce XML data mappings and the + corresponding XML Schema in one document (or forest), linked + together. They can be useful where self-contained and + self-describing results are wanted: +<synopsis> +<function>table_to_xml_and_xmlschema</function> ( <parameter>table</parameter> <type>regclass</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>query_to_xml_and_xmlschema</function> ( <parameter>query</parameter> <type>text</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + </para> + + <para> + In addition, the following functions are available to produce + analogous mappings of entire schemas or the entire current + database: +<synopsis> +<function>schema_to_xml</function> ( <parameter>schema</parameter> <type>name</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>schema_to_xmlschema</function> ( <parameter>schema</parameter> <type>name</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>schema_to_xml_and_xmlschema</function> ( <parameter>schema</parameter> <type>name</type>, <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> + +<function>database_to_xml</function> ( <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>database_to_xmlschema</function> ( <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +<function>database_to_xml_and_xmlschema</function> ( <parameter>nulls</parameter> <type>boolean</type>, + <parameter>tableforest</parameter> <type>boolean</type>, <parameter>targetns</parameter> <type>text</type> ) <returnvalue>xml</returnvalue> +</synopsis> + + These functions ignore tables that are not readable by the current user. + The database-wide functions additionally ignore schemas that the current + user does not have <literal>USAGE</literal> (lookup) privilege for. + </para> + + <para> + Note that these potentially produce a lot of data, which needs to + be built up in memory. When requesting content mappings of large + schemas or databases, it might be worthwhile to consider mapping the + tables separately instead, possibly even through a cursor. + </para> + + <para> + The result of a schema content mapping looks like this: + +<screen><![CDATA[ +<schemaname> + +table1-mapping + +table2-mapping + +... + +</schemaname>]]></screen> + + where the format of a table mapping depends on the + <parameter>tableforest</parameter> parameter as explained above. + </para> + + <para> + The result of a database content mapping looks like this: + +<screen><![CDATA[ +<dbname> + +<schema1name> + ... +</schema1name> + +<schema2name> + ... +</schema2name> + +... + +</dbname>]]></screen> + + where the schema mapping is as above. + </para> + + <para> + As an example of using the output produced by these functions, + <xref linkend="xslt-xml-html"/> shows an XSLT stylesheet that + converts the output of + <function>table_to_xml_and_xmlschema</function> to an HTML + document containing a tabular rendition of the table data. In a + similar manner, the results from these functions can be + converted into other XML-based formats. + </para> + + <example id="xslt-xml-html"> + <title>XSLT Stylesheet for Converting SQL/XML Output to HTML</title> +<programlisting><![CDATA[ +<?xml version="1.0"?> +<xsl:stylesheet version="1.0" + xmlns:xsl="http://www.w3.org/1999/XSL/Transform" + xmlns:xsd="http://www.w3.org/2001/XMLSchema" + xmlns="http://www.w3.org/1999/xhtml" +> + + <xsl:output method="xml" + doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" + doctype-public="-//W3C/DTD XHTML 1.0 Strict//EN" + indent="yes"/> + + <xsl:template match="/*"> + <xsl:variable name="schema" select="//xsd:schema"/> + <xsl:variable name="tabletypename" + select="$schema/xsd:element[@name=name(current())]/@type"/> + <xsl:variable name="rowtypename" + select="$schema/xsd:complexType[@name=$tabletypename]/xsd:sequence/xsd:element[@name='row']/@type"/> + + <html> + <head> + <title><xsl:value-of select="name(current())"/></title> + </head> + <body> + <table> + <tr> + <xsl:for-each select="$schema/xsd:complexType[@name=$rowtypename]/xsd:sequence/xsd:element/@name"> + <th><xsl:value-of select="."/></th> + </xsl:for-each> + </tr> + + <xsl:for-each select="row"> + <tr> + <xsl:for-each select="*"> + <td><xsl:value-of select="."/></td> + </xsl:for-each> + </tr> + </xsl:for-each> + </table> + </body> + </html> + </xsl:template> + +</xsl:stylesheet> +]]></programlisting> + </example> + </sect2> + </sect1> |