diff options
author | Bruce Momjian <bruce@momjian.us> | 2007-11-10 23:30:46 +0000 |
---|---|---|
committer | Bruce Momjian <bruce@momjian.us> | 2007-11-10 23:30:46 +0000 |
commit | c3c69ab4fd25a20749b850d34cbc8ce3f1812e3b (patch) | |
tree | a7d9b4501297216ac71f597dca3840090ae791d5 /contrib/xml2 | |
parent | 6e414a171e8a91966b10ecd14aa367422870bdd2 (diff) | |
download | postgresql-c3c69ab4fd25a20749b850d34cbc8ce3f1812e3b.tar.gz postgresql-c3c69ab4fd25a20749b850d34cbc8ce3f1812e3b.zip |
Move most /contrib README files into SGML. Some still need conversion
or will never be converted.
Diffstat (limited to 'contrib/xml2')
-rw-r--r-- | contrib/xml2/README.xml2 | 278 |
1 files changed, 0 insertions, 278 deletions
diff --git a/contrib/xml2/README.xml2 b/contrib/xml2/README.xml2 deleted file mode 100644 index 28d3db0e006..00000000000 --- a/contrib/xml2/README.xml2 +++ /dev/null @@ -1,278 +0,0 @@ -XML-handling functions for PostgreSQL -===================================== - - DEPRECATION NOTICE: From PostgreSQL 8.3 on, there is XML-related - functionality based on the SQL/XML standard in the core server. - That functionality covers XML syntax checking and XPath queries, - which is what this module does as well, and more, but the API is - not at all compatible. It is planned that this module will be - removed in PostgreSQL 8.4 in favor of the newer standard API, so - you are encouraged to try converting your applications. If you - find that some of the functionality of this module is not - available in an adequate form with the newer API, please explain - your issue to pgsql-hackers@postgresql.org so that the deficiency - can be addressed. - -- Peter Eisentraut, 2007-05-24 - -Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com) -It has the same BSD licence as PostgreSQL. - -This version of the XML functions provides both XPath querying and -XSLT functionality. There is also a new table function which allows -the straightforward return of multiple XML results. Note that the current code -doesn't take any particular care over character sets - this is -something that should be fixed at some point! - -Installation ------------- - -The current build process will only work if the files are in -contrib/xml2 in a PostgreSQL 7.3 or later source tree which has been -configured and built (If you alter the subdir value in the Makefile -you can place it in a different directory in a PostgreSQL tree). - -Before you begin, just check the Makefile, and then just 'make' and -'make install'. - -By default, this module requires both libxml2 and libxslt to be installed -on your system. If you do not have libxslt or do not want to use XSLT -functions, you must edit the Makefile to not build the XSLT functions, -as directed in its comments; and edit pgxml.sql.in to remove the XSLT -function declarations, as directed in its comments. - -Description of functions ------------------------- - -The first set of functions are straightforward XML parsing and XPath queries: - -xml_is_well_formed(document) RETURNS bool - -This parses the document text in its parameter and returns true if the -document is well-formed XML. (Note: before PostgreSQL 8.2, this function -was called xml_valid(). That is the wrong name since validity and -well-formedness have different meanings in XML. The old name is still -available, but is deprecated and will be removed in 8.3.) - -xpath_string(document,query) RETURNS text -xpath_number(document,query) RETURNS float4 -xpath_bool(document,query) RETURNS bool - -These functions evaluate the XPath query on the supplied document, and -cast the result to the specified type. - - -xpath_nodeset(document,query,toptag,itemtag) RETURNS text - -This evaluates query on document and wraps the result in XML tags. If -the result is multivalued, the output will look like: - -<toptag> -<itemtag>Value 1 which could be an XML fragment</itemtag> -<itemtag>Value 2....</itemtag> -</toptag> - -If either toptag or itemtag is an empty string, the relevant tag is omitted. -There are also wrapper functions for this operation: - -xpath_nodeset(document,query) RETURNS text omits both tags. -xpath_nodeset(document,query,itemtag) RETURNS text omits toptag. - - -xpath_list(document,query,seperator) RETURNS text - -This function returns multiple values seperated by the specified -seperator, e.g. Value 1,Value 2,Value 3 if seperator=','. - -xpath_list(document,query) RETURNS text - -This is a wrapper for the above function that uses ',' as the seperator. - - -xpath_table ------------ - -This is a table function which evaluates a set of XPath queries on -each of a set of documents and returns the results as a table. The -primary key field from the original document table is returned as the -first column of the result so that the resultset from xpath_table can -be readily used in joins. - -The function itself takes 5 arguments, all text. - -xpath_table(key,document,relation,xpaths,criteria) - -key - the name of the "key" field - this is just a field to be used as -the first column of the output table i.e. it identifies the record from -which each output row came (see note below about multiple values). - -document - the name of the field containing the XML document - -relation - the name of the table or view containing the documents - -xpaths - multiple xpath expressions separated by | - -criteria - The contents of the where clause. This needs to be specified, -so use "true" or "1=1" here if you want to process all the rows in the -relation. - -NB These parameters (except the XPath strings) are just substituted -into a plain SQL SELECT statement, so you have some flexibility - the -statement is - -SELECT <key>,<document> FROM <relation> WHERE <criteria> - -so those parameters can be *anything* valid in those particular -locations. The result from this SELECT needs to return exactly two -columns (which it will unless you try to list multiple fields for key -or document). Beware that this simplistic approach requires that you -validate any user-supplied values to avoid SQL injection attacks. - -Using the function - -The function has to be used in a FROM expression. This gives the following -form: - -SELECT * FROM -xpath_table('article_id', - 'article_xml', - 'articles', - '/article/author|/article/pages|/article/title', - 'date_entered > ''2003-01-01'' ') -AS t(article_id integer, author text, page_count integer, title text); - -The AS clause defines the names and types of the columns in the -virtual table. If there are more XPath queries than result columns, -the extra queries will be ignored. If there are more result columns -than XPath queries, the extra columns will be NULL. - -Note that I've said in this example that pages is an integer. The -function deals internally with string representations, so when you say -you want an integer in the output, it will take the string -representation of the XPath result and use PostgreSQL input functions -to transform it into an integer (or whatever type the AS clause -requests). An error will result if it can't do this - for example if -the result is empty - so you may wish to just stick to 'text' as the -column type if you think your data has any problems. - -The select statement doesn't need to use * alone - it can reference the -columns by name or join them to other tables. The function produces a -virtual table with which you can perform any operation you wish (e.g. -aggregation, joining, sorting etc). So we could also have: - -SELECT t.title, p.fullname, p.email -FROM xpath_table('article_id','article_xml','articles', - '/article/title|/article/author/@id', - 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ') - AS t(article_id integer, title text, author_id integer), - tblPeopleInfo AS p -WHERE t.author_id = p.person_id; - -as a more complicated example. Of course, you could wrap all -of this in a view for convenience. - -Multivalued results - -The xpath_table function assumes that the results of each XPath query -might be multi-valued, so the number of rows returned by the function -may not be the same as the number of input documents. The first row -returned contains the first result from each query, the second row the -second result from each query. If one of the queries has fewer values -than the others, NULLs will be returned instead. - -In some cases, a user will know that a given XPath query will return -only a single result (perhaps a unique document identifier) - if used -alongside an XPath query returning multiple results, the single-valued -result will appear only on the first row of the result. The solution -to this is to use the key field as part of a join against a simpler -XPath query. As an example: - - -CREATE TABLE test -( - id int4 NOT NULL, - xml text, - CONSTRAINT pk PRIMARY KEY (id) -) -WITHOUT OIDS; - -INSERT INTO test VALUES (1, '<doc num="C1"> -<line num="L1"><a>1</a><b>2</b><c>3</c></line> -<line num="L2"><a>11</a><b>22</b><c>33</c></line> -</doc>'); - -INSERT INTO test VALUES (2, '<doc num="C2"> -<line num="L1"><a>111</a><b>222</b><c>333</c></line> -<line num="L2"><a>111</a><b>222</b><c>333</c></line> -</doc>'); - - -The query: - -SELECT * FROM xpath_table('id','xml','test', -'/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1') -AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4, -val2 int4, val3 int4) -WHERE id = 1 ORDER BY doc_num, line_num - - -Gives the result: - - id | doc_num | line_num | val1 | val2 | val3 -----+---------+----------+------+------+------ - 1 | C1 | L1 | 1 | 2 | 3 - 1 | | L2 | 11 | 22 | 33 - -To get doc_num on every line, the solution is to use two invocations -of xpath_table and join the results: - -SELECT t.*,i.doc_num FROM - xpath_table('id','xml','test', - '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1') - AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4), - xpath_table('id','xml','test','/doc/@num','1=1') - AS i(id int4, doc_num varchar(10)) -WHERE i.id=t.id AND i.id=1 -ORDER BY doc_num, line_num; - -which gives the desired result: - - id | line_num | val1 | val2 | val3 | doc_num -----+----------+------+------+------+--------- - 1 | L1 | 1 | 2 | 3 | C1 - 1 | L2 | 11 | 22 | 33 | C1 -(2 rows) - - - -XSLT functions --------------- - -The following functions are available if libxslt is installed (this is -not currently detected automatically, so you will have to amend the -Makefile) - -xslt_process(document,stylesheet,paramlist) RETURNS text - -This function appplies the XSL stylesheet to the document and returns -the transformed result. The paramlist is a list of parameter -assignments to be used in the transformation, specified in the form -'a=1,b=2'. Note that this is also proof-of-concept code and the -parameter parsing is very simple-minded (e.g. parameter values cannot -contain commas!) - -Also note that if either the document or stylesheet values do not -begin with a < then they will be treated as URLs and libxslt will -fetch them. It thus follows that you can use xslt_process as a means -to fetch the contents of URLs - you should be aware of the security -implications of this. - -There is also a two-parameter version of xslt_process which does not -pass any parameters to the transformation. - - -Feedback --------- - -If you have any comments or suggestions, please do contact me at -jgray@azuli.co.uk. Unfortunately, this isn't my main job, so I can't -guarantee a rapid response to your query! |