diff options
Diffstat (limited to 'doc/manual/xfunc.html')
-rw-r--r-- | doc/manual/xfunc.html | 474 |
1 files changed, 474 insertions, 0 deletions
diff --git a/doc/manual/xfunc.html b/doc/manual/xfunc.html new file mode 100644 index 00000000000..557e9ec0bf1 --- /dev/null +++ b/doc/manual/xfunc.html @@ -0,0 +1,474 @@ +<HTML> +<HEAD> + <TITLE>The POSTGRES95 User Manual - EXTENDING SQL: FUNCTIONS</TITLE> +</HEAD> + +<BODY> + +<font size=-1> +<A HREF="pg95user.html">[ TOC ]</A> +<A HREF="extend.html">[ Previous ]</A> +<A HREF="xtypes.html">[ Next ]</A> +</font> +<HR> +<H1>7. EXTENDING <B>SQL</B>: FUNCTIONS</H1> +<HR> + As it turns out, part of defining a new type is the + definition of functions that describe its behavior. + Consequently, while it is possible to define a new + function without defining a new type, the reverse is + not true. We therefore describe how to add new functions + to POSTGRES before describing how to add new + types. + POSTGRES <B>SQL</B> provides two types of functions: query + language functions (functions written in <B>SQL</B> and + programming language functions (functions written in a + compiled programming language such as <B>C</B>.) Either kind + of function can take a base type, a composite type or + some combination as arguments (parameters). In addition, + both kinds of functions can return a base type or + a composite type. It's easier to define <B>SQL</B> functions, + so we'll start with those. + Examples in this section can also be found in <CODE>funcs.sql</CODE> + and <CODE>C-code/funcs.c</CODE>. +<p> +<H2><A NAME="query-language-sql-functions">7.1. Query Language (<B>SQL</B>) Functions</A></H2> + +<H3><A NAME="sql-functions-on-base-types">7.1.1. <B>SQL</B> Functions on Base Types</A></H3> + The simplest possible <B>SQL</B> function has no arguments and + simply returns a base type, such as <B>int4</B>: + +<pre> CREATE FUNCTION one() RETURNS int4 + AS 'SELECT 1 as RESULT' LANGUAGE 'sql'; + + + SELECT one() AS answer; + + +-------+ + |answer | + +-------+ + |1 | + +-------+ +</pre> + Notice that we defined a target list for the function + (with the name RESULT), but the target list of the + query that invoked the function overrode the function's + target list. Hence, the result is labelled answer + instead of one. +<p> + It's almost as easy to define <B>SQL</B> functions that take + base types as arguments. In the example below, notice + how we refer to the arguments within the function as $1 + and $2. + +<pre> CREATE FUNCTION add_em(int4, int4) RETURNS int4 + AS 'SELECT $1 + $2;' LANGUAGE 'sql'; + + + SELECT add_em(1, 2) AS answer; + + + +-------+ + |answer | + +-------+ + |3 | + +-------+ +</pre> + +<H3>7.1.2. <B>SQL</B> Functions on Composite Types</H3> + When specifying functions with arguments of composite + types (such as EMP), we must not only specify which + argument we want (as we did above with $1 and $2) but + also the attributes of that argument. For example, + take the function double_salary that computes what your + salary would be if it were doubled. + +<pre> CREATE FUNCTION double_salary(EMP) RETURNS int4 + AS 'SELECT $1.salary * 2 AS salary;' LANGUAGE 'sql'; + + SELECT name, double_salary(EMP) AS dream + FROM EMP + WHERE EMP.dept = 'toy'; + + + +-----+-------+ + |name | dream | + +-----+-------+ + |Sam | 2400 | + +-----+-------+ +</pre> + Notice the use of the syntax $1.salary. + Before launching into the subject of functions that + return composite types, we must first introduce the + function notation for projecting attributes. The simple way + to explain this is that we can usually use the + notation attribute(class) and class.attribute interchangably. + +<pre> -- + -- this is the same as: + -- SELECT EMP.name AS youngster FROM EMP WHERE EMP.age < 30 + -- + SELECT name(EMP) AS youngster + FROM EMP + WHERE age(EMP) < 30; + + + +----------+ + |youngster | + +----------+ + |Sam | + +----------+ +</pre> + As we shall see, however, this is not always the case. + This function notation is important when we want to use + a function that returns a single instance. We do this + by assembling the entire instance within the function, + attribute by attribute. This is an example of a function + that returns a single EMP instance: + +<pre> CREATE FUNCTION new_emp() RETURNS EMP + AS 'SELECT \'None\'::text AS name, + 1000 AS salary, + 25 AS age, + \'none\'::char16 AS dept;' + LANGUAGE 'sql'; +</pre> + + In this case we have specified each of the attributes + with a constant value, but any computation or expression + could have been substituted for these constants. + Defining a function like this can be tricky. Some of + the more important caveats are as follows: + + + <UL> + <LI>The target list order must be exactly the same as + that in which the attributes appear in the <B>CREATE + TABLE</B> statement (or when you execute a .* query). + <LI>You must be careful to typecast the expressions + (using ::) very carefully or you will see the following error: + +<pre> WARN::function declared to return type EMP does not retrieve (EMP.*) +</pre> + <LI>When calling a function that returns an instance, we + cannot retrieve the entire instance. We must either + project an attribute out of the instance or pass the + entire instance into another function. +<pre> SELECT name(new_emp()) AS nobody; + + + +-------+ + |nobody | + +-------+ + |None | + +-------+ +</pre> + <LI>The reason why, in general, we must use the function + syntax for projecting attributes of function return + values is that the parser just doesn't understand + the other (dot) syntax for projection when combined + with function calls. + +<pre> SELECT new_emp().name AS nobody; + WARN:parser: syntax error at or near "." +</pre> + </UL> + + Any collection of commands in the <B>SQL</B> query language + can be packaged together and defined as a function. + The commands can include updates (i.e., <B>insert</B>, <B>update</B> + and <B>delete</B>) as well as <B>select</B> queries. However, the + final command must be a <B>select</B> that returns whatever is + specified as the function's returntype. + +<pre> + CREATE FUNCTION clean_EMP () RETURNS int4 + AS 'DELETE FROM EMP WHERE EMP.salary <= 0; + SELECT 1 AS ignore_this' + LANGUAGE 'sql'; + + SELECT clean_EMP(); + + + +--+ + |x | + +--+ + |1 | + +--+ +</pre> +<p> + +<H2><A NAME="programming-language-functions">7.2. Programming Language Functions</A></H2> +<H3><A NAME="programming-language-functions-on-base-types">7.2.1. Programming Language Functions on Base Types</A></H3> + Internally, POSTGRES regards a base type as a "blob of + memory." The user-defined functions that you define + over a type in turn define the way that POSTGRES can + operate on it. That is, POSTGRES will only store and + retrieve the data from disk and use your user-defined + functions to input, process, and output the data. + Base types can have one of three internal formats: + <UL> + <LI>pass by value, fixed-length + <LI>pass by reference, fixed-length + <LI>pass by reference, variable-length + </UL> + By-value types can only be 1, 2 or 4 bytes in length + (even if your computer supports by-value types of other + sizes). POSTGRES itself only passes integer types by + value. You should be careful to define your types such + that they will be the same size (in bytes) on all + architectures. For example, the <B>long</B> type is dangerous + because it is 4 bytes on some machines and 8 bytes on + others, whereas <B>int</B> type is 4 bytes on most <B>UNIX</B> + machines (though not on most personal computers). A + reasonable implementation of the <B>int4</B> type on <B>UNIX</B> + machines might be: + +<pre> /* 4-byte integer, passed by value */ + typedef int int4; +</pre> + + On the other hand, fixed-length types of any size may + be passed by-reference. For example, here is a sample + implementation of the POSTGRES char16 type: + +<pre> /* 16-byte structure, passed by reference */ + typedef struct { + char data[16]; + } char16; +</pre> + + Only pointers to such types can be used when passing + them in and out of POSTGRES functions. + Finally, all variable-length types must also be passed + by reference. All variable-length types must begin + with a length field of exactly 4 bytes, and all data to + be stored within that type must be located in the memory + immediately following that length field. The + length field is the total length of the structure + (i.e., it includes the size of the length field + itself). We can define the text type as follows: + +<pre> typedef struct { + int4 length; + char data[1]; + } text; +</pre> + + Obviously, the data field is not long enough to hold + all possible strings -- it's impossible to declare such + a structure in <B>C</B>. When manipulating variable-length + types, we must be careful to allocate the correct + amount of memory and initialize the length field. For + example, if we wanted to store 40 bytes in a text + structure, we might use a code fragment like this: + +<pre> #include "postgres.h" + #include "utils/palloc.h" + + ... + + char buffer[40]; /* our source data */ + + ... + + text *destination = (text *) palloc(VARHDRSZ + 40); + destination->length = VARHDRSZ + 40; + memmove(destination->data, buffer, 40); + + ... + +</pre> + Now that we've gone over all of the possible structures + for base types, we can show some examples of real functions. + Suppose <CODE>funcs.c</CODE> look like: + +<pre> #include <string.h> + #include "postgres.h" /* for char16, etc. */ + #include "utils/palloc.h" /* for palloc */ + + int + add_one(int arg) + { + return(arg + 1); + } + + char16 * + concat16(char16 *arg1, char16 *arg2) + { + char16 *new_c16 = (char16 *) palloc(sizeof(char16)); + + memset((void *) new_c16, 0, sizeof(char16)); + (void) strncpy(new_c16, arg1, 16); + return (char16 *)(strncat(new_c16, arg2, 16)); + } +<p> + text * + copytext(text *t) + { + /* + * VARSIZE is the total size of the struct in bytes. + */ + text *new_t = (text *) palloc(VARSIZE(t)); +<p> + memset(new_t, 0, VARSIZE(t)); +<p> + VARSIZE(new_t) = VARSIZE(t); + /* + * VARDATA is a pointer to the data region of the struct. + */ + memcpy((void *) VARDATA(new_t), /* destination */ + (void *) VARDATA(t), /* source */ + VARSIZE(t)-VARHDRSZ); /* how many bytes */ +<p> + return(new_t); + } +</pre> + On <B>OSF/1</B> we would type: + +<pre> CREATE FUNCTION add_one(int4) RETURNS int4 + AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c'; + + CREATE FUNCTION concat16(char16, char16) RETURNS char16 + AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c'; + + CREATE FUNCTION copytext(text) RETURNS text + AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c'; +</pre> + + On other systems, we might have to make the filename + end in .sl (to indicate that it's a shared library). +<p> +<H3><A NAME="programming-language-functions-on-composite-types">7.2.2. Programming Language Functions on Composite Types</A></H3> + Composite types do not have a fixed layout like C + structures. Instances of a composite type may contain + null fields. In addition, composite types that are + part of an inheritance hierarchy may have different + fields than other members of the same inheritance hierarchy. + Therefore, POSTGRES provides a procedural + interface for accessing fields of composite types from + C. + As POSTGRES processes a set of instances, each instance + will be passed into your function as an opaque structure of type <B>TUPLE</B>. + Suppose we want to write a function to answer the query + +<pre> * SELECT name, c_overpaid(EMP, 1500) AS overpaid + FROM EMP + WHERE name = 'Bill' or name = 'Sam'; +</pre> + In the query above, we can define c_overpaid as: + +<pre> #include "postgres.h" /* for char16, etc. */ + #include "libpq-fe.h" /* for TUPLE */ +<p> + bool + c_overpaid(TUPLE t,/* the current instance of EMP */ + int4 limit) + { + bool isnull = false; + int4 salary; +<p> + salary = (int4) GetAttributeByName(t, "salary", &isnull); +<p> + if (isnull) + return (false); + return(salary > limit); + } +</pre> + + <B>GetAttributeByName</B> is the POSTGRES system function that + returns attributes out of the current instance. It has + three arguments: the argument of type TUPLE passed into + the function, the name of the desired attribute, and a + return parameter that describes whether the attribute + is null. <B>GetAttributeByName</B> will align data properly + so you can cast its return value to the desired type. + For example, if you have an attribute name which is of + the type char16, the <B>GetAttributeByName</B> call would look + like: + +<pre> char *str; + ... + str = (char *) GetAttributeByName(t, "name", &isnull) +</pre> + + The following query lets POSTGRES know about the + c_overpaid function: + +<pre> * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool + AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c'; +</pre> + While there are ways to construct new instances or modify + existing instances from within a C function, these + are far too complex to discuss in this manual. +<p> +<H3><A NAME="caveats">7.2.3. Caveats</A></H3> + We now turn to the more difficult task of writing + programming language functions. Be warned: this section + of the manual will not make you a programmer. You must + have a good understanding of <B>C</B> (including the use of + pointers and the malloc memory manager) before trying + to write <B>C</B> functions for use with POSTGRES. + While it may be possible to load functions written in + languages other than <B>C</B> into POSTGRES, this is often + difficult (when it is possible at all) because other + languages, such as <B>FORTRAN</B> and <B>Pascal</B> often do not follow + the same "calling convention" as <B>C</B>. That is, other + languages do not pass argument and return values + between functions in the same way. For this reason, we + will assume that your programming language functions + are written in <B>C</B>. + The basic rules for building <B>C</B> functions are as follows: + <OL> + <LI> Most of the header (include) files for POSTGRES + should already be installed in + /usr/local/postgres95/include (see Figure 2). + You should always include + +<pre> -I/usr/local/postgres95/include +</pre> + on your cc command lines. Sometimes, you may + find that you require header files that are in + the server source itself (i.e., you need a file + we neglected to install in include). In those + cases you may need to add one or more of +<pre> + -I/usr/local/postgres95/src/backend + -I/usr/local/postgres95/src/backend/include + -I/usr/local/postgres95/src/backend/port/<PORTNAME> + -I/usr/local/postgres95/src/backend/obj +</pre> + + (where <PORTNAME> is the name of the port, e.g., + alpha or sparc). + <LI> When allocating memory, use the POSTGRES + routines palloc and pfree instead of the + corresponding <B>C</B> library routines malloc and free. + The memory allocated by palloc will be freed + automatically at the end of each transaction, + preventing memory leaks. + <LI> Always zero the bytes of your structures using + memset or bzero. Several routines (such as the + hash access method, hash join and the sort algorithm) + compute functions of the raw bits contained in + your structure. Even if you initialize all fields + of your structure, there may be + several bytes of alignment padding (holes in the + structure) that may contain garbage values. + <LI> Most of the internal POSTGRES types are declared + in postgres.h, so it's usually a good idea to + include that file as well. + <LI> Compiling and loading your object code so that + it can be dynamically loaded into POSTGRES + always requires special flags. See Appendix A + for a detailed explanation of how to do it for + your particular operating system. + </OL> +<HR> +<font size=-1> +<A HREF="pg95user.html">[ TOC ]</A> +<A HREF="extend.html">[ Previous ]</A> +<A HREF="xtypes.html">[ Next ]</A> +</font> +</BODY> +</HTML> |