diff options
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/advanced.sgml | 250 | ||||
-rw-r--r-- | doc/src/sgml/errcodes.sgml | 20 | ||||
-rw-r--r-- | doc/src/sgml/func.sgml | 274 | ||||
-rw-r--r-- | doc/src/sgml/queries.sgml | 53 | ||||
-rw-r--r-- | doc/src/sgml/query.sgml | 10 | ||||
-rw-r--r-- | doc/src/sgml/ref/select.sgml | 79 | ||||
-rw-r--r-- | doc/src/sgml/ref/select_into.sgml | 3 | ||||
-rw-r--r-- | doc/src/sgml/syntax.sgml | 97 | ||||
-rw-r--r-- | doc/src/sgml/xaggr.sgml | 9 |
9 files changed, 770 insertions, 25 deletions
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml index 2ecb2da5c56..ce8ef535dba 100644 --- a/doc/src/sgml/advanced.sgml +++ b/doc/src/sgml/advanced.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ --> <chapter id="tutorial-advanced"> <title>Advanced Features</title> @@ -240,7 +240,7 @@ COMMIT; <para> <productname>PostgreSQL</> actually treats every SQL statement as being executed within a transaction. If you do not issue a <command>BEGIN</> - command, + command, then each individual statement has an implicit <command>BEGIN</> and (if successful) <command>COMMIT</> wrapped around it. A group of statements surrounded by <command>BEGIN</> and <command>COMMIT</> @@ -265,7 +265,7 @@ COMMIT; with <command>ROLLBACK TO</>. All the transaction's database changes between defining the savepoint and rolling back to it are discarded, but changes earlier than the savepoint are kept. - </para> + </para> <para> After rolling back to a savepoint, it continues to be defined, so you can @@ -274,7 +274,7 @@ COMMIT; system can free some resources. Keep in mind that either releasing or rolling back to a savepoint will automatically release all savepoints that were defined after it. - </para> + </para> <para> All this is happening within the transaction block, so none of it @@ -282,7 +282,7 @@ COMMIT; transaction block, the committed actions become visible as a unit to other sessions, while the rolled-back actions never become visible at all. - </para> + </para> <para> Remembering the bank database, suppose we debit $100.00 from Alice's @@ -317,6 +317,242 @@ COMMIT; </sect1> + <sect1 id="tutorial-window"> + <title id="tutorial-window-title">Window Functions</title> + + <indexterm zone="tutorial-window"> + <primary>window function</primary> + </indexterm> + + <para> + A <firstterm>window function</> performs a calculation across a set of + table rows that are somehow related to the current row. This is comparable + to the type of calculation that can be done with an aggregate function. + But unlike regular aggregate functions, use of a window function does not + cause rows to become grouped into a single output row — the + rows retain their separate identities. Behind the scenes, the window + function is able to access more than just the current row of the query + result. + </para> + + <para> + Here is an example that shows how to compare each employee's salary + with the average salary in his or her department: + +<programlisting> +SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary; +</programlisting> + +<screen> + depname | empno | salary | avg +-----------+-------+--------+----------------------- + develop | 11 | 5200 | 5020.0000000000000000 + develop | 7 | 4200 | 5020.0000000000000000 + develop | 9 | 4500 | 5020.0000000000000000 + develop | 8 | 6000 | 5020.0000000000000000 + develop | 10 | 5200 | 5020.0000000000000000 + personnel | 5 | 3500 | 3700.0000000000000000 + personnel | 2 | 3900 | 3700.0000000000000000 + sales | 3 | 4800 | 4866.6666666666666667 + sales | 1 | 5000 | 4866.6666666666666667 + sales | 4 | 4800 | 4866.6666666666666667 +(10 rows) +</screen> + + The first three output columns come directly from the table + <structname>empsalary</>, and there is one output row for each row in the + table. The fourth column represents an average taken across all the table + rows that have the same <structfield>depname</> value as the current row. + (This actually is the same function as the regular <function>avg</> + aggregate function, but the <literal>OVER</> clause causes it to be + treated as a window function and computed across an appropriate set of + rows.) + </para> + + <para> + A window function call always contains an <literal>OVER</> clause + following the window function's name and argument(s). This is what + syntactically distinguishes it from a regular function or aggregate + function. The <literal>OVER</> clause determines exactly how the + rows of the query are split up for processing by the window function. + The <literal>PARTITION BY</> list within <literal>OVER</> specifies + dividing the rows into groups, or partitions, that share the same + values of the <literal>PARTITION BY</> expression(s). For each row, + the window function is computed across the rows that fall into the + same partition as the current row. + </para> + + <para> + Although <function>avg</> will produce the same result no matter + what order it processes the partition's rows in, this is not true of all + window functions. When needed, you can control that order using + <literal>ORDER BY</> within <literal>OVER</>. Here is an example: + +<programlisting> +SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary; +</programlisting> + +<screen> + depname | empno | salary | rank +-----------+-------+--------+------ + develop | 8 | 6000 | 1 + develop | 10 | 5200 | 2 + develop | 11 | 5200 | 2 + develop | 9 | 4500 | 4 + develop | 7 | 4200 | 5 + personnel | 2 | 3900 | 1 + personnel | 5 | 3500 | 2 + sales | 1 | 5000 | 1 + sales | 4 | 4800 | 2 + sales | 3 | 4800 | 2 +(10 rows) +</screen> + + As shown here, the <function>rank</> function produces a numerical rank + within the current row's partition for each distinct <literal>ORDER BY</> + value, in the order defined by the <literal>ORDER BY</> clause. + <function>rank</> needs no explicit parameter, because its behavior + is entirely determined by the <literal>OVER</> clause. + </para> + + <para> + The rows considered by a window function are those of the <quote>virtual + table</> produced by the query's <literal>FROM</> clause as filtered by its + <literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses + if any. For example, a row removed because it does not meet the + <literal>WHERE</> condition is not seen by any window function. + A query can contain multiple window functions that slice up the data + in different ways by means of different <literal>OVER</> clauses, but + they all act on the same collection of rows defined by this virtual table. + </para> + + <para> + We already saw that <literal>ORDER BY</> can be omitted if the ordering + of rows is not important. It is also possible to omit <literal>PARTITION + BY</>, in which case the window function is computed over all rows of the + virtual table; that is, there is one partition containing all the rows. + </para> + + <para> + There is another important concept associated with window functions: + for each row, there is a set of rows within its partition called its + <firstterm>window frame</>. When <literal>ORDER BY</> is omitted the + frame is always the same as the partition. If <literal>ORDER BY</> is + supplied, the frame consists of all rows from the start of the partition + up to the current row, plus any following rows that are equal to the + current row according to the <literal>ORDER BY</> clause. + <footnote> + <para> + The SQL standard includes options to define the window frame in + other ways, but this definition is the only one currently supported + by <productname>PostgreSQL</productname>. + </para> + </footnote> + Many window functions act only on the rows of the window frame, rather + than of the whole partition. Here is an example using <function>sum</>: + </para> + +<programlisting> +SELECT salary, sum(salary) OVER () FROM empsalary; +</programlisting> + +<screen> + salary | sum +--------+------- + 5200 | 47100 + 5000 | 47100 + 3500 | 47100 + 4800 | 47100 + 3900 | 47100 + 4200 | 47100 + 4500 | 47100 + 4800 | 47100 + 6000 | 47100 + 5200 | 47100 +(10 rows) +</screen> + + <para> + Above, since there is no <literal>ORDER BY</> in the <literal>OVER</> + clause, the window frame is the same as the partition, which for lack of + <literal>PARTITION BY</> is the whole table; in other words each sum is + taken over the whole table and so we get the same result for each output + row. But if we add an <literal>ORDER BY</> clause, we get very different + results: + </para> + +<programlisting> +SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary; +</programlisting> + +<screen> + salary | sum +--------+------- + 3500 | 3500 + 3900 | 7400 + 4200 | 11600 + 4500 | 16100 + 4800 | 25700 + 4800 | 25700 + 5000 | 30700 + 5200 | 41100 + 5200 | 41100 + 6000 | 47100 +(10 rows) +</screen> + + <para> + Here the sum is taken from the first (lowest) salary up through the + current one, including any duplicates of the current one (notice the + results for the duplicated salaries). + </para> + + <para> + Window functions are permitted only in the <literal>SELECT</literal> list + and the <literal>ORDER BY</> clause of the query. They are forbidden + elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</> + and <literal>WHERE</literal> clauses. This is because they logically + execute after the processing of those clauses. Also, window functions + execute after regular aggregate functions. This means it is valid to + include an aggregate function call in the arguments of a window function, + but not vice versa. + </para> + + <para> + If there is a need to filter or group rows after the window calculations + are performed, you can use a sub-select. For example: + +<programlisting> +SELECT depname, empno, salary, enroll_date +FROM + (SELECT depname, empno, salary, enroll_date, + rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos + FROM empsalary + ) AS ss +WHERE pos < 3; +</programlisting> + + The above query only shows the rows from the inner query having + <literal>rank</> less than <literal>3</>. + </para> + + <para> + When a query involves multiple window functions, it is possible to write + out each one with a separate <literal>OVER</> clause, but this is + duplicative and error-prone if the same windowing behavior is wanted + for several functions. Instead, each windowing behavior can be named + in a <literal>WINDOW</> clause and then referenced in <literal>OVER</>. + For example: + +<programlisting> +SELECT sum(salary) OVER w, avg(salary) OVER w + FROM empsalary + WINDOW w AS (PARTITION BY depname ORDER BY salary DESC); +</programlisting> + </para> + </sect1> + + <sect1 id="tutorial-inheritance"> <title>Inheritance</title> @@ -391,7 +627,7 @@ CREATE TABLE capitals ( <para> For example, the following query finds the names of all cities, - including state capitals, that are located at an altitude + including state capitals, that are located at an altitude over 500 feet: <programlisting> @@ -455,7 +691,7 @@ SELECT name, altitude <sect1 id="tutorial-conclusion"> <title>Conclusion</title> - + <para> <productname>PostgreSQL</productname> has many features not touched upon in this tutorial introduction, which has been diff --git a/doc/src/sgml/errcodes.sgml b/doc/src/sgml/errcodes.sgml index 574e7f5fbad..e792a74e286 100644 --- a/doc/src/sgml/errcodes.sgml +++ b/doc/src/sgml/errcodes.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.26 2008/12/28 18:53:53 tgl Exp $ --> <appendix id="errcodes-appendix"> <title><productname>PostgreSQL</productname> Error Codes</title> @@ -379,6 +379,18 @@ </row> <row> +<entry><literal>22014</literal></entry> +<entry>INVALID ARGUMENT FOR NTILE FUNCTION</entry> +<entry>invalid_argument_for_ntile_function</entry> +</row> + +<row> +<entry><literal>22016</literal></entry> +<entry>INVALID ARGUMENT FOR NTH_VALUE FUNCTION</entry> +<entry>invalid_argument_for_nth_value_function</entry> +</row> + +<row> <entry><literal>2201F</literal></entry> <entry>INVALID ARGUMENT FOR POWER FUNCTION</entry> <entry>invalid_argument_for_power_function</entry> @@ -991,6 +1003,12 @@ </row> <row> +<entry><literal>42P20</literal></entry> +<entry>WINDOWING ERROR</entry> +<entry>windowing_error</entry> +</row> + +<row> <entry><literal>42P19</literal></entry> <entry>INVALID RECURSION</entry> <entry>invalid_recursion</entry> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index de50c0e1d56..205b71e9c9e 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.463 2008/12/19 16:25:16 petere Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ --> <chapter id="functions"> <title>Functions and Operators</title> @@ -10149,6 +10149,278 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; </sect1> + <sect1 id="functions-window"> + <title>Window Functions</title> + + <indexterm zone="functions-window"> + <primary>window function</primary> + <secondary>built-in</secondary> + </indexterm> + + <para> + <firstterm>Window functions</firstterm> provide the ability to perform + calculations across sets of rows that are related to the current query + row. For information about this feature see + <xref linkend="tutorial-window"> and + <xref linkend="syntax-window-functions">. + </para> + + <para> + The built-in window functions are listed in + <xref linkend="functions-window-table">. Note that these functions + <emphasis>must</> be invoked using window function syntax; that is an + <literal>OVER</> clause is required. + </para> + + <para> + In addition to these functions, any built-in or user-defined aggregate + function can be used as a window function (see + <xref linkend="functions-aggregate"> for a list of the built-in aggregates). + Aggregate functions act as window functions only when an <literal>OVER</> + clause follows the call; otherwise they act as regular aggregates. + </para> + + <table id="functions-window-table"> + <title>General-Purpose Window Functions</title> + + <tgroup cols="3"> + <thead> + <row> + <entry>Function</entry> + <entry>Return Type</entry> + <entry>Description</entry> + </row> + </thead> + + <tbody> + <row> + <entry> + <indexterm> + <primary>row_number</primary> + </indexterm> + <function>row_number()</function> + </entry> + <entry> + <type>bigint</type> + </entry> + <entry>number of the current row within its partition, counting from 1</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>rank</primary> + </indexterm> + <function>rank()</function> + </entry> + <entry> + <type>bigint</type> + </entry> + <entry>rank of the current row with gaps; same as <function>row_number</> of its first peer</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>dense_rank</primary> + </indexterm> + <function>dense_rank()</function> + </entry> + <entry> + <type>bigint</type> + </entry> + <entry>rank of the current row without gaps; this function counts peer groups</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>percent_rank</primary> + </indexterm> + <function>percent_rank()</function> + </entry> + <entry> + <type>double precision</type> + </entry> + <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>cume_dist</primary> + </indexterm> + <function>cume_dist()</function> + </entry> + <entry> + <type>double precision</type> + </entry> + <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>ntile</primary> + </indexterm> + <function>ntile(<replaceable class="parameter">num_buckets</replaceable> <type>integer</>)</function> + </entry> + <entry> + <type>integer</type> + </entry> + <entry>integer ranging from 1 to the argument value, dividing the + partition as equally as possible</entry> + </row> + + <row> + <entry> + <indexterm> + <primary>lag</primary> + </indexterm> + <function> + lag(<replaceable class="parameter">value</replaceable> <type>any</> + [, <replaceable class="parameter">offset</replaceable> <type>integer</> + [, <replaceable class="parameter">default</replaceable> <type>any</> ]]) + </function> + </entry> + <entry> + <type>same type as <replaceable class="parameter">value</replaceable></type> + </entry> + <entry> + returns <replaceable class="parameter">value</replaceable> evaluated at + the row that is <replaceable class="parameter">offset</replaceable> + rows before the current row within the partition; if there is no such + row, instead return <replaceable class="parameter">default</replaceable>. + Both <replaceable class="parameter">offset</replaceable> and + <replaceable class="parameter">default</replaceable> are evaluated + with respect to the current row. If omitted, + <replaceable class="parameter">offset</replaceable> defaults to 1 and + <replaceable class="parameter">default</replaceable> to null + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>lead</primary> + </indexterm> + <function> + lead(<replaceable class="parameter">value</replaceable> <type>any</> + [, <replaceable class="parameter">offset</replaceable> <type>integer</> + [, <replaceable class="parameter">default</replaceable> <type>any</> ]]) + </function> + </entry> + <entry> + <type>same type as <replaceable class="parameter">value</replaceable></type> + </entry> + <entry> + returns <replaceable class="parameter">value</replaceable> evaluated at + the row that is <replaceable class="parameter">offset</replaceable> + rows after the current row within the partition; if there is no such + row, instead return <replaceable class="parameter">default</replaceable>. + Both <replaceable class="parameter">offset</replaceable> and + <replaceable class="parameter">default</replaceable> are evaluated + with respect to the current row. If omitted, + <replaceable class="parameter">offset</replaceable> defaults to 1 and + <replaceable class="parameter">default</replaceable> to null + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>first_value</primary> + </indexterm> + <function>first_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function> + </entry> + <entry> + <type>same type as <replaceable class="parameter">value</replaceable></type> + </entry> + <entry> + returns <replaceable class="parameter">value</replaceable> evaluated + at the row that is the first row of the window frame + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>last_value</primary> + </indexterm> + <function>last_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function> + </entry> + <entry> + <type>same type as <replaceable class="parameter">value</replaceable></type> + </entry> + <entry> + returns <replaceable class="parameter">value</replaceable> evaluated + at the row that is the last row of the window frame + </entry> + </row> + + <row> + <entry> + <indexterm> + <primary>nth_value</primary> + </indexterm> + <function> + nth_value(<replaceable class="parameter">value</replaceable> <type>any</>, <replaceable class="parameter">nth</replaceable> <type>integer</>) + </function> + </entry> + <entry> + <type>same type as <replaceable class="parameter">value</replaceable></type> + </entry> + <entry> + returns <replaceable class="parameter">value</replaceable> evaluated + at the row that is the <replaceable class="parameter">nth</replaceable> + row of the window frame (counting from 1); null if no such row + </entry> + </row> + </tbody> + </tgroup> + </table> + + <para> + All of the functions listed in + <xref linkend="functions-window-table"> depend on the sort ordering + specified by the <literal>ORDER BY</> clause of the associated window + definition. Rows that are not distinct in the <literal>ORDER BY</> + ordering are said to be <firstterm>peers</>; the four ranking functions + are defined so that they give the same answer for any two peer rows. + </para> + + <para> + Note that <function>first_value</>, <function>last_value</>, and + <function>nth_value</> consider only the rows within the <quote>window + frame</>, that is the rows from the start of the partition through the + last peer of the current row. This is particularly likely to give + unintuitive results for <function>last_value</>. + </para> + + <para> + When an aggregate function is used as a window function, it aggregates + over the rows within the current row's window frame. To obtain + aggregation over the whole partition, be sure to omit <literal>ORDER BY</> + from the window definition. An aggregate used with <literal>ORDER BY</> + produces a <quote>running sum</> type of behavior, which may or may not + be what's wanted. + </para> + + <note> + <para> + The SQL standard defines a <literal>RESPECT NULLS</> or + <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>, + <function>first_value</>, <function>last_value</>, and + <function>nth_value</>. This is not implemented in + <productname>PostgreSQL</productname>: the behavior is always the + same as the standard's default, namely <literal>RESPECT NULLS</>. + Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</> + option for <function>nth_value</> is not implemented: only the + default <literal>FROM FIRST</> behavior is supported. + </para> + </note> + + </sect1> <sect1 id="functions-subquery"> <title>Subquery Expressions</title> diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index 283dd0a73dd..f1db64b273a 100644 --- a/doc/src/sgml/queries.sgml +++ b/doc/src/sgml/queries.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.50 2008/10/14 00:41:34 tgl Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ --> <chapter id="queries"> <title>Queries</title> @@ -949,6 +949,57 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit 5000. Note that the aggregate expressions do not necessarily need to be the same in all parts of the query. </para> + + <para> + If a query contains aggregate function calls, but no <literal>GROUP BY</> + clause, grouping still occurs: the result is a single group row (or + perhaps no rows at all, if the single row is then eliminated by + <literal>HAVING</>). + The same is true if it contains a <literal>HAVING</> clause, even + without any aggregate function calls or <literal>GROUP BY</> clause. + </para> + </sect2> + + <sect2 id="queries-window"> + <title>Window Function Processing</> + + <indexterm zone="queries-window"> + <primary>window function</primary> + <secondary>order of execution</> + </indexterm> + + <para> + If the query contains any window functions (see + <xref linkend="tutorial-window"> and + <xref linkend="syntax-window-functions">), these functions are evaluated + after any grouping, aggregation, and <literal>HAVING</> filtering is + performed. That is, if the query uses any aggregates, <literal>GROUP + BY</>, or <literal>HAVING</>, then the rows seen by the window functions + are the group rows instead of the original table rows from + <literal>FROM</>/<literal>WHERE</>. + </para> + + <para> + When multiple window functions are used, all the window functions having + syntactically equivalent <literal>PARTITION BY</> and <literal>ORDER BY</> + clauses in their window definitions are guaranteed to be evaluated in a + single pass over the data. Therefore they will see the same sort ordering, + even if the <literal>ORDER BY</> does not uniquely determine an ordering. + However, no guarantees are made about the evaluation of functions having + different <literal>PARTITION BY</> or <literal>ORDER BY</> specifications. + (In such cases a sort step is typically required between the passes of + window function evaluations, and the sort is not guaranteed to preserve + ordering of rows that its <literal>ORDER BY</> sees as equivalent.) + </para> + + <para> + Currently, use of window functions always forces sorting, and so the + query output will be ordered according to one or another of the window + functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses. + It is not recommendable to rely on this, however. Use an explicit + top-level <literal>ORDER BY</> clause if you want to be sure the + results are sorted in a particular way. + </para> </sect2> </sect1> diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml index 442f9ad0068..ffc641b03ad 100644 --- a/doc/src/sgml/query.sgml +++ b/doc/src/sgml/query.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.50 2007/02/01 00:28:17 momjian Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ --> <chapter id="tutorial-sql"> <title>The <acronym>SQL</acronym> Language</title> @@ -621,7 +621,7 @@ SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high, San Francisco | 43 | 57 | San Francisco | 46 | 50 Hayward | 37 | 54 | San Francisco | 46 | 50 (2 rows) -</programlisting> +</programlisting> Here we have relabeled the weather table as <literal>W1</> and <literal>W2</> to be able to distinguish the left and right side @@ -651,9 +651,9 @@ SELECT * <indexterm><primary>min</primary></indexterm> <indexterm><primary>sum</primary></indexterm> - Like most other relational database products, + Like most other relational database products, <productname>PostgreSQL</productname> supports - aggregate functions. + <firstterm>aggregate functions</>. An aggregate function computes a single result from multiple input rows. For example, there are aggregates to compute the <function>count</function>, <function>sum</function>, @@ -815,7 +815,7 @@ SELECT city, max(temp_lo) <para> You can update existing rows using the - <command>UPDATE</command> command. + <command>UPDATE</command> command. Suppose you discover the temperature readings are all off by 2 degrees after November 28. You can correct the data as follows: diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml index 814a6708f00..c9a386f24f3 100644 --- a/doc/src/sgml/ref/select.sgml +++ b/doc/src/sgml/ref/select.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.112 2008/12/01 09:38:08 petere Exp $ +$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.113 2008/12/28 18:53:54 tgl Exp $ PostgreSQL documentation --> @@ -39,6 +39,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac [ WHERE <replaceable class="parameter">condition</replaceable> ] [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ] [ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ] + [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ] [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ] [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ] [ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ] @@ -566,6 +567,67 @@ HAVING <replaceable class="parameter">condition</replaceable> </para> </refsect2> + <refsect2 id="SQL-WINDOW"> + <title id="sql-window-title"><literal>WINDOW</literal> Clause</title> + + <para> + The optional <literal>WINDOW</literal> clause has the general form +<synopsis> +WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] +</synopsis> + where <replaceable class="parameter">window_name</replaceable> is + a name that can be referenced from subsequent window definitions or + <literal>OVER</> clauses, and + <replaceable class="parameter">window_definition</replaceable> is +<synopsis> +[ <replaceable class="parameter">existing_window_name</replaceable> ] +[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ] +[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ] +</synopsis> + The elements of the <literal>PARTITION BY</> list are interpreted in + the same fashion as elements of a + <xref linkend="sql-groupby" endterm="sql-groupby-title">, and + the elements of the <literal>ORDER BY</> list are interpreted in the + same fashion as elements of an + <xref linkend="sql-orderby" endterm="sql-orderby-title">. + The only difference is that these expressions can contain aggregate + function calls, which are not allowed in a regular <literal>GROUP BY</> + clause. They are allowed here because windowing occurs after grouping + and aggregation. + </para> + + <para> + If an <replaceable class="parameter">existing_window_name</replaceable> + is specified it must refer to an earlier entry in the <literal>WINDOW</> + list; the new window copies its partitioning clause from that entry, + as well as its ordering clause if any. In this case the new window cannot + specify its own <literal>PARTITION BY</> clause, and it can specify + <literal>ORDER BY</> only if the copied window does not have one. + </para> + + <para> + The purpose of a <literal>WINDOW</literal> clause is to specify the + behavior of <firstterm>window functions</> appearing in the query's + <xref linkend="sql-select-list" endterm="sql-select-list-title"> or + <xref linkend="sql-orderby" endterm="sql-orderby-title">. These functions + can reference the <literal>WINDOW</literal> clause entries by name + in their <literal>OVER</> clauses. A <literal>WINDOW</literal> clause + entry does not have to be referenced anywhere, however; if it is not + used in the query it is simply ignored. It is possible to use window + functions without any <literal>WINDOW</literal> clause at all, since + a window function call can specify its window definition directly in + its <literal>OVER</> clause. However, the <literal>WINDOW</literal> + clause saves typing when the same window definition is needed for more + than one window function. + </para> + + <para> + Window functions are described in detail in + <xref linkend="tutorial-window"> and + <xref linkend="syntax-window-functions">. + </para> + </refsect2> + <refsect2 id="sql-select-list"> <title id="sql-select-list-title"><command>SELECT</command> List</title> @@ -922,7 +984,7 @@ FETCH { FIRST | NEXT } [ <replaceable class="parameter">count</replaceable> ] { constants for the offset or fetch count, parentheses will be necessary in most cases. If the fetch count is omitted, it defaults to 1. - </para> + </para> <para> When using <literal>LIMIT</>, it is a good idea to use an @@ -1388,6 +1450,19 @@ SELECT distributors.* WHERE distributors.name = 'Westward'; </refsect2> <refsect2> + <title><literal>WINDOW</literal> Clause Restrictions</title> + + <para> + The SQL standard provides for an optional <quote>framing clause</>, + introduced by the key word <literal>RANGE</> or <literal>ROWS</>, + in window definitions. <productname>PostgreSQL</productname> does + not yet implement framing clauses, and always follows the + default framing behavior, which is equivalent to the framing clause + <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>. + </para> + </refsect2> + + <refsect2> <title><literal>LIMIT</literal> and <literal>OFFSET</literal></title> <para> diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml index 038ae1b333c..057bfb2a9d7 100644 --- a/doc/src/sgml/ref/select_into.sgml +++ b/doc/src/sgml/ref/select_into.sgml @@ -1,5 +1,5 @@ <!-- -$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.43 2008/11/14 10:22:47 petere Exp $ +$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.44 2008/12/28 18:53:54 tgl Exp $ PostgreSQL documentation --> @@ -29,6 +29,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac [ WHERE <replaceable class="parameter">condition</replaceable> ] [ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ] [ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ] + [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ] [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ] [ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ] [ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ] diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index cca44794340..9d0833c2035 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.126 2008/12/09 20:52:03 tgl Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ --> <chapter id="sql-syntax"> <title>SQL Syntax</title> @@ -1203,6 +1203,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4; <listitem> <para> + A window function call. + </para> + </listitem> + + <listitem> + <para> A type cast. </para> </listitem> @@ -1445,7 +1451,7 @@ $1.somecolumn enclosed in parentheses: <synopsis> -<replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> ) +<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> ) </synopsis> </para> @@ -1480,7 +1486,7 @@ sqrt(2) <synopsis> <replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] ) <replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] ) -<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] ) +<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>) <replaceable>aggregate_name</replaceable> ( * ) </synopsis> @@ -1488,7 +1494,7 @@ sqrt(2) defined aggregate (possibly qualified with a schema name), and <replaceable>expression</replaceable> is any value expression that does not itself contain an aggregate - expression. + expression or a window function call. </para> <para> @@ -1550,6 +1556,89 @@ sqrt(2) </note> </sect2> + <sect2 id="syntax-window-functions"> + <title>Window Function Calls</title> + + <indexterm zone="syntax-window-functions"> + <primary>window function</primary> + <secondary>invocation</secondary> + </indexterm> + + <indexterm zone="syntax-window-functions"> + <primary>OVER clause</primary> + </indexterm> + + <para> + A <firstterm>window function call</firstterm> represents the application + of an aggregate-like function over some portion of the rows selected + by a query. Unlike regular aggregate function calls, this is not tied + to grouping of the selected rows into a single output row — each + row remains separate in the query output. However the window function + is able to scan all the rows that would be part of the current row's + group according to the grouping specification (<literal>PARTITION BY</> + list) of the window function call. + The syntax of a window function call is one of the following: + +<synopsis> +<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> ) +<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable> +<replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> ) +<replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable> +</synopsis> + where <replaceable class="parameter">window_definition</replaceable> + has the syntax +<synopsis> +[ <replaceable class="parameter">window_name</replaceable> ] +[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ] +[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ] +</synopsis> + + Here, <replaceable>expression</replaceable> represents any value + expression that does not itself contain window function calls. + The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have + essentially the same syntax and semantics as <literal>GROUP BY</> + and <literal>ORDER BY</> clauses of the whole query. + <replaceable>window_name</replaceable> is a reference to a named window + specification defined in the query's <literal>WINDOW</literal> clause. + Named window specifications are usually referenced with just + <literal>OVER</> <replaceable>window_name</replaceable>, but it is + also possible to write a window name inside the parentheses and then + optionally override its ordering clause with <literal>ORDER BY</>. + This latter syntax follows the same rules as modifying an existing + window name within the <literal>WINDOW</literal> clause; see the + <xref linkend="sql-select" endterm="sql-select-title"> reference + page for details. + </para> + + <para> + The built-in window functions are described in <xref + linkend="functions-window-table">. Also, any built-in or + user-defined aggregate function can be used as a window function. + Currently, there is no provision for user-defined window functions + other than aggregates. + </para> + + <para> + The syntaxes using <literal>*</> are used for calling parameter-less + aggregate functions as window functions, for example + <literal>count(*) OVER (PARTITION BY x ORDER BY y)</>. + <literal>*</> is customarily not used for non-aggregate window functions. + Aggregate window functions, unlike normal aggregate functions, do not + allow <literal>DISTINCT</> to be used within the function argument list. + </para> + + <para> + Window function calls are permitted only in the <literal>SELECT</literal> + list and the <literal>ORDER BY</> clause of the query. + </para> + + <para> + More information about window functions can be found in + <xref linkend="tutorial-window"> and + <xref linkend="queries-window">. + </para> + </sect2> + <sect2 id="sql-syntax-type-casts"> <title>Type Casts</title> diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml index 3c4ce19258e..b223888f9ed 100644 --- a/doc/src/sgml/xaggr.sgml +++ b/doc/src/sgml/xaggr.sgml @@ -1,4 +1,4 @@ -<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.36 2008/11/20 21:10:44 tgl Exp $ --> +<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.37 2008/12/28 18:53:54 tgl Exp $ --> <sect1 id="xaggr"> <title>User-Defined Aggregates</title> @@ -167,10 +167,13 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype) <para> A function written in C can detect that it is being called as an aggregate transition or final function by seeing if it was passed - an <structname>AggState</> node as the function call <quote>context</>, + an <structname>AggState</> or <structname>WindowAggState</> node + as the function call <quote>context</>, for example by: <programlisting> - if (fcinfo->context && IsA(fcinfo->context, AggState)) + if (fcinfo->context && + (IsA(fcinfo->context, AggState) || + IsA(fcinfo->context, WindowAggState))) </programlisting> One reason for checking this is that when it is true, the first input must be a temporary transition value and can therefore safely be modified |