aboutsummaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/advanced.sgml250
-rw-r--r--doc/src/sgml/errcodes.sgml20
-rw-r--r--doc/src/sgml/func.sgml274
-rw-r--r--doc/src/sgml/queries.sgml53
-rw-r--r--doc/src/sgml/query.sgml10
-rw-r--r--doc/src/sgml/ref/select.sgml79
-rw-r--r--doc/src/sgml/ref/select_into.sgml3
-rw-r--r--doc/src/sgml/syntax.sgml97
-rw-r--r--doc/src/sgml/xaggr.sgml9
9 files changed, 770 insertions, 25 deletions
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 2ecb2da5c56..ce8ef535dba 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.55 2008/12/28 18:53:53 tgl Exp $ -->
<chapter id="tutorial-advanced">
<title>Advanced Features</title>
@@ -240,7 +240,7 @@ COMMIT;
<para>
<productname>PostgreSQL</> actually treats every SQL statement as being
executed within a transaction. If you do not issue a <command>BEGIN</>
- command,
+ command,
then each individual statement has an implicit <command>BEGIN</> and
(if successful) <command>COMMIT</> wrapped around it. A group of
statements surrounded by <command>BEGIN</> and <command>COMMIT</>
@@ -265,7 +265,7 @@ COMMIT;
with <command>ROLLBACK TO</>. All the transaction's database changes
between defining the savepoint and rolling back to it are discarded, but
changes earlier than the savepoint are kept.
- </para>
+ </para>
<para>
After rolling back to a savepoint, it continues to be defined, so you can
@@ -274,7 +274,7 @@ COMMIT;
system can free some resources. Keep in mind that either releasing or
rolling back to a savepoint
will automatically release all savepoints that were defined after it.
- </para>
+ </para>
<para>
All this is happening within the transaction block, so none of it
@@ -282,7 +282,7 @@ COMMIT;
transaction block, the committed actions become visible as a unit
to other sessions, while the rolled-back actions never become visible
at all.
- </para>
+ </para>
<para>
Remembering the bank database, suppose we debit $100.00 from Alice's
@@ -317,6 +317,242 @@ COMMIT;
</sect1>
+ <sect1 id="tutorial-window">
+ <title id="tutorial-window-title">Window Functions</title>
+
+ <indexterm zone="tutorial-window">
+ <primary>window function</primary>
+ </indexterm>
+
+ <para>
+ A <firstterm>window function</> performs a calculation across a set of
+ table rows that are somehow related to the current row. This is comparable
+ to the type of calculation that can be done with an aggregate function.
+ But unlike regular aggregate functions, use of a window function does not
+ cause rows to become grouped into a single output row &mdash; the
+ rows retain their separate identities. Behind the scenes, the window
+ function is able to access more than just the current row of the query
+ result.
+ </para>
+
+ <para>
+ Here is an example that shows how to compare each employee's salary
+ with the average salary in his or her department:
+
+<programlisting>
+SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
+</programlisting>
+
+<screen>
+ depname | empno | salary | avg
+-----------+-------+--------+-----------------------
+ develop | 11 | 5200 | 5020.0000000000000000
+ develop | 7 | 4200 | 5020.0000000000000000
+ develop | 9 | 4500 | 5020.0000000000000000
+ develop | 8 | 6000 | 5020.0000000000000000
+ develop | 10 | 5200 | 5020.0000000000000000
+ personnel | 5 | 3500 | 3700.0000000000000000
+ personnel | 2 | 3900 | 3700.0000000000000000
+ sales | 3 | 4800 | 4866.6666666666666667
+ sales | 1 | 5000 | 4866.6666666666666667
+ sales | 4 | 4800 | 4866.6666666666666667
+(10 rows)
+</screen>
+
+ The first three output columns come directly from the table
+ <structname>empsalary</>, and there is one output row for each row in the
+ table. The fourth column represents an average taken across all the table
+ rows that have the same <structfield>depname</> value as the current row.
+ (This actually is the same function as the regular <function>avg</>
+ aggregate function, but the <literal>OVER</> clause causes it to be
+ treated as a window function and computed across an appropriate set of
+ rows.)
+ </para>
+
+ <para>
+ A window function call always contains an <literal>OVER</> clause
+ following the window function's name and argument(s). This is what
+ syntactically distinguishes it from a regular function or aggregate
+ function. The <literal>OVER</> clause determines exactly how the
+ rows of the query are split up for processing by the window function.
+ The <literal>PARTITION BY</> list within <literal>OVER</> specifies
+ dividing the rows into groups, or partitions, that share the same
+ values of the <literal>PARTITION BY</> expression(s). For each row,
+ the window function is computed across the rows that fall into the
+ same partition as the current row.
+ </para>
+
+ <para>
+ Although <function>avg</> will produce the same result no matter
+ what order it processes the partition's rows in, this is not true of all
+ window functions. When needed, you can control that order using
+ <literal>ORDER BY</> within <literal>OVER</>. Here is an example:
+
+<programlisting>
+SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary DESC) FROM empsalary;
+</programlisting>
+
+<screen>
+ depname | empno | salary | rank
+-----------+-------+--------+------
+ develop | 8 | 6000 | 1
+ develop | 10 | 5200 | 2
+ develop | 11 | 5200 | 2
+ develop | 9 | 4500 | 4
+ develop | 7 | 4200 | 5
+ personnel | 2 | 3900 | 1
+ personnel | 5 | 3500 | 2
+ sales | 1 | 5000 | 1
+ sales | 4 | 4800 | 2
+ sales | 3 | 4800 | 2
+(10 rows)
+</screen>
+
+ As shown here, the <function>rank</> function produces a numerical rank
+ within the current row's partition for each distinct <literal>ORDER BY</>
+ value, in the order defined by the <literal>ORDER BY</> clause.
+ <function>rank</> needs no explicit parameter, because its behavior
+ is entirely determined by the <literal>OVER</> clause.
+ </para>
+
+ <para>
+ The rows considered by a window function are those of the <quote>virtual
+ table</> produced by the query's <literal>FROM</> clause as filtered by its
+ <literal>WHERE</>, <literal>GROUP BY</>, and <literal>HAVING</> clauses
+ if any. For example, a row removed because it does not meet the
+ <literal>WHERE</> condition is not seen by any window function.
+ A query can contain multiple window functions that slice up the data
+ in different ways by means of different <literal>OVER</> clauses, but
+ they all act on the same collection of rows defined by this virtual table.
+ </para>
+
+ <para>
+ We already saw that <literal>ORDER BY</> can be omitted if the ordering
+ of rows is not important. It is also possible to omit <literal>PARTITION
+ BY</>, in which case the window function is computed over all rows of the
+ virtual table; that is, there is one partition containing all the rows.
+ </para>
+
+ <para>
+ There is another important concept associated with window functions:
+ for each row, there is a set of rows within its partition called its
+ <firstterm>window frame</>. When <literal>ORDER BY</> is omitted the
+ frame is always the same as the partition. If <literal>ORDER BY</> is
+ supplied, the frame consists of all rows from the start of the partition
+ up to the current row, plus any following rows that are equal to the
+ current row according to the <literal>ORDER BY</> clause.
+ <footnote>
+ <para>
+ The SQL standard includes options to define the window frame in
+ other ways, but this definition is the only one currently supported
+ by <productname>PostgreSQL</productname>.
+ </para>
+ </footnote>
+ Many window functions act only on the rows of the window frame, rather
+ than of the whole partition. Here is an example using <function>sum</>:
+ </para>
+
+<programlisting>
+SELECT salary, sum(salary) OVER () FROM empsalary;
+</programlisting>
+
+<screen>
+ salary | sum
+--------+-------
+ 5200 | 47100
+ 5000 | 47100
+ 3500 | 47100
+ 4800 | 47100
+ 3900 | 47100
+ 4200 | 47100
+ 4500 | 47100
+ 4800 | 47100
+ 6000 | 47100
+ 5200 | 47100
+(10 rows)
+</screen>
+
+ <para>
+ Above, since there is no <literal>ORDER BY</> in the <literal>OVER</>
+ clause, the window frame is the same as the partition, which for lack of
+ <literal>PARTITION BY</> is the whole table; in other words each sum is
+ taken over the whole table and so we get the same result for each output
+ row. But if we add an <literal>ORDER BY</> clause, we get very different
+ results:
+ </para>
+
+<programlisting>
+SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
+</programlisting>
+
+<screen>
+ salary | sum
+--------+-------
+ 3500 | 3500
+ 3900 | 7400
+ 4200 | 11600
+ 4500 | 16100
+ 4800 | 25700
+ 4800 | 25700
+ 5000 | 30700
+ 5200 | 41100
+ 5200 | 41100
+ 6000 | 47100
+(10 rows)
+</screen>
+
+ <para>
+ Here the sum is taken from the first (lowest) salary up through the
+ current one, including any duplicates of the current one (notice the
+ results for the duplicated salaries).
+ </para>
+
+ <para>
+ Window functions are permitted only in the <literal>SELECT</literal> list
+ and the <literal>ORDER BY</> clause of the query. They are forbidden
+ elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
+ and <literal>WHERE</literal> clauses. This is because they logically
+ execute after the processing of those clauses. Also, window functions
+ execute after regular aggregate functions. This means it is valid to
+ include an aggregate function call in the arguments of a window function,
+ but not vice versa.
+ </para>
+
+ <para>
+ If there is a need to filter or group rows after the window calculations
+ are performed, you can use a sub-select. For example:
+
+<programlisting>
+SELECT depname, empno, salary, enroll_date
+FROM
+ (SELECT depname, empno, salary, enroll_date,
+ rank() OVER (PARTITION BY depname ORDER BY salary DESC, empno) AS pos
+ FROM empsalary
+ ) AS ss
+WHERE pos < 3;
+</programlisting>
+
+ The above query only shows the rows from the inner query having
+ <literal>rank</> less than <literal>3</>.
+ </para>
+
+ <para>
+ When a query involves multiple window functions, it is possible to write
+ out each one with a separate <literal>OVER</> clause, but this is
+ duplicative and error-prone if the same windowing behavior is wanted
+ for several functions. Instead, each windowing behavior can be named
+ in a <literal>WINDOW</> clause and then referenced in <literal>OVER</>.
+ For example:
+
+<programlisting>
+SELECT sum(salary) OVER w, avg(salary) OVER w
+ FROM empsalary
+ WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);
+</programlisting>
+ </para>
+ </sect1>
+
+
<sect1 id="tutorial-inheritance">
<title>Inheritance</title>
@@ -391,7 +627,7 @@ CREATE TABLE capitals (
<para>
For example, the following query finds the names of all cities,
- including state capitals, that are located at an altitude
+ including state capitals, that are located at an altitude
over 500 feet:
<programlisting>
@@ -455,7 +691,7 @@ SELECT name, altitude
<sect1 id="tutorial-conclusion">
<title>Conclusion</title>
-
+
<para>
<productname>PostgreSQL</productname> has many features not
touched upon in this tutorial introduction, which has been
diff --git a/doc/src/sgml/errcodes.sgml b/doc/src/sgml/errcodes.sgml
index 574e7f5fbad..e792a74e286 100644
--- a/doc/src/sgml/errcodes.sgml
+++ b/doc/src/sgml/errcodes.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.25 2008/10/04 21:56:52 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/errcodes.sgml,v 1.26 2008/12/28 18:53:53 tgl Exp $ -->
<appendix id="errcodes-appendix">
<title><productname>PostgreSQL</productname> Error Codes</title>
@@ -379,6 +379,18 @@
</row>
<row>
+<entry><literal>22014</literal></entry>
+<entry>INVALID ARGUMENT FOR NTILE FUNCTION</entry>
+<entry>invalid_argument_for_ntile_function</entry>
+</row>
+
+<row>
+<entry><literal>22016</literal></entry>
+<entry>INVALID ARGUMENT FOR NTH_VALUE FUNCTION</entry>
+<entry>invalid_argument_for_nth_value_function</entry>
+</row>
+
+<row>
<entry><literal>2201F</literal></entry>
<entry>INVALID ARGUMENT FOR POWER FUNCTION</entry>
<entry>invalid_argument_for_power_function</entry>
@@ -991,6 +1003,12 @@
</row>
<row>
+<entry><literal>42P20</literal></entry>
+<entry>WINDOWING ERROR</entry>
+<entry>windowing_error</entry>
+</row>
+
+<row>
<entry><literal>42P19</literal></entry>
<entry>INVALID RECURSION</entry>
<entry>invalid_recursion</entry>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index de50c0e1d56..205b71e9c9e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.463 2008/12/19 16:25:16 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.464 2008/12/28 18:53:53 tgl Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
@@ -10149,6 +10149,278 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
</sect1>
+ <sect1 id="functions-window">
+ <title>Window Functions</title>
+
+ <indexterm zone="functions-window">
+ <primary>window function</primary>
+ <secondary>built-in</secondary>
+ </indexterm>
+
+ <para>
+ <firstterm>Window functions</firstterm> provide the ability to perform
+ calculations across sets of rows that are related to the current query
+ row. For information about this feature see
+ <xref linkend="tutorial-window"> and
+ <xref linkend="syntax-window-functions">.
+ </para>
+
+ <para>
+ The built-in window functions are listed in
+ <xref linkend="functions-window-table">. Note that these functions
+ <emphasis>must</> be invoked using window function syntax; that is an
+ <literal>OVER</> clause is required.
+ </para>
+
+ <para>
+ In addition to these functions, any built-in or user-defined aggregate
+ function can be used as a window function (see
+ <xref linkend="functions-aggregate"> for a list of the built-in aggregates).
+ Aggregate functions act as window functions only when an <literal>OVER</>
+ clause follows the call; otherwise they act as regular aggregates.
+ </para>
+
+ <table id="functions-window-table">
+ <title>General-Purpose Window Functions</title>
+
+ <tgroup cols="3">
+ <thead>
+ <row>
+ <entry>Function</entry>
+ <entry>Return Type</entry>
+ <entry>Description</entry>
+ </row>
+ </thead>
+
+ <tbody>
+ <row>
+ <entry>
+ <indexterm>
+ <primary>row_number</primary>
+ </indexterm>
+ <function>row_number()</function>
+ </entry>
+ <entry>
+ <type>bigint</type>
+ </entry>
+ <entry>number of the current row within its partition, counting from 1</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>rank</primary>
+ </indexterm>
+ <function>rank()</function>
+ </entry>
+ <entry>
+ <type>bigint</type>
+ </entry>
+ <entry>rank of the current row with gaps; same as <function>row_number</> of its first peer</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>dense_rank</primary>
+ </indexterm>
+ <function>dense_rank()</function>
+ </entry>
+ <entry>
+ <type>bigint</type>
+ </entry>
+ <entry>rank of the current row without gaps; this function counts peer groups</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>percent_rank</primary>
+ </indexterm>
+ <function>percent_rank()</function>
+ </entry>
+ <entry>
+ <type>double precision</type>
+ </entry>
+ <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>cume_dist</primary>
+ </indexterm>
+ <function>cume_dist()</function>
+ </entry>
+ <entry>
+ <type>double precision</type>
+ </entry>
+ <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>ntile</primary>
+ </indexterm>
+ <function>ntile(<replaceable class="parameter">num_buckets</replaceable> <type>integer</>)</function>
+ </entry>
+ <entry>
+ <type>integer</type>
+ </entry>
+ <entry>integer ranging from 1 to the argument value, dividing the
+ partition as equally as possible</entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>lag</primary>
+ </indexterm>
+ <function>
+ lag(<replaceable class="parameter">value</replaceable> <type>any</>
+ [, <replaceable class="parameter">offset</replaceable> <type>integer</>
+ [, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ </function>
+ </entry>
+ <entry>
+ <type>same type as <replaceable class="parameter">value</replaceable></type>
+ </entry>
+ <entry>
+ returns <replaceable class="parameter">value</replaceable> evaluated at
+ the row that is <replaceable class="parameter">offset</replaceable>
+ rows before the current row within the partition; if there is no such
+ row, instead return <replaceable class="parameter">default</replaceable>.
+ Both <replaceable class="parameter">offset</replaceable> and
+ <replaceable class="parameter">default</replaceable> are evaluated
+ with respect to the current row. If omitted,
+ <replaceable class="parameter">offset</replaceable> defaults to 1 and
+ <replaceable class="parameter">default</replaceable> to null
+ </entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>lead</primary>
+ </indexterm>
+ <function>
+ lead(<replaceable class="parameter">value</replaceable> <type>any</>
+ [, <replaceable class="parameter">offset</replaceable> <type>integer</>
+ [, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ </function>
+ </entry>
+ <entry>
+ <type>same type as <replaceable class="parameter">value</replaceable></type>
+ </entry>
+ <entry>
+ returns <replaceable class="parameter">value</replaceable> evaluated at
+ the row that is <replaceable class="parameter">offset</replaceable>
+ rows after the current row within the partition; if there is no such
+ row, instead return <replaceable class="parameter">default</replaceable>.
+ Both <replaceable class="parameter">offset</replaceable> and
+ <replaceable class="parameter">default</replaceable> are evaluated
+ with respect to the current row. If omitted,
+ <replaceable class="parameter">offset</replaceable> defaults to 1 and
+ <replaceable class="parameter">default</replaceable> to null
+ </entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>first_value</primary>
+ </indexterm>
+ <function>first_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
+ </entry>
+ <entry>
+ <type>same type as <replaceable class="parameter">value</replaceable></type>
+ </entry>
+ <entry>
+ returns <replaceable class="parameter">value</replaceable> evaluated
+ at the row that is the first row of the window frame
+ </entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>last_value</primary>
+ </indexterm>
+ <function>last_value(<replaceable class="parameter">value</replaceable> <type>any</>)</function>
+ </entry>
+ <entry>
+ <type>same type as <replaceable class="parameter">value</replaceable></type>
+ </entry>
+ <entry>
+ returns <replaceable class="parameter">value</replaceable> evaluated
+ at the row that is the last row of the window frame
+ </entry>
+ </row>
+
+ <row>
+ <entry>
+ <indexterm>
+ <primary>nth_value</primary>
+ </indexterm>
+ <function>
+ nth_value(<replaceable class="parameter">value</replaceable> <type>any</>, <replaceable class="parameter">nth</replaceable> <type>integer</>)
+ </function>
+ </entry>
+ <entry>
+ <type>same type as <replaceable class="parameter">value</replaceable></type>
+ </entry>
+ <entry>
+ returns <replaceable class="parameter">value</replaceable> evaluated
+ at the row that is the <replaceable class="parameter">nth</replaceable>
+ row of the window frame (counting from 1); null if no such row
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ All of the functions listed in
+ <xref linkend="functions-window-table"> depend on the sort ordering
+ specified by the <literal>ORDER BY</> clause of the associated window
+ definition. Rows that are not distinct in the <literal>ORDER BY</>
+ ordering are said to be <firstterm>peers</>; the four ranking functions
+ are defined so that they give the same answer for any two peer rows.
+ </para>
+
+ <para>
+ Note that <function>first_value</>, <function>last_value</>, and
+ <function>nth_value</> consider only the rows within the <quote>window
+ frame</>, that is the rows from the start of the partition through the
+ last peer of the current row. This is particularly likely to give
+ unintuitive results for <function>last_value</>.
+ </para>
+
+ <para>
+ When an aggregate function is used as a window function, it aggregates
+ over the rows within the current row's window frame. To obtain
+ aggregation over the whole partition, be sure to omit <literal>ORDER BY</>
+ from the window definition. An aggregate used with <literal>ORDER BY</>
+ produces a <quote>running sum</> type of behavior, which may or may not
+ be what's wanted.
+ </para>
+
+ <note>
+ <para>
+ The SQL standard defines a <literal>RESPECT NULLS</> or
+ <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
+ <function>first_value</>, <function>last_value</>, and
+ <function>nth_value</>. This is not implemented in
+ <productname>PostgreSQL</productname>: the behavior is always the
+ same as the standard's default, namely <literal>RESPECT NULLS</>.
+ Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
+ option for <function>nth_value</> is not implemented: only the
+ default <literal>FROM FIRST</> behavior is supported.
+ </para>
+ </note>
+
+ </sect1>
<sect1 id="functions-subquery">
<title>Subquery Expressions</title>
diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml
index 283dd0a73dd..f1db64b273a 100644
--- a/doc/src/sgml/queries.sgml
+++ b/doc/src/sgml/queries.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.50 2008/10/14 00:41:34 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="queries">
<title>Queries</title>
@@ -949,6 +949,57 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
5000. Note that the aggregate expressions do not necessarily need
to be the same in all parts of the query.
</para>
+
+ <para>
+ If a query contains aggregate function calls, but no <literal>GROUP BY</>
+ clause, grouping still occurs: the result is a single group row (or
+ perhaps no rows at all, if the single row is then eliminated by
+ <literal>HAVING</>).
+ The same is true if it contains a <literal>HAVING</> clause, even
+ without any aggregate function calls or <literal>GROUP BY</> clause.
+ </para>
+ </sect2>
+
+ <sect2 id="queries-window">
+ <title>Window Function Processing</>
+
+ <indexterm zone="queries-window">
+ <primary>window function</primary>
+ <secondary>order of execution</>
+ </indexterm>
+
+ <para>
+ If the query contains any window functions (see
+ <xref linkend="tutorial-window"> and
+ <xref linkend="syntax-window-functions">), these functions are evaluated
+ after any grouping, aggregation, and <literal>HAVING</> filtering is
+ performed. That is, if the query uses any aggregates, <literal>GROUP
+ BY</>, or <literal>HAVING</>, then the rows seen by the window functions
+ are the group rows instead of the original table rows from
+ <literal>FROM</>/<literal>WHERE</>.
+ </para>
+
+ <para>
+ When multiple window functions are used, all the window functions having
+ syntactically equivalent <literal>PARTITION BY</> and <literal>ORDER BY</>
+ clauses in their window definitions are guaranteed to be evaluated in a
+ single pass over the data. Therefore they will see the same sort ordering,
+ even if the <literal>ORDER BY</> does not uniquely determine an ordering.
+ However, no guarantees are made about the evaluation of functions having
+ different <literal>PARTITION BY</> or <literal>ORDER BY</> specifications.
+ (In such cases a sort step is typically required between the passes of
+ window function evaluations, and the sort is not guaranteed to preserve
+ ordering of rows that its <literal>ORDER BY</> sees as equivalent.)
+ </para>
+
+ <para>
+ Currently, use of window functions always forces sorting, and so the
+ query output will be ordered according to one or another of the window
+ functions' <literal>PARTITION BY</>/<literal>ORDER BY</> clauses.
+ It is not recommendable to rely on this, however. Use an explicit
+ top-level <literal>ORDER BY</> clause if you want to be sure the
+ results are sorted in a particular way.
+ </para>
</sect2>
</sect1>
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index 442f9ad0068..ffc641b03ad 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.50 2007/02/01 00:28:17 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="tutorial-sql">
<title>The <acronym>SQL</acronym> Language</title>
@@ -621,7 +621,7 @@ SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
San Francisco | 43 | 57 | San Francisco | 46 | 50
Hayward | 37 | 54 | San Francisco | 46 | 50
(2 rows)
-</programlisting>
+</programlisting>
Here we have relabeled the weather table as <literal>W1</> and
<literal>W2</> to be able to distinguish the left and right side
@@ -651,9 +651,9 @@ SELECT *
<indexterm><primary>min</primary></indexterm>
<indexterm><primary>sum</primary></indexterm>
- Like most other relational database products,
+ Like most other relational database products,
<productname>PostgreSQL</productname> supports
- aggregate functions.
+ <firstterm>aggregate functions</>.
An aggregate function computes a single result from multiple input rows.
For example, there are aggregates to compute the
<function>count</function>, <function>sum</function>,
@@ -815,7 +815,7 @@ SELECT city, max(temp_lo)
<para>
You can update existing rows using the
- <command>UPDATE</command> command.
+ <command>UPDATE</command> command.
Suppose you discover the temperature readings are
all off by 2 degrees after November 28. You can correct the
data as follows:
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 814a6708f00..c9a386f24f3 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.112 2008/12/01 09:38:08 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.113 2008/12/28 18:53:54 tgl Exp $
PostgreSQL documentation
-->
@@ -39,6 +39,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ WHERE <replaceable class="parameter">condition</replaceable> ]
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
+ [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
@@ -566,6 +567,67 @@ HAVING <replaceable class="parameter">condition</replaceable>
</para>
</refsect2>
+ <refsect2 id="SQL-WINDOW">
+ <title id="sql-window-title"><literal>WINDOW</literal> Clause</title>
+
+ <para>
+ The optional <literal>WINDOW</literal> clause has the general form
+<synopsis>
+WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...]
+</synopsis>
+ where <replaceable class="parameter">window_name</replaceable> is
+ a name that can be referenced from subsequent window definitions or
+ <literal>OVER</> clauses, and
+ <replaceable class="parameter">window_definition</replaceable> is
+<synopsis>
+[ <replaceable class="parameter">existing_window_name</replaceable> ]
+[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
+[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
+</synopsis>
+ The elements of the <literal>PARTITION BY</> list are interpreted in
+ the same fashion as elements of a
+ <xref linkend="sql-groupby" endterm="sql-groupby-title">, and
+ the elements of the <literal>ORDER BY</> list are interpreted in the
+ same fashion as elements of an
+ <xref linkend="sql-orderby" endterm="sql-orderby-title">.
+ The only difference is that these expressions can contain aggregate
+ function calls, which are not allowed in a regular <literal>GROUP BY</>
+ clause. They are allowed here because windowing occurs after grouping
+ and aggregation.
+ </para>
+
+ <para>
+ If an <replaceable class="parameter">existing_window_name</replaceable>
+ is specified it must refer to an earlier entry in the <literal>WINDOW</>
+ list; the new window copies its partitioning clause from that entry,
+ as well as its ordering clause if any. In this case the new window cannot
+ specify its own <literal>PARTITION BY</> clause, and it can specify
+ <literal>ORDER BY</> only if the copied window does not have one.
+ </para>
+
+ <para>
+ The purpose of a <literal>WINDOW</literal> clause is to specify the
+ behavior of <firstterm>window functions</> appearing in the query's
+ <xref linkend="sql-select-list" endterm="sql-select-list-title"> or
+ <xref linkend="sql-orderby" endterm="sql-orderby-title">. These functions
+ can reference the <literal>WINDOW</literal> clause entries by name
+ in their <literal>OVER</> clauses. A <literal>WINDOW</literal> clause
+ entry does not have to be referenced anywhere, however; if it is not
+ used in the query it is simply ignored. It is possible to use window
+ functions without any <literal>WINDOW</literal> clause at all, since
+ a window function call can specify its window definition directly in
+ its <literal>OVER</> clause. However, the <literal>WINDOW</literal>
+ clause saves typing when the same window definition is needed for more
+ than one window function.
+ </para>
+
+ <para>
+ Window functions are described in detail in
+ <xref linkend="tutorial-window"> and
+ <xref linkend="syntax-window-functions">.
+ </para>
+ </refsect2>
+
<refsect2 id="sql-select-list">
<title id="sql-select-list-title"><command>SELECT</command> List</title>
@@ -922,7 +984,7 @@ FETCH { FIRST | NEXT } [ <replaceable class="parameter">count</replaceable> ] {
constants for the offset or fetch count, parentheses will be
necessary in most cases. If the fetch count is omitted, it
defaults to 1.
- </para>
+ </para>
<para>
When using <literal>LIMIT</>, it is a good idea to use an
@@ -1388,6 +1450,19 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
</refsect2>
<refsect2>
+ <title><literal>WINDOW</literal> Clause Restrictions</title>
+
+ <para>
+ The SQL standard provides for an optional <quote>framing clause</>,
+ introduced by the key word <literal>RANGE</> or <literal>ROWS</>,
+ in window definitions. <productname>PostgreSQL</productname> does
+ not yet implement framing clauses, and always follows the
+ default framing behavior, which is equivalent to the framing clause
+ <literal>ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW</>.
+ </para>
+ </refsect2>
+
+ <refsect2>
<title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>
<para>
diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml
index 038ae1b333c..057bfb2a9d7 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -1,5 +1,5 @@
<!--
-$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.43 2008/11/14 10:22:47 petere Exp $
+$PostgreSQL: pgsql/doc/src/sgml/ref/select_into.sgml,v 1.44 2008/12/28 18:53:54 tgl Exp $
PostgreSQL documentation
-->
@@ -29,6 +29,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ WHERE <replaceable class="parameter">condition</replaceable> ]
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
+ [ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index cca44794340..9d0833c2035 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.126 2008/12/09 20:52:03 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.127 2008/12/28 18:53:54 tgl Exp $ -->
<chapter id="sql-syntax">
<title>SQL Syntax</title>
@@ -1203,6 +1203,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
<listitem>
<para>
+ A window function call.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
A type cast.
</para>
</listitem>
@@ -1445,7 +1451,7 @@ $1.somecolumn
enclosed in parentheses:
<synopsis>
-<replaceable>function</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional> )
</synopsis>
</para>
@@ -1480,7 +1486,7 @@ sqrt(2)
<synopsis>
<replaceable>aggregate_name</replaceable> (<replaceable>expression</replaceable> [ , ... ] )
<replaceable>aggregate_name</replaceable> (ALL <replaceable>expression</replaceable> [ , ... ] )
-<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable> [ , ... ] )
+<replaceable>aggregate_name</replaceable> (DISTINCT <replaceable>expression</replaceable>)
<replaceable>aggregate_name</replaceable> ( * )
</synopsis>
@@ -1488,7 +1494,7 @@ sqrt(2)
defined aggregate (possibly qualified with a schema name), and
<replaceable>expression</replaceable> is
any value expression that does not itself contain an aggregate
- expression.
+ expression or a window function call.
</para>
<para>
@@ -1550,6 +1556,89 @@ sqrt(2)
</note>
</sect2>
+ <sect2 id="syntax-window-functions">
+ <title>Window Function Calls</title>
+
+ <indexterm zone="syntax-window-functions">
+ <primary>window function</primary>
+ <secondary>invocation</secondary>
+ </indexterm>
+
+ <indexterm zone="syntax-window-functions">
+ <primary>OVER clause</primary>
+ </indexterm>
+
+ <para>
+ A <firstterm>window function call</firstterm> represents the application
+ of an aggregate-like function over some portion of the rows selected
+ by a query. Unlike regular aggregate function calls, this is not tied
+ to grouping of the selected rows into a single output row &mdash; each
+ row remains separate in the query output. However the window function
+ is able to scan all the rows that would be part of the current row's
+ group according to the grouping specification (<literal>PARTITION BY</>
+ list) of the window function call.
+ The syntax of a window function call is one of the following:
+
+<synopsis>
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER ( <replaceable class="parameter">window_definition</replaceable> )
+<replaceable>function_name</replaceable> (<optional><replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ... </optional></optional>) OVER <replaceable>window_name</replaceable>
+<replaceable>function_name</replaceable> ( * ) OVER ( <replaceable class="parameter">window_definition</replaceable> )
+<replaceable>function_name</replaceable> ( * ) OVER <replaceable>window_name</replaceable>
+</synopsis>
+ where <replaceable class="parameter">window_definition</replaceable>
+ has the syntax
+<synopsis>
+[ <replaceable class="parameter">window_name</replaceable> ]
+[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
+[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
+</synopsis>
+
+ Here, <replaceable>expression</replaceable> represents any value
+ expression that does not itself contain window function calls.
+ The <literal>PARTITION BY</> and <literal>ORDER BY</> lists have
+ essentially the same syntax and semantics as <literal>GROUP BY</>
+ and <literal>ORDER BY</> clauses of the whole query.
+ <replaceable>window_name</replaceable> is a reference to a named window
+ specification defined in the query's <literal>WINDOW</literal> clause.
+ Named window specifications are usually referenced with just
+ <literal>OVER</> <replaceable>window_name</replaceable>, but it is
+ also possible to write a window name inside the parentheses and then
+ optionally override its ordering clause with <literal>ORDER BY</>.
+ This latter syntax follows the same rules as modifying an existing
+ window name within the <literal>WINDOW</literal> clause; see the
+ <xref linkend="sql-select" endterm="sql-select-title"> reference
+ page for details.
+ </para>
+
+ <para>
+ The built-in window functions are described in <xref
+ linkend="functions-window-table">. Also, any built-in or
+ user-defined aggregate function can be used as a window function.
+ Currently, there is no provision for user-defined window functions
+ other than aggregates.
+ </para>
+
+ <para>
+ The syntaxes using <literal>*</> are used for calling parameter-less
+ aggregate functions as window functions, for example
+ <literal>count(*) OVER (PARTITION BY x ORDER BY y)</>.
+ <literal>*</> is customarily not used for non-aggregate window functions.
+ Aggregate window functions, unlike normal aggregate functions, do not
+ allow <literal>DISTINCT</> to be used within the function argument list.
+ </para>
+
+ <para>
+ Window function calls are permitted only in the <literal>SELECT</literal>
+ list and the <literal>ORDER BY</> clause of the query.
+ </para>
+
+ <para>
+ More information about window functions can be found in
+ <xref linkend="tutorial-window"> and
+ <xref linkend="queries-window">.
+ </para>
+ </sect2>
+
<sect2 id="sql-syntax-type-casts">
<title>Type Casts</title>
diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml
index 3c4ce19258e..b223888f9ed 100644
--- a/doc/src/sgml/xaggr.sgml
+++ b/doc/src/sgml/xaggr.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.36 2008/11/20 21:10:44 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xaggr.sgml,v 1.37 2008/12/28 18:53:54 tgl Exp $ -->
<sect1 id="xaggr">
<title>User-Defined Aggregates</title>
@@ -167,10 +167,13 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
<para>
A function written in C can detect that it is being called as an
aggregate transition or final function by seeing if it was passed
- an <structname>AggState</> node as the function call <quote>context</>,
+ an <structname>AggState</> or <structname>WindowAggState</> node
+ as the function call <quote>context</>,
for example by:
<programlisting>
- if (fcinfo->context &amp;&amp; IsA(fcinfo->context, AggState))
+ if (fcinfo-&gt;context &amp;&amp;
+ (IsA(fcinfo-&gt;context, AggState) ||
+ IsA(fcinfo-&gt;context, WindowAggState)))
</programlisting>
One reason for checking this is that when it is true, the first input
must be a temporary transition value and can therefore safely be modified