aboutsummaryrefslogtreecommitdiff
path: root/doc/src/sgml/xaggr.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/xaggr.sgml')
-rw-r--r--doc/src/sgml/xaggr.sgml197
1 files changed, 183 insertions, 14 deletions
diff --git a/doc/src/sgml/xaggr.sgml b/doc/src/sgml/xaggr.sgml
index e77ef12e5c3..cbbb0519115 100644
--- a/doc/src/sgml/xaggr.sgml
+++ b/doc/src/sgml/xaggr.sgml
@@ -132,6 +132,161 @@ CREATE AGGREGATE avg (float8)
</note>
<para>
+ Aggregate function calls in SQL allow <literal>DISTINCT</>
+ and <literal>ORDER BY</> options that control which rows are fed
+ to the aggregate's transition function and in what order. These
+ options are implemented behind the scenes and are not the concern
+ of the aggregate's support functions.
+ </para>
+
+ <para>
+ For further details see the
+ <xref linkend="sql-createaggregate">
+ command.
+ </para>
+
+ <sect2 id="xaggr-moving-aggregates">
+ <title>Moving-Aggregate Mode</title>
+
+ <indexterm>
+ <primary>moving-aggregate mode</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>moving aggregate</secondary>
+ </indexterm>
+
+ <para>
+ Aggregate functions can optionally support <firstterm>moving-aggregate
+ mode</>, which allows substantially faster execution of aggregate
+ functions within windows with moving frame starting points.
+ (See <xref linkend="tutorial-window">
+ and <xref linkend="syntax-window-functions"> for information about use of
+ aggregate functions as window functions.)
+ The basic idea is that in addition to a normal <quote>forward</>
+ transition function, the aggregate provides an <firstterm>inverse
+ transition function</>, which allows rows to be removed from the
+ aggregate's running state value when they exit the window frame.
+ For example a <function>sum</> aggregate, which uses addition as the
+ forward transition function, would use subtraction as the inverse
+ transition function. Without an inverse transition function, the window
+ function mechanism must recalculate the aggregate from scratch each time
+ the frame starting point moves, resulting in run time proportional to the
+ number of input rows times the average frame length. With an inverse
+ transition function, the run time is only proportional to the number of
+ input rows.
+ </para>
+
+ <para>
+ The inverse transition function is passed the current state value and the
+ aggregate input value(s) for the earliest row included in the current
+ state. It must reconstruct what the state value would have been if the
+ given input value had never been aggregated, but only the rows following
+ it. This sometimes requires that the forward transition function keep
+ more state than is needed for plain aggregation mode. Therefore, the
+ moving-aggregate mode uses a completely separate implementation from the
+ plain mode: it has its own state data type, its own forward transition
+ function, and its own final function if needed. These can be the same as
+ the plain mode's data type and functions, if there is no need for extra
+ state.
+ </para>
+
+ <para>
+ As an example, we could extend the <function>sum</> aggregate given above
+ to support moving-aggregate mode like this:
+
+<programlisting>
+CREATE AGGREGATE sum (complex)
+(
+ sfunc = complex_add,
+ stype = complex,
+ initcond = '(0,0)',
+ msfunc = complex_add,
+ minvfunc = complex_sub,
+ mstype = complex,
+ minitcond = '(0,0)'
+);
+</programlisting>
+
+ The parameters whose names begin with <literal>m</> define the
+ moving-aggregate implementation. Except for the inverse transition
+ function <literal>minvfunc</>, they correspond to the plain-aggregate
+ parameters without <literal>m</>.
+ </para>
+
+ <para>
+ The forward transition function for moving-aggregate mode is not allowed
+ to return NULL as the new state value. If the inverse transition
+ function returns NULL, this is taken as an indication that the inverse
+ function cannot reverse the state calculation for this particular input,
+ and so the aggregate calculation will be redone from scratch for the
+ current frame starting position. This convention allows moving-aggregate
+ mode to be used in situations where there are some infrequent cases that
+ are impractical to reverse out of the running state value. The inverse
+ transition function can <quote>punt</> on these cases, and yet still come
+ out ahead so long as it can work for most cases. As an example, an
+ aggregate working with floating-point numbers might choose to punt when
+ a <literal>NaN</> (not a number) input has to be removed from the running
+ state value.
+ </para>
+
+ <para>
+ When writing moving-aggregate support functions, it is important to be
+ sure that the inverse transition function can reconstruct the correct
+ state value exactly. Otherwise there might be user-visible differences
+ in results depending on whether the moving-aggregate mode is used.
+ An example of an aggregate for which adding an inverse transition
+ function seems easy at first, yet where this requirement cannot be met
+ is <function>sum</> over <type>float4</> or <type>float8</> inputs. A
+ naive declaration of <function>sum(<type>float8</>)</function> could be
+
+<programlisting>
+CREATE AGGREGATE unsafe_sum (float8)
+(
+ stype = float8,
+ sfunc = float8pl,
+ mstype = float8,
+ msfunc = float8pl,
+ minvfunc = float8mi
+);
+</programlisting>
+
+ This aggregate, however, can give wildly different results than it would
+ have without the inverse transition function. For example, consider
+
+<programlisting>
+SELECT
+ unsafe_sum(x) OVER (ORDER BY n ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
+FROM (VALUES (1, 1.0e20::float8),
+ (2, 1.0::float8)) AS v (n,x);
+</programlisting>
+
+ This query returns <literal>0</> as its second result, rather than the
+ expected answer of <literal>1</>. The cause is the limited precision of
+ floating-point values: adding <literal>1</> to <literal>1e20</> results
+ in <literal>1e20</> again, and so subtracting <literal>1e20</> from that
+ yields <literal>0</>, not <literal>1</>. Note that this is a limitation
+ of floating-point arithmetic in general, not a limitation
+ of <productname>PostgreSQL</>.
+ </para>
+
+ </sect2>
+
+ <sect2 id="xaggr-polymorphic-aggregates">
+ <title>Polymorphic and Variadic Aggregates</title>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>polymorphic</secondary>
+ </indexterm>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>variadic</secondary>
+ </indexterm>
+
+ <para>
Aggregate functions can use polymorphic
state transition functions or final functions, so that the same functions
can be used to implement multiple aggregates.
@@ -189,8 +344,8 @@ SELECT attrelid::regclass, array_accum(atttypid::regtype)
by declaring its last argument as a <literal>VARIADIC</> array, in much
the same fashion as for regular functions; see
<xref linkend="xfunc-sql-variadic-functions">. The aggregate's transition
- function must have the same array type as its last argument. The
- transition function typically would also be marked <literal>VARIADIC</>,
+ function(s) must have the same array type as their last argument. The
+ transition function(s) typically would also be marked <literal>VARIADIC</>,
but this is not strictly required.
</para>
@@ -220,13 +375,15 @@ SELECT myaggregate(a, b, c ORDER BY a) FROM ...
</para>
</note>
- <para>
- Aggregate function calls in SQL allow <literal>DISTINCT</>
- and <literal>ORDER BY</> options that control which rows are fed
- to the aggregate's transition function and in what order. These
- options are implemented behind the scenes and are not the concern
- of the aggregate's support functions.
- </para>
+ </sect2>
+
+ <sect2 id="xaggr-ordered-set-aggregates">
+ <title>Ordered-Set Aggregates</title>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>ordered set</secondary>
+ </indexterm>
<para>
The aggregates we have been describing so far are <quote>normal</>
@@ -312,6 +469,21 @@ SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY income) FROM households;
</para>
<para>
+ Currently, ordered-set aggregates cannot be used as window functions,
+ and therefore there is no need for them to support moving-aggregate mode.
+ </para>
+
+ </sect2>
+
+ <sect2 id="xaggr-support-functions">
+ <title>Support Functions for Aggregates</title>
+
+ <indexterm>
+ <primary>aggregate function</primary>
+ <secondary>support functions for</secondary>
+ </indexterm>
+
+ <para>
A function written in C can detect that it is being called as an
aggregate transition or final function by calling
<function>AggCheckCallContext</>, for example:
@@ -341,9 +513,6 @@ if (AggCheckCallContext(fcinfo, NULL))
source code.
</para>
- <para>
- For further details see the
- <xref linkend="sql-createaggregate">
- command.
- </para>
+ </sect2>
+
</sect1>