author | Tom Lane <tgl@sss.pgh.pa.us> | 2013-11-16 16:03:40 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2013-11-16 16:03:40 -0500 |
commit | 6cb86143e8e1e855255edc706bce71c6ebfd9a6c | |
tree | 2ed7cf0b5fe28b8ba858ae3e384534cdb7f31aa3 /doc/src | |
parent | 55c3d86a2a374f9d8fd88fd947601c1f49a4da08 | |
Allow aggregates to provide estimates of their transition state data size.

Formerly the planner had a hard-wired rule of thumb for guessing the amount
of space consumed by an aggregate function's transition state data. This
estimate is critical to deciding whether it's OK to use hash aggregation,
and in many situations the built-in estimate isn't very good. This patch
adds a column to pg_aggregate wherein a per-aggregate estimate can be
provided, overriding the planner's default, and infrastructure for setting
the column via CREATE AGGREGATE.

It may be that additional smarts will be required in future, perhaps even
a per-aggregate estimation function. But this is already a step forward.

This is extracted from a larger patch to improve the performance of numeric
and int8 aggregates. I (tgl) thought it was worth reviewing and committing
this infrastructure separately. In this commit, all built-in aggregates
are given aggtransspace = 0, so no behavior should change.

Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/catalogs.sgml | 7 |
-rw-r--r-- | doc/src/sgml/ref/create_aggregate.sgml | 18 |
2 files changed, 25 insertions, 0 deletions
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 9388df5ac27..acc261ca516 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -373,6 +373,13 @@
       <entry>Data type of the aggregate function's internal transition (state) data</entry>
      </row>
      <row>
+      <entry><structfield>aggtransspace</structfield></entry>
+      <entry><type>int4</type></entry>
+      <entry></entry>
+      <entry>Approximate average size (in bytes) of the transition state
+       data, or zero to use a default estimate</entry>
+     </row>
+     <row>
       <entry><structfield>agginitval</structfield></entry>
       <entry><type>text</type></entry>
       <entry></entry>
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index 2b35fa4d522..17819dd1a8e 100644
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@@ -24,6 +24,7 @@ PostgreSQL documentation
 CREATE AGGREGATE <replaceable class="parameter">name</replaceable> ( [ <replaceable class="parameter">argmode</replaceable> ] [ <replaceable class="parameter">arg_name</replaceable> ] <replaceable class="parameter">arg_data_type</replaceable> [ , ... ] ) (
     SFUNC = <replaceable class="PARAMETER">sfunc</replaceable>,
     STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
+    [ , SSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
     [ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
     [ , INITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
     [ , SORTOP = <replaceable class="PARAMETER">sort_operator</replaceable> ]
@@ -35,6 +36,7 @@ CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> (
     BASETYPE = <replaceable class="PARAMETER">base_type</replaceable>,
     SFUNC = <replaceable class="PARAMETER">sfunc</replaceable>,
     STYPE = <replaceable class="PARAMETER">state_data_type</replaceable>
+    [ , SSPACE = <replaceable class="PARAMETER">state_data_size</replaceable> ]
     [ , FINALFUNC = <replaceable class="PARAMETER">ffunc</replaceable> ]
     [ , INITCOND = <replaceable class="PARAMETER">initial_condition</replaceable> ]
     [ , SORTOP = <replaceable class="PARAMETER">sort_operator</replaceable> ]
@@ -265,6 +267,22 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
    </varlistentry>
 
    <varlistentry>
+    <term><replaceable class="PARAMETER">state_data_size</replaceable></term>
+    <listitem>
+     <para>
+      The approximate average size (in bytes) of the aggregate's state value.
+      If this parameter is omitted or is zero, a default estimate is used
+      based on the <replaceable>state_data_type</>.
+      The planner uses this value to estimate the memory required for a
+      grouped aggregate query.  The planner will consider using hash
+      aggregation for such a query only if the hash table is estimated to fit
+      in <xref linkend="guc-work-mem">; therefore, large values of this
+      parameter discourage use of hash aggregation.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
     <term><replaceable class="PARAMETER">ffunc</replaceable></term>
     <listitem>
      <para>
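To illustrate the syntax documented in the diff above, here is a minimal sketch of how SSPACE might be used on a server that includes this commit. The aggregate name my_array_collect and the value 8192 are hypothetical and chosen only for the example; array_append is an existing built-in function, and the parameter order follows the synopsis.

```sql
-- Hypothetical aggregate that collects int4 inputs into an array.
-- The name and the SSPACE value are assumptions made for illustration.
CREATE AGGREGATE my_array_collect(int4) (
    SFUNC = array_append,   -- built-in: appends one element to an array
    STYPE = int4[],
    SSPACE = 8192,          -- assumed average transition state size, in bytes
    INITCOND = '{}'
);

-- Inspect the stored estimate in the new pg_aggregate column.
-- A value of zero would mean the planner falls back to its default estimate.
SELECT p.proname, a.aggtransspace
FROM pg_aggregate a
JOIN pg_proc p ON p.oid = a.aggfnoid
WHERE p.proname = 'my_array_collect';
```

Per the documentation text in the diff, a larger SSPACE makes the estimated hash table bigger relative to work_mem, so it should make the planner less willing to choose hash aggregation for queries that group with this aggregate.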