Doc: improve tutorial section about grouped aggregates.

Commit fede15417 introduced FILTER by jamming it into the existing example introducing HAVING, which seems pedagogically poor to me; and it added no information about what the keyword actually does. Not to mention that the claimed output didn't match the sample data being used in this running example. Revert that and instead make an independent example using FILTER. To help drive home the point that it's a per-aggregate filter, we need to use two aggregates not just one; for consistency expand all the examples in this segment to do that. Also adjust the example using WHERE ... LIKE so that it'd produce nonempty output with this sample data, and show that output. Back-patch, as the previous patch was. (Sadly, v10 is now out of scope.) Discussion: https://postgr.es/m/166794307526.652.9073408178177444190@wrigleys.postgresql.org
author: Tom Lane <tgl@sss.pgh.pa.us> 2022-11-08 18:25:03 -0500
committer: Tom Lane <tgl@sss.pgh.pa.us> 2022-11-08 18:25:03 -0500
commit: 362ba3e932c0f343a199b74f94ca7b22f4f8448d (patch)
tree: 7c21e1b6de046b6c67d37d1ccd2688a0931e731b
parent: c5dc80c1bc216f0e21a2f79f5e0415c2d4cfb35d (diff)
download: postgresql-362ba3e932c0f343a199b74f94ca7b22f4f8448d.tar.gz
postgresql-362ba3e932c0f343a199b74f94ca7b22f4f8448d.zip
1 files changed, 49 insertions, 16 deletions
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index 9046d7c9fbe..a864d146f02 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -706,40 +706,39 @@ SELECT city FROM weather
     <indexterm><primary>HAVING</primary></indexterm>
 
     Aggregates are also very useful in combination with <literal>GROUP
-    BY</literal> clauses.  For example, we can get the maximum low
-    temperature observed in each city with:
+    BY</literal> clauses.  For example, we can get the number of readings
+    and the maximum low temperature observed in each city with:
 
 <programlisting>
-SELECT city, max(temp_lo)
+SELECT city, count(*), max(temp_lo)
     FROM weather
     GROUP BY city;
 </programlisting>
 
 <screen>
-     city      | max
----------------+-----
- Hayward       |  37
- San Francisco |  46
+     city      | count | max
+---------------+-------+-----
+ Hayward       |     1 |  37
+ San Francisco |     2 |  46
 (2 rows)
 </screen>
 
     which gives us one output row per city.  Each aggregate result is
     computed over the table rows matching that city.
     We can filter these grouped
-    rows using <literal>HAVING</literal> and the output count using
-    <literal>FILTER</literal>:
+    rows using <literal>HAVING</literal>:
 
 <programlisting>
-SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo &lt; 30)
+SELECT city, count(*), max(temp_lo)
     FROM weather
     GROUP BY city
     HAVING max(temp_lo) &lt; 40;
 </programlisting>
 
 <screen>
-  city   | max | count
----------+-----+-------
- Hayward |  37 |     5
+  city   | count | max
+---------+-------+-----
+ Hayward |     1 |  37
 (1 row)
 </screen>
 
@@ -749,12 +748,18 @@ SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo &lt; 30)
     names begin with <quote><literal>S</literal></quote>, we might do:
 
 <programlisting>
-SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo &lt; 30)
+SELECT city, count(*), max(temp_lo)
     FROM weather
     WHERE city LIKE 'S%'            -- <co id="co.tutorial-agg-like"/>
-    GROUP BY city
-    HAVING max(temp_lo) &lt; 40;
+    GROUP BY city;
 </programlisting>
+
+<screen>
+     city      | count | max
+---------------+-------+-----
+ San Francisco |     2 |  46
+(1 row)
+</screen>
    <calloutlist>
     <callout arearefs="co.tutorial-agg-like">
      <para>
@@ -791,6 +796,34 @@ SELECT city, max(temp_lo), count(*) FILTER (WHERE temp_lo &lt; 30)
     because we avoid doing the grouping and aggregate calculations
     for all rows that fail the <literal>WHERE</literal> check.
    </para>
+
+   <para>
+    Another way to select the rows that go into an aggregate
+    computation is to use <literal>FILTER</literal>, which is a
+    per-aggregate option:
+
+<programlisting>
+SELECT city, count(*) FILTER (WHERE temp_lo &lt; 45), max(temp_lo)
+    FROM weather
+    GROUP BY city;
+</programlisting>
+
+<screen>
+     city      | count | max
+---------------+-------+-----
+ Hayward       |     1 |  37
+ San Francisco |     1 |  46
+(2 rows)
+</screen>
+
+    <literal>FILTER</literal> is much like <literal>WHERE</literal>,
+    except that it removes rows only from the input of the particular
+    aggregate function that it is attached to.
+    Here, the <literal>count</literal> aggregate counts only
+    rows with <literal>temp_lo</literal> below 45; but the
+    <literal>max</literal> aggregate is still applied to all rows,
+    so it still finds the reading of 46.
+   </para>
   </sect1>
author	Tom Lane <tgl@sss.pgh.pa.us>	2022-11-08 18:25:03 -0500
committer	Tom Lane <tgl@sss.pgh.pa.us>	2022-11-08 18:25:03 -0500
commit	362ba3e932c0f343a199b74f94ca7b22f4f8448d (patch)
tree	7c21e1b6de046b6c67d37d1ccd2688a0931e731b
parent	c5dc80c1bc216f0e21a2f79f5e0415c2d4cfb35d (diff)
download	postgresql-362ba3e932c0f343a199b74f94ca7b22f4f8448d.tar.gz postgresql-362ba3e932c0f343a199b74f94ca7b22f4f8448d.zip