Account for optimized MinMax aggregates during SS_finalize_plan.

We are capable of optimizing MIN() and MAX() aggregates on indexed columns into subqueries that exploit the index, rather than the normal thing of scanning the whole table. When we do this, we replace the Aggref node(s) with Params referencing subquery outputs. Such Params really ought to be included in the per-plan-node extParam/allParam sets computed by SS_finalize_plan. However, we've never done so up to now because of an ancient implementation choice to perform that substitution during set_plan_references, which runs after SS_finalize_plan, so that SS_finalize_plan never sees these Params. The cleanest fix would be to perform a separate tree walk to do these substitutions before SS_finalize_plan runs. That seems unattractive, first because a whole-tree mutation pass is expensive, and second because we lack infrastructure for visiting expression subtrees in a Plan tree, so that we'd need a new function knowing as much as SS_finalize_plan knows about that. I also considered swapping the order of SS_finalize_plan and set_plan_references, but that fell foul of various assumptions that seem tricky to fix. So the approach adopted here is to teach SS_finalize_plan itself to check for such Aggrefs. I refactored things a bit in setrefs.c to avoid having three copies of the code that does that. Back-patch of v17 commits d0d44049d and 779ac2c74. When d0d44049d went in, there was no evidence that it was fixing a reachable bug, so I refrained from back-patching. Now we have such evidence. Per bug #18465 from Hal Takahara. Back-patch to all supported branches. Discussion: https://postgr.es/m/18465-2fae927718976b22@postgresql.org Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us
author: Tom Lane <tgl@sss.pgh.pa.us> 2024-05-18 14:31:35 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2024-05-18 14:31:35 -0400
commit: 686c995fc2223426706ee3f2472e220e7aac4041 (patch)
tree: a725cb2c46d4f3e59aab15a0c984282145098681 /src/test
parent: bcd2be0c2f7e494597ad02a69127c82478369b07 (diff)
download: postgresql-686c995fc2223426706ee3f2472e220e7aac4041.tar.gz
postgresql-686c995fc2223426706ee3f2472e220e7aac4041.zip
2 files changed, 41 insertions, 0 deletions
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 462457c9799..20c217a93b9 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1178,6 +1178,37 @@ NOTICE:  drop cascades to 3 other objects
 DETAIL:  drop cascades to table minmaxtest1
 drop cascades to table minmaxtest2
 drop cascades to table minmaxtest3
+-- DISTINCT can also trigger wrong answers with hash aggregation (bug #18465)
+begin;
+set local enable_sort = off;
+explain (costs off)
+  select f1, (select distinct min(t1.f1) from int4_tbl t1 where t1.f1 = t0.f1)
+  from int4_tbl t0;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Seq Scan on int4_tbl t0
+   SubPlan 2
+     ->  HashAggregate
+           Group Key: $1
+           InitPlan 1 (returns $1)
+             ->  Limit
+                   ->  Seq Scan on int4_tbl t1
+                         Filter: ((f1 IS NOT NULL) AND (f1 = t0.f1))
+           ->  Result
+(9 rows)
+
+select f1, (select distinct min(t1.f1) from int4_tbl t1 where t1.f1 = t0.f1)
+from int4_tbl t0;
+     f1      |     min     
+-------------+-------------
+           0 |           0
+      123456 |      123456
+     -123456 |     -123456
+  2147483647 |  2147483647
+ -2147483647 | -2147483647
+(5 rows)
+
+rollback;
 -- check for correct detection of nested-aggregate errors
 select max(min(unique1)) from tenk1;
 ERROR:  aggregate function calls cannot be nested
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 0f89b7680e4..e870e044248 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -394,6 +394,16 @@ select distinct min(f1), max(f1) from minmaxtest;
 
 drop table minmaxtest cascade;
 
+-- DISTINCT can also trigger wrong answers with hash aggregation (bug #18465)
+begin;
+set local enable_sort = off;
+explain (costs off)
+  select f1, (select distinct min(t1.f1) from int4_tbl t1 where t1.f1 = t0.f1)
+  from int4_tbl t0;
+select f1, (select distinct min(t1.f1) from int4_tbl t1 where t1.f1 = t0.f1)
+from int4_tbl t0;
+rollback;
+
 -- check for correct detection of nested-aggregate errors
 select max(min(unique1)) from tenk1;
 select (select max(min(unique1)) from int8_tbl) from tenk1;
author	Tom Lane <tgl@sss.pgh.pa.us>	2024-05-18 14:31:35 -0400
committer	Tom Lane <tgl@sss.pgh.pa.us>	2024-05-18 14:31:35 -0400
commit	686c995fc2223426706ee3f2472e220e7aac4041 (patch)
tree	a725cb2c46d4f3e59aab15a0c984282145098681 /src/test
parent	bcd2be0c2f7e494597ad02a69127c82478369b07 (diff)
download	postgresql-686c995fc2223426706ee3f2472e220e7aac4041.tar.gz postgresql-686c995fc2223426706ee3f2472e220e7aac4041.zip