Fix behavior of float aggregates for single Inf or NaN inputs.

When there is just one non-null input value, and it is infinity or NaN, aggregates such as stddev_pop and covar_pop should produce a NaN result, because the calculation is not well-defined. They used to do so, but since we adopted Youngs-Cramer aggregation in commit e954a727f, they produced zero instead. That's an oversight, so fix it. Add tests exercising these edge cases.

Affected aggregates are

var_pop(double precision)
stddev_pop(double precision)
var_pop(real)
stddev_pop(real)
regr_sxx(double precision,double precision)
regr_syy(double precision,double precision)
regr_sxy(double precision,double precision)
regr_r2(double precision,double precision)
regr_slope(double precision,double precision)
regr_intercept(double precision,double precision)
covar_pop(double precision,double precision)
corr(double precision,double precision)

Back-patch to v12 where the behavior change was accidentally introduced.

Report and patch by me; thanks to Dean Rasheed for review.

Discussion: https://postgr.es/m/353062.1591898766@sss.pgh.pa.us
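To make the edge case concrete, here is a minimal standalone sketch in C of a Youngs-Cramer transition step with the first-input check. It is an illustration only, not the PostgreSQL code: the names `VarState`, `var_accum`, and `var_pop_final` are hypothetical, and the overflow error reporting done by the real transition functions is left out.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical transition state: N, Sx = sum(X), Sxx = sum((X - Sx/N)^2). */
typedef struct
{
    double N;
    double Sx;
    double Sxx;
} VarState;

/* One Youngs-Cramer update step (overflow checks omitted in this sketch). */
static void
var_accum(VarState *st, double newval)
{
    double N = st->N + 1.0;
    double Sx = st->Sx + newval;
    double Sxx = st->Sxx;

    if (st->N > 0.0)
    {
        /* Second and later inputs: incremental update of Sxx. */
        double tmp = newval * N - Sx;

        Sxx += tmp * tmp / (N * st->N);
    }
    else
    {
        /*
         * First input: Sxx normally stays 0, but an Inf or NaN input
         * must force it to NaN so a lone such value does not report
         * variance zero.
         */
        if (isnan(newval) || isinf(newval))
            Sxx = NAN;
    }

    st->N = N;
    st->Sx = Sx;
    st->Sxx = Sxx;
}

/* Population variance; in this sketch the caller must supply N > 0. */
static double
var_pop_final(const VarState *st)
{
    assert(st->N > 0.0);
    return st->Sxx / st->N;
}
```

With this sketch, feeding a single `INFINITY` or `NAN` into a zeroed `VarState` makes `var_pop_final` return NaN, while a single finite input still gives zero. Without the `else` branch, the single-Inf case would give zero as well, which is the behavior this commit corrects.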
author: Tom Lane <tgl@sss.pgh.pa.us> 2020-06-13 13:43:24 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2020-06-13 13:43:40 -0400
commit: 03109a53020e4663df3a8d822cdc665a7d712a93 (patch)
tree: a4c7c98f8e90ea76f8ea34e3a36ec63b66b2dcc2 /src/backend/utils/adt/float.c
parent: d64f1cdf2f4bef1454c74af1028c9ea0c3280322 (diff)
1 files changed, 35 insertions, 0 deletions
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index 2101d586744..6a717f19bba 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2925,6 +2925,17 @@ float8_accum(PG_FUNCTION_ARGS)
 			Sxx = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx as 0.  However, if
+		 * the first input is Inf or NaN, we'd better force Sxx to NaN;
+		 * otherwise we will falsely report variance zero when there are no
+		 * more inputs.
+		 */
+		if (isnan(newval) || isinf(newval))
+			Sxx = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2999,6 +3010,17 @@ float4_accum(PG_FUNCTION_ARGS)
 			Sxx = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx as 0.  However, if
+		 * the first input is Inf or NaN, we'd better force Sxx to NaN;
+		 * otherwise we will falsely report variance zero when there are no
+		 * more inputs.
+		 */
+		if (isnan(newval) || isinf(newval))
+			Sxx = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -3225,6 +3247,19 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 				Sxy = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx et al as 0.  However,
+		 * if the first input is Inf or NaN, we'd better force the dependent
+		 * sums to NaN; otherwise we will falsely report variance zero when
+		 * there are no more inputs.
+		 */
+		if (isnan(newvalX) || isinf(newvalX))
+			Sxx = Sxy = get_float8_nan();
+		if (isnan(newvalY) || isinf(newvalY))
+			Syy = Sxy = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
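The two-variable hunk above can be illustrated with a similar standalone sketch in C. Again this is an assumption-laden illustration rather than the PostgreSQL implementation: `RegrState`, `regr_accum`, and `covar_pop_final` are hypothetical names, and overflow error reporting is omitted. It shows that a non-finite X poisons Sxx and Sxy, a non-finite Y poisons Syy and Sxy, and the unaffected sum stays at zero.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical state: N, sums of X and Y, and the centered sums. */
typedef struct
{
    double N;
    double Sx, Sxx;
    double Sy, Syy;
    double Sxy;
} RegrState;

/* One Youngs-Cramer update for an (X, Y) pair (overflow checks omitted). */
static void
regr_accum(RegrState *st, double newvalX, double newvalY)
{
    double N = st->N + 1.0;
    double Sx = st->Sx + newvalX;
    double Sy = st->Sy + newvalY;
    double Sxx = st->Sxx;
    double Syy = st->Syy;
    double Sxy = st->Sxy;

    if (st->N > 0.0)
    {
        /* Second and later inputs: incremental update of all sums. */
        double tmpX = newvalX * N - Sx;
        double tmpY = newvalY * N - Sy;
        double scale = 1.0 / (N * st->N);

        Sxx += tmpX * tmpX * scale;
        Syy += tmpY * tmpY * scale;
        Sxy += tmpX * tmpY * scale;
    }
    else
    {
        /* First input: force only the dependent sums to NaN. */
        if (isnan(newvalX) || isinf(newvalX))
            Sxx = Sxy = NAN;
        if (isnan(newvalY) || isinf(newvalY))
            Syy = Sxy = NAN;
    }

    st->N = N;
    st->Sx = Sx;
    st->Sxx = Sxx;
    st->Sy = Sy;
    st->Syy = Syy;
    st->Sxy = Sxy;
}

/* Population covariance; in this sketch the caller must supply N > 0. */
static double
covar_pop_final(const RegrState *st)
{
    assert(st->N > 0.0);
    return st->Sxy / st->N;
}
```

For example, accumulating the single pair `(INFINITY, 1.0)` into a zeroed `RegrState` leaves `Syy` at zero but makes `Sxx` and `Sxy` NaN, so `covar_pop_final` returns NaN instead of zero.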