author | Tom Lane <tgl@sss.pgh.pa.us> | 2020-06-13 13:43:24 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2020-06-13 13:43:24 -0400 |
commit | 874372a941a6ee31ba4034db0367e0929e5f49bc (patch) | |
tree | edc8d95f6455518da6ae061821968b0696fb85d0 /src/backend/utils/adt/float.c | |
parent | 81cd796b825cf8d09bb6acea75482f9583e6a951 (diff) | |
Fix behavior of float aggregates for single Inf or NaN inputs.
When there is just one non-null input value, and it is infinity or NaN,
aggregates such as stddev_pop and covar_pop should produce a NaN
result, because the calculation is not well-defined. They used to do
so, but since we adopted Youngs-Cramer aggregation in commit e954a727f,
they produced zero instead. That's an oversight, so fix it. Add tests
exercising these edge cases.
Affected aggregates are
var_pop(double precision)
stddev_pop(double precision)
var_pop(real)
stddev_pop(real)
regr_sxx(double precision,double precision)
regr_syy(double precision,double precision)
regr_sxy(double precision,double precision)
regr_r2(double precision,double precision)
regr_slope(double precision,double precision)
regr_intercept(double precision,double precision)
covar_pop(double precision,double precision)
corr(double precision,double precision)
Back-patch to v12 where the behavior change was accidentally introduced.
Report and patch by me; thanks to Dean Rasheed for review.
Discussion: https://postgr.es/m/353062.1591898766@sss.pgh.pa.us
Diffstat (limited to 'src/backend/utils/adt/float.c')
-rw-r--r-- | src/backend/utils/adt/float.c | 35 |
1 files changed, 35 insertions, 0 deletions
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
index 58194ccbdac..93355493fee 100644
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2932,6 +2932,17 @@ float8_accum(PG_FUNCTION_ARGS)
 				Sxx = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx as 0. However, if
+		 * the first input is Inf or NaN, we'd better force Sxx to NaN;
+		 * otherwise we will falsely report variance zero when there are no
+		 * more inputs.
+		 */
+		if (isnan(newval) || isinf(newval))
+			Sxx = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -3006,6 +3017,17 @@ float4_accum(PG_FUNCTION_ARGS)
 				Sxx = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx as 0. However, if
+		 * the first input is Inf or NaN, we'd better force Sxx to NaN;
+		 * otherwise we will falsely report variance zero when there are no
+		 * more inputs.
+		 */
+		if (isnan(newval) || isinf(newval))
+			Sxx = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -3232,6 +3254,19 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 				Sxy = get_float8_nan();
 		}
 	}
+	else
+	{
+		/*
+		 * At the first input, we normally can leave Sxx et al as 0. However,
+		 * if the first input is Inf or NaN, we'd better force the dependent
+		 * sums to NaN; otherwise we will falsely report variance zero when
+		 * there are no more inputs.
+		 */
+		if (isnan(newvalX) || isinf(newvalX))
+			Sxx = Sxy = get_float8_nan();
+		if (isnan(newvalY) || isinf(newvalY))
+			Syy = Sxy = get_float8_nan();
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
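For readers who want to see the effect of the new else-branch outside the backend, the following is a minimal standalone sketch of a Youngs-Cramer style accumulator with the same first-input check. It is not the code in float.c: the names VarState, var_accum, var_pop and run_examples are hypothetical, the transition state is a plain struct rather than a float8 array, and the overflow error that the real functions raise for finite inputs is omitted (the sketch just forces Sxx to NaN).

```c
#include <assert.h>
#include <math.h>

/* Hypothetical transition state: count, sum, and sum of squared deviations. */
typedef struct VarState
{
	double		N;
	double		Sx;
	double		Sxx;
} VarState;

/* Add one input value using the Youngs-Cramer update. */
void
var_accum(VarState *st, double newval)
{
	double		oldN = st->N;

	st->N += 1.0;
	st->Sx += newval;

	if (oldN > 0.0)
	{
		double		tmp = newval * st->N - st->Sx;

		st->Sxx += tmp * tmp / (st->N * oldN);

		/* Simplification: no overflow error, just poison Sxx. */
		if (isinf(st->Sx) || isinf(st->Sxx))
			st->Sxx = NAN;
	}
	else
	{
		/*
		 * First input: Sxx can normally stay 0, but an Inf or NaN input
		 * must force it to NaN, or a lone such input reports variance 0.
		 */
		if (isnan(newval) || isinf(newval))
			st->Sxx = NAN;
	}
}

/* Population variance; NaN stands in for SQL NULL when there is no input. */
double
var_pop(const VarState *st)
{
	if (st->N < 1.0)
		return NAN;
	return st->Sxx / st->N;
}

/* Small self-check of the edge cases discussed in the commit message. */
void
run_examples(void)
{
	VarState	one_inf = {0.0, 0.0, 0.0};
	VarState	one_finite = {0.0, 0.0, 0.0};

	var_accum(&one_inf, INFINITY);
	var_accum(&one_finite, 5.0);

	assert(isnan(var_pop(&one_inf)));	/* single Inf yields NaN */
	assert(var_pop(&one_finite) == 0.0);	/* single finite value yields 0 */
}
```

Without the else-branch, the single-Inf case in run_examples would leave Sxx at 0 and var_pop would return 0, which is the regression this commit fixes.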