Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD.

The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that *every* non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct").

Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.
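The arithmetic described in the message can be illustrated outside the backend with a small standalone sketch. This is not the code in analyze.c: the SketchStats struct and the sketch_all_too_wide() function are hypothetical stand-ins invented for illustration, and only the counter names (null_cnt, nonnull_cnt, toowide_cnt, total_width, samplerows, is_varwidth) are borrowed from the patch. It covers only the "all non-null values were too wide" case.

```c
#include <assert.h>

/* Hypothetical stand-in for the few output fields the patch sets;
 * this is NOT the real VacAttrStats struct from PostgreSQL. */
typedef struct SketchStats
{
	int		stats_valid;	/* 1 if a statistics entry would be produced */
	double	stanullfrac;	/* fraction of sampled rows that were NULL */
	double	stawidth;		/* average width of non-null values, in bytes */
	double	stadistinct;	/* -1.0 means "all values distinct" */
} SketchStats;

/* Illustrative only: mirrors the new branch for a sample in which every
 * non-null value exceeded the width threshold. typlen is used only for
 * fixed-width types (is_varwidth == 0). */
static SketchStats
sketch_all_too_wide(int samplerows, int null_cnt, int nonnull_cnt,
					int toowide_cnt, double total_width,
					int is_varwidth, int typlen)
{
	SketchStats s = {0, 0.0, 0.0, 0.0};

	if (nonnull_cnt > 0)
	{
		/* This sketch assumes no value was narrow enough to be kept */
		assert(nonnull_cnt == toowide_cnt);
		s.stats_valid = 1;
		/* Null fraction is still known exactly from the sample */
		s.stanullfrac = (double) null_cnt / (double) samplerows;
		/* Too-wide values were counted in total_width, so the average holds */
		if (is_varwidth)
			s.stawidth = total_width / (double) nonnull_cnt;
		else
			s.stawidth = (double) typlen;
		/* Assume all too-wide values are distinct */
		s.stadistinct = -1.0;
	}
	return s;
}
```

For example, with 100 sampled rows of which 20 are null and 80 are non-null values of 2048 bytes each, calling sketch_all_too_wide(100, 20, 80, 80, 163840.0, 1, -1) gives a null fraction of 0.2, an average width of 2048 bytes, and stadistinct of -1, where before the fix no entry would have been produced at all.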
author: Tom Lane <tgl@sss.pgh.pa.us> 2014-01-11 13:41:41 -0500
committer: Tom Lane <tgl@sss.pgh.pa.us> 2014-01-11 13:41:51 -0500
commit: 36785a21ba6003ddb846cd736170b9c194e97766 (patch)
tree: 38c8a6df15d36ee728fd6a20b7f6086009318ff5 /src/backend/commands
parent: a25c2b7c4db3b4542e05d660e55bef5c93fdc32d (diff)
download: postgresql-36785a21ba6003ddb846cd736170b9c194e97766.tar.gz
postgresql-36785a21ba6003ddb846cd736170b9c194e97766.zip
1 files changed, 15 insertions, 1 deletions
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index d6d20fde9af..5f9674699e2 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -2731,7 +2731,21 @@ compute_scalar_stats(VacAttrStatsP stats,
 			slot_idx++;
 		}
 	}
-	else if (nonnull_cnt == 0 && null_cnt > 0)
+	else if (nonnull_cnt > 0)
+	{
+		/* We found some non-null values, but they were all too wide */
+		Assert(nonnull_cnt == toowide_cnt);
+		stats->stats_valid = true;
+		/* Do the simple null-frac and width stats */
+		stats->stanullfrac = (double) null_cnt / (double) samplerows;
+		if (is_varwidth)
+			stats->stawidth = total_width / (double) nonnull_cnt;
+		else
+			stats->stawidth = stats->attrtype->typlen;
+		/* Assume all too-wide values are distinct, so it's a unique column */
+		stats->stadistinct = -1.0;
+	}
+	else if (null_cnt > 0)
 	{
 		/* We found only nulls; assume the column is entirely null */
 		stats->stats_valid = true;