author | Tom Lane <tgl@sss.pgh.pa.us> | 2014-01-11 13:41:41 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2014-01-11 13:41:56 -0500 |
commit | f0381680fa480a8794d00566cef21178cd63d74e | |
tree | 1265d4f0672087238414b0c3837718e0fe0fe217 | |
parent | 799728b0baa5b9f7c8dd6991808d1ed049e83b76 | |
Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD.
The standard typanalyze functions skip over values whose detoasted size
exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during
ANALYZE. However, we (I think I, actually :-() failed to consider the
possibility that *every* non-null value in a column is too wide. While
compute_minimal_stats() seems to behave reasonably anyway in such a case,
compute_scalar_stats() just fell through and generated no pg_statistic
entry at all. That's unnecessarily pessimistic: we can still produce
valid stanullfrac and stawidth values in such cases, since we do include
too-wide values in the average-width calculation. Furthermore, since the
general assumption in this code is that too-wide values are probably all
distinct from each other, it seems reasonable to set stadistinct to -1
("all distinct").
Per complaint from Kadri Raudsepp. This has been like this since roughly
neolithic times, so back-patch to all supported branches.
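
The fallback described above can be sketched in isolation. This is a minimal illustration, not the PostgreSQL code itself: `SketchStats` and `sketch_fallback_stats` are hypothetical names standing in for the few `VacAttrStats` fields the patch touches, the counters are passed in as plain arguments, and the sketch only covers the situation where no sampled value was narrow enough to analyze normally. The `stanullfrac = 1.0` assignment in the nulls-only branch is an assumption drawn from the "entirely null" comment; that part of the function is not shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VacAttrStats fields set by the patch. */
typedef struct SketchStats
{
	bool		stats_valid;
	double		stanullfrac;
	int			stawidth;
	double		stadistinct;
} SketchStats;

/*
 * Sketch of the branch order after the fix, for the case where no sampled
 * value was narrow enough for the normal sort-based analysis.
 */
void
sketch_fallback_stats(SketchStats *stats,
					  int null_cnt, int nonnull_cnt, int toowide_cnt,
					  double total_width, bool is_varwidth,
					  int typlen, int samplerows)
{
	stats->stats_valid = false;

	if (nonnull_cnt > 0)
	{
		/* Non-null values exist, but every one of them was too wide */
		assert(nonnull_cnt == toowide_cnt);
		stats->stats_valid = true;
		stats->stanullfrac = (double) null_cnt / (double) samplerows;
		/* Too-wide values still count toward the average width */
		if (is_varwidth)
			stats->stawidth = (int) (total_width / (double) nonnull_cnt);
		else
			stats->stawidth = typlen;
		/* -1 means "all distinct": assume too-wide values never repeat */
		stats->stadistinct = -1.0;
	}
	else if (null_cnt > 0)
	{
		/* Only nulls were seen; assumed here: the null fraction is 1 */
		stats->stats_valid = true;
		stats->stanullfrac = 1.0;
	}
}
```

For example, a sample of 100 rows with 25 nulls and 75 too-wide values totalling 150000 bytes would give a null fraction of 0.25, an average width of 2000, and a distinct estimate of -1. Before the fix, that sample produced no pg_statistic entry at all.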
-rw-r--r-- | src/backend/commands/analyze.c | 16 |
1 files changed, 15 insertions, 1 deletions
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 9612a276f35..4fdb67d1b37 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -2727,7 +2727,21 @@ compute_scalar_stats(VacAttrStatsP stats,
 			slot_idx++;
 		}
 	}
-	else if (nonnull_cnt == 0 && null_cnt > 0)
+	else if (nonnull_cnt > 0)
+	{
+		/* We found some non-null values, but they were all too wide */
+		Assert(nonnull_cnt == toowide_cnt);
+		stats->stats_valid = true;
+		/* Do the simple null-frac and width stats */
+		stats->stanullfrac = (double) null_cnt / (double) samplerows;
+		if (is_varwidth)
+			stats->stawidth = total_width / (double) nonnull_cnt;
+		else
+			stats->stawidth = stats->attrtype->typlen;
+		/* Assume all too-wide values are distinct, so it's a unique column */
+		stats->stadistinct = -1.0;
+	}
+	else if (null_cnt > 0)
 	{
 		/* We found only nulls; assume the column is entirely null */
 		stats->stats_valid = true;