author Tom Lane <tgl@sss.pgh.pa.us> 2019-01-24 11:31:54 -0500
committer Tom Lane <tgl@sss.pgh.pa.us> 2019-01-24 11:31:54 -0500
commit e6c3ba7fbfd59ceabcf9bdaf52d71b44831b09d2
tree 20b9ed2e577c6908b2cd4830be04323bb49108e2 /doc/src
parent 19184fcc09739abf75ccdada965ed6135c6d07c3
Fix portability problem in pgbench.
The pgbench regression test supposed that srandom() with a specific value would result in deterministic output from random(), as required by POSIX. It emerges however that OpenBSD is too smart to be constrained by mere standards, so their random() emits nondeterministic output anyway. While a workaround does exist, what seems like a better fix is to stop relying on the platform's srandom()/random() altogether, so that what you get from --random-seed=N is not merely deterministic but platform independent. Hence, use a separate pg_jrand48() random sequence in place of random().

Also adjust the regression test case that's supposed to detect nondeterminism so that it's more likely to detect it; the original choice of random_zipfian parameter tended to produce the same output all the time even if the underlying behavior wasn't deterministic.

In passing, improve pgbench's docs about random_zipfian().

Back-patch to v11 where this code was introduced.

Fabien Coelho and Tom Lane

Discussion: https://postgr.es/m/4615.1547792324@sss.pgh.pa.us
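To illustrate why a self-contained generator gives platform-independent output, here is a minimal Python sketch of the 48-bit linear congruential recurrence that POSIX specifies for the jrand48() family. It is not pgbench's pg_jrand48() code; the function names and the way the seed is turned into an initial state are assumptions made for this example only.

```python
# Constants of the POSIX drand48/jrand48 recurrence:
#   X(n+1) = (A * X(n) + C) mod 2**48
A = 0x5DEECE66D
C = 0xB
MASK48 = (1 << 48) - 1


def jrand48_step(state):
    """Advance the 48-bit state once; return (new_state, signed 32-bit value)."""
    state = (A * state + C) & MASK48
    out = state >> 16              # the high-order 32 bits of the state
    if out >= 1 << 31:             # reinterpret as a signed 32-bit integer
        out -= 1 << 32
    return state, out


def sequence(seed, n):
    """Hypothetical helper: use the seed directly as the initial 48-bit state."""
    state = seed & MASK48
    values = []
    for _ in range(n):
        state, value = jrand48_step(state)
        values.append(value)
    return values


if __name__ == "__main__":
    # The same seed always yields the same values, on any platform.
    print(sequence(42, 5))
```

Because the whole computation is plain integer arithmetic on an explicitly carried state, the output depends only on the seed and not on the C library's random().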
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/ref/pgbench.sgml 19
1 files changed, 14 insertions, 5 deletions
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 15ee7c0f2ba..9d185248346 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1604,15 +1604,24 @@ f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
in (1, 1000), a rejection method is used, based on
"Non-Uniform Random Variate Generation", Luc Devroye, p. 550-551,
Springer 1986. The distribution is not defined when the parameter's
- value is 1.0. The drawing performance is poor for parameter values
+ value is 1.0. The function's performance is poor for parameter values
close to and above 1.0 and on a small range.
</para>
<para>
- <replaceable>parameter</replaceable>
- defines how skewed the distribution is. The larger the <replaceable>parameter</replaceable>, the more
- frequently values to the beginning of the interval are drawn.
+ <replaceable>parameter</replaceable> defines how skewed the distribution
+ is. The larger the <replaceable>parameter</replaceable>, the more
+ frequently values closer to the beginning of the interval are drawn.
The closer to 0 <replaceable>parameter</replaceable> is,
- the flatter (more uniform) the access distribution.
+ the flatter (more uniform) the output distribution.
+ The distribution is such that, assuming the range starts from 1,
+ the ratio of the probability of drawing <replaceable>k</replaceable>
+ versus drawing <replaceable>k+1</replaceable> is
+ <literal>((<replaceable>k</replaceable>+1)/<replaceable>k</replaceable>)**<replaceable>parameter</replaceable></literal>.
+ For example, <literal>random_zipfian(1, ..., 2.5)</literal> produces
+ the value <literal>1</literal> about <literal>(2/1)**2.5 =
+ 5.66</literal> times more frequently than <literal>2</literal>, which
+ itself is produced <literal>(3/2)**2.5 = 2.76</literal> times more
+ frequently than <literal>3</literal>, and so on.
</para>
</listitem>
</itemizedlist>
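The ratio rule described in the added documentation text can be checked numerically. The following Python sketch assumes the plain Zipf weighting, where the probability of k is proportional to 1/k**parameter over a finite range starting at 1; the function names are hypothetical and this is not how pgbench draws its values.

```python
def zipf_ratio(k, parameter):
    """Ratio of the probability of drawing k to that of drawing k+1."""
    return ((k + 1) / k) ** parameter


def zipf_probabilities(n, parameter):
    """Normalized probabilities for the values 1..n (assumed weights 1/k**parameter)."""
    weights = [1.0 / k ** parameter for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(round(zipf_ratio(1, 2.5), 2))   # (2/1)**2.5
    print(round(zipf_ratio(2, 2.5), 2))   # (3/2)**2.5
    probs = zipf_probabilities(1000, 2.5)
    print(probs[0] / probs[1])            # matches zipf_ratio(1, 2.5)
```

With parameter 2.5 the first ratio rounds to 5.66 and the second to 2.76, and the normalized probabilities reproduce the same ratios regardless of the range's upper bound.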