Increase number of hash join buckets for underestimate.

If we expect batching at the very beginning, we size nbuckets for "full work_mem" (see how many tuples we can get into work_mem, while not breaking NTUP_PER_BUCKET threshold).

If we expect to be fine without batching, we start with the 'right' nbuckets and track the optimal nbuckets as we go (without actually resizing the hash table). Once we hit work_mem (considering the optimal nbuckets value), we keep the value.

At the end of the first batch, we check whether (nbuckets != nbuckets_optimal) and resize the hash table if needed. Also, we keep this value for all batches (it's OK because it assumes full work_mem, and it makes the batchno evaluation trivial). So the resize happens only once.

There could be cases where it would improve performance to allow the NTUP_PER_BUCKET threshold to be exceeded to keep everything in one batch rather than spilling to a second batch, but attempts to generate such a case have so far been unsuccessful; that issue may be addressed with a follow-on patch after further investigation.

Tomas Vondra with minor format and comment cleanup by me
Reviewed by Robert Haas, Heikki Linnakangas, and Kevin Grittner
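The bookkeeping described in the message above can be illustrated with a minimal sketch. This is not the code from the patch: the struct `SketchHashTable`, the functions `sketch_track_optimal` and `sketch_resize_if_needed`, and the constant `SKETCH_NTUP_PER_BUCKET` are hypothetical names, the constant's value is chosen only for illustration, and the work_mem accounting that stops the growth in the real executor is left out. Only the field names mirror those added to HashJoinTableData.

```c
#include <assert.h>
#include <limits.h>

/* Assumed load-factor target for this sketch; not the executor's value. */
#define SKETCH_NTUP_PER_BUCKET 2

/* Hypothetical stand-in holding only the fields relevant to the idea. */
typedef struct SketchHashTable
{
	int		nbuckets;				/* buckets currently allocated */
	int		nbuckets_original;		/* buckets when the first hash started */
	int		nbuckets_optimal;		/* buckets we would like to have */
	int		log2_nbuckets_optimal;	/* log2 of nbuckets_optimal */
	int		nbatch;					/* number of batches (1 = no batching) */
	double	totalTuples;			/* tuples seen so far */
	double	skewTuples;				/* tuples that went to the skew table */
} SketchHashTable;

/*
 * Account for one inserted tuple.  While we are still in a single batch,
 * double the *desired* bucket count whenever the load factor would be
 * exceeded.  The bucket array itself is not touched here.
 */
static void
sketch_track_optimal(SketchHashTable *ht, int goes_to_skew)
{
	double	ntuples;

	assert(ht->nbuckets_optimal > 0);

	ht->totalTuples += 1;
	if (goes_to_skew)
	{
		ht->skewTuples += 1;
		return;
	}
	if (ht->nbatch != 1)
		return;					/* once batching starts, keep the value */

	ntuples = ht->totalTuples - ht->skewTuples;
	while (ntuples > (double) ht->nbuckets_optimal * SKETCH_NTUP_PER_BUCKET &&
		   ht->nbuckets_optimal <= INT_MAX / 2)
	{
		ht->nbuckets_optimal *= 2;
		ht->log2_nbuckets_optimal += 1;
	}
}

/*
 * At the end of the first batch: adopt the optimal size if it differs.
 * Returns 1 if a resize (rebuild of the bucket array) would be needed.
 */
static int
sketch_resize_if_needed(SketchHashTable *ht)
{
	if (ht->nbuckets == ht->nbuckets_optimal)
		return 0;
	ht->nbuckets = ht->nbuckets_optimal;
	return 1;
}
```

In this sketch, starting from 1024 buckets and inserting 5000 ordinary tuples leaves `nbuckets` at 1024 during the load while `nbuckets_optimal` grows to 4096; the single call to `sketch_resize_if_needed` afterwards is the only point at which the table size changes.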
author: Kevin Grittner <kgrittn@postgresql.org> 2014-10-13 10:16:36 -0500
committer: Kevin Grittner <kgrittn@postgresql.org> 2014-10-13 10:16:36 -0500
commit: 30d7ae3c76d2de144232ae6ab328ca86b70e72c3 (patch)
tree: 1ef25acf6cbb5843eff05a82e3282d4c314d7bea /src/include/executor
parent: 494affbd900d1c90de17414a575af1a085c3e37a (diff)
download: postgresql-30d7ae3c76d2de144232ae6ab328ca86b70e72c3.tar.gz
postgresql-30d7ae3c76d2de144232ae6ab328ca86b70e72c3.zip
1 files changed, 5 insertions, 0 deletions
diff --git a/src/include/executor/hashjoin.h b/src/include/executor/hashjoin.h
index c9e61dfa39c..0e1e0cd5f0f 100644
--- a/src/include/executor/hashjoin.h
+++ b/src/include/executor/hashjoin.h
@@ -127,6 +127,10 @@ typedef struct HashJoinTableData
 	int			nbuckets;		/* # buckets in the in-memory hash table */
 	int			log2_nbuckets;	/* its log2 (nbuckets must be a power of 2) */
 
+	int			nbuckets_original;	/* # buckets when starting the first hash */
+	int			nbuckets_optimal;	/* optimal # buckets (per batch) */
+	int			log2_nbuckets_optimal;	/* same as log2_nbuckets optimal */
+
 	/* buckets[i] is head of list of tuples in i'th in-memory bucket */
 	struct HashJoinTupleData **buckets;
 	/* buckets array is per-batch storage, as are all the tuples */
@@ -148,6 +152,7 @@ typedef struct HashJoinTableData
 	bool		growEnabled;	/* flag to shut off nbatch increases */
 
 	double		totalTuples;	/* # tuples obtained from inner plan */
+	double		skewTuples;		/* # tuples inserted into skew tuples */
 
 	/*
 	 * These arrays are allocated for the life of the hash join, but only if
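
The commit message notes that keeping one bucket count for all batches makes the batchno evaluation trivial. A simplified sketch of why, assuming the usual scheme in which the low bits of the hash value select the bucket and the next bits select the batch; the function name `sketch_bucket_and_batch` and its signature are hypothetical, not the executor's API:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split a 32-bit hash value into a bucket number (low log2_nbuckets bits)
 * and a batch number (the bits directly above them).  Both nbuckets and
 * nbatch are assumed to be powers of two.
 */
static void
sketch_bucket_and_batch(uint32_t hashvalue, int log2_nbuckets, int nbatch,
						int *bucketno, int *batchno)
{
	uint32_t	nbuckets;

	assert(log2_nbuckets >= 0 && log2_nbuckets < 31);
	assert(nbatch > 0 && (nbatch & (nbatch - 1)) == 0);

	nbuckets = (uint32_t) 1 << log2_nbuckets;
	*bucketno = (int) (hashvalue & (nbuckets - 1));

	if (nbatch > 1)
		*batchno = (int) ((hashvalue >> log2_nbuckets) &
						  (uint32_t) (nbatch - 1));
	else
		*batchno = 0;			/* single batch: everything stays in memory */
}
```

Because the batch bits sit directly above the bucket bits, changing `log2_nbuckets` after tuples have been assigned to batches would move the boundary and send the same hash value to a different batch. Fixing the bucket count once, at the end of the first batch, avoids that.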