aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor/nodeHash.c
diff options
context:
space:
mode:
authorAndres Freund <andres@anarazel.de>2019-08-02 00:02:46 -0700
committerAndres Freund <andres@anarazel.de>2019-08-02 00:02:46 -0700
commit2abd7ae9b20bcd810d4f19d28aefb97048813825 (patch)
tree9c6c87d5b07353c77a9608f866bfce5ef38e0cfa /src/backend/executor/nodeHash.c
parenta9f301df0e76c38d4544477c1b3e5e29d57904e6 (diff)
downloadpostgresql-2abd7ae9b20bcd810d4f19d28aefb97048813825.tar.gz
postgresql-2abd7ae9b20bcd810d4f19d28aefb97048813825.zip
Fix representation of hash keys in Hash/HashJoin nodes.
In 5f32b29c1819 I changed the creation of HashState.hashkeys to actually use HashState as the parent (instead of HashJoinState, which was incorrect, as they were executed below HashState), to fix the problem of hashkeys expressions otherwise relying on slot types appropriate for HashJoinState, rather than HashState as would be correct. That reliance was only introduced in 12, which is why it previously worked to use HashJoinState as the parent (although I'd be unsurprised if there were problematic cases). Unfortunately that's not a sufficient solution, because before this commit, the to-be-hashed expressions referenced inner/outer as appropriate for the HashJoin, not Hash. That didn't have obvious bad consequences, because the slots containing the tuples were put into ecxt_innertuple when hashing a tuple for HashState (even though Hash doesn't have an inner plan). There are less common cases where this can cause visible problems however (rather than just confusion when inspecting such executor trees). E.g. "ERROR: bogus varno: 65000", when explaining queries containing a HashJoin where the subsidiary Hash node's hash keys reference a subplan. While normally hashkeys aren't displayed by EXPLAIN, if one of those expressions references a subplan, that subplan may be printed as part of the Hash node - which then failed because an inner plan was referenced, and Hash doesn't have that. It seems quite possible that there's other broken cases, too. Fix the problem by properly splitting the expression for the HashJoin and Hash nodes at plan time, and have them reference the proper subsidiary node. While other workarounds are possible, fixing this correctly seems easy enough. It was a pretty ugly hack to have ExecInitHashJoin put the expression into the already initialized HashState, in the first place. I decided to not just split inner/outer hashkeys inside make_hashjoin(), but also to separate out hashoperators and hashcollations at plan time. Otherwise we would have ended up having two very similar loops, one at plan time and the other during executor startup. The work seems to more appropriately belong to plan time, anyway. Reported-By: Nikita Glukhov, Alexander Korotkov Author: Andres Freund Reviewed-By: Tom Lane, in an earlier version Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com Backpatch: 12-
Diffstat (limited to 'src/backend/executor/nodeHash.c')
-rw-r--r--src/backend/executor/nodeHash.c22
1 files changed, 14 insertions, 8 deletions
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index d16120b9c48..224cbb32bad 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -157,7 +157,8 @@ MultiExecPrivateHash(HashState *node)
econtext = node->ps.ps_ExprContext;
/*
- * get all inner tuples and insert into the hash table (or temp files)
+ * Get all tuples from the node below the Hash node and insert into the
+ * hash table (or temp files).
*/
for (;;)
{
@@ -165,7 +166,7 @@ MultiExecPrivateHash(HashState *node)
if (TupIsNull(slot))
break;
/* We have to compute the hash value */
- econtext->ecxt_innertuple = slot;
+ econtext->ecxt_outertuple = slot;
if (ExecHashGetHashValue(hashtable, econtext, hashkeys,
false, hashtable->keepNulls,
&hashvalue))
@@ -281,7 +282,7 @@ MultiExecParallelHash(HashState *node)
slot = ExecProcNode(outerNode);
if (TupIsNull(slot))
break;
- econtext->ecxt_innertuple = slot;
+ econtext->ecxt_outertuple = slot;
if (ExecHashGetHashValue(hashtable, econtext, hashkeys,
false, hashtable->keepNulls,
&hashvalue))
@@ -388,8 +389,9 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
/*
* initialize child expressions
*/
- hashstate->ps.qual =
- ExecInitQual(node->plan.qual, (PlanState *) hashstate);
+ Assert(node->plan.qual == NIL);
+ hashstate->hashkeys =
+ ExecInitExprList(node->hashkeys, (PlanState *) hashstate);
return hashstate;
}
@@ -1773,9 +1775,13 @@ ExecParallelHashTableInsertCurrentBatch(HashJoinTable hashtable,
* ExecHashGetHashValue
* Compute the hash value for a tuple
*
- * The tuple to be tested must be in either econtext->ecxt_outertuple or
- * econtext->ecxt_innertuple. Vars in the hashkeys expressions should have
- * varno either OUTER_VAR or INNER_VAR.
+ * The tuple to be tested must be in econtext->ecxt_outertuple (thus Vars in
+ * the hashkeys expressions need to have OUTER_VAR as varno). If outer_tuple
+ * is false (meaning it's the HashJoin's inner node, Hash), econtext,
+ * hashkeys, and slot need to be from Hash, with hashkeys/slot referencing and
+ * being suitable for tuples from the node below the Hash. Conversely, if
+ * outer_tuple is true, econtext is from HashJoin, and hashkeys/slot need to
+ * be appropriate for tuples from HashJoin's outer node.
*
* A true result means the tuple's hash value has been successfully computed
* and stored at *hashvalue. A false result means the tuple cannot match