Support "Right Anti Join" plan shapes.

Merge and hash joins can support antijoin with the non-nullable input on the right, using very simple combinations of their existing logic for right join and anti join. This gives the planner more freedom about how to order the join. It's particularly useful for hash join, since we may now have the option to hash the smaller table instead of the larger. Richard Guo, reviewed by Ronan Dunklau and myself Discussion: https://postgr.es/m/CAMbWs48xh9hMzXzSy3VaPzGAz+fkxXXTUbCLohX1_L8THFRm2Q@mail.gmail.com
author: Tom Lane <tgl@sss.pgh.pa.us> 2023-04-05 16:59:00 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2023-04-05 16:59:09 -0400
commit: 16dc2703c5413534d4989e08253e8f4fcb0e2aab (patch)
tree: c25f739d183d93510418b734ebd36fc3e2e9fde9 /src/backend/executor/nodeHashjoin.c
parent: dad50f677c42de207168a3f08982ba23c9fc6720 (diff)
download: postgresql-16dc2703c5413534d4989e08253e8f4fcb0e2aab.tar.gz
postgresql-16dc2703c5413534d4989e08253e8f4fcb0e2aab.zip
1 files changed, 23 insertions, 12 deletions
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 52ed05c6f5a..0a3f32f731d 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -86,7 +86,7 @@
  *  PHJ_BATCH_ALLOCATE*      -- one allocates buckets
  *  PHJ_BATCH_LOAD           -- all load the hash table from disk
  *  PHJ_BATCH_PROBE          -- all probe
- *  PHJ_BATCH_SCAN*          -- one does full/right unmatched scan
+ *  PHJ_BATCH_SCAN*          -- one does right/right-anti/full unmatched scan
  *  PHJ_BATCH_FREE*          -- one frees memory
  *
  * Batch 0 is a special case, because it starts out in phase
@@ -228,10 +228,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 
 				/*
 				 * If the outer relation is completely empty, and it's not
-				 * right/full join, we can quit without building the hash
-				 * table.  However, for an inner join it is only a win to
-				 * check this when the outer relation's startup cost is less
-				 * than the projected cost of building the hash table.
+				 * right/right-anti/full join, we can quit without building
+				 * the hash table.  However, for an inner join it is only a
+				 * win to check this when the outer relation's startup cost is
+				 * less than the projected cost of building the hash table.
 				 * Otherwise it's best to build the hash table first and see
 				 * if the inner relation is empty.  (When it's a left join, we
 				 * should always make this check, since we aren't going to be
@@ -520,6 +520,14 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 					}
 
 					/*
+					 * In a right-antijoin, we never return a matched tuple.
+					 * And we need to stay on the current outer tuple to
+					 * continue scanning the inner side for matches.
+					 */
+					if (node->js.jointype == JOIN_RIGHT_ANTI)
+						continue;
+
+					/*
 					 * If we only need to join to the first matching inner
 					 * tuple, then consider returning this one, but after that
 					 * continue with next outer tuple.
@@ -564,9 +572,10 @@ ExecHashJoinImpl(PlanState *pstate, bool parallel)
 			case HJ_FILL_INNER_TUPLES:
 
 				/*
-				 * We have finished a batch, but we are doing right/full join,
-				 * so any unmatched inner tuples in the hashtable have to be
-				 * emitted before we continue to the next batch.
+				 * We have finished a batch, but we are doing
+				 * right/right-anti/full join, so any unmatched inner tuples
+				 * in the hashtable have to be emitted before we continue to
+				 * the next batch.
 				 */
 				if (!(parallel ? ExecParallelScanHashTableForUnmatched(node, econtext)
 					  : ExecScanHashTableForUnmatched(node, econtext)))
@@ -732,6 +741,7 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
 				ExecInitNullTupleSlot(estate, innerDesc, &TTSOpsVirtual);
 			break;
 		case JOIN_RIGHT:
+		case JOIN_RIGHT_ANTI:
 			hjstate->hj_NullOuterTupleSlot =
 				ExecInitNullTupleSlot(estate, outerDesc, &TTSOpsVirtual);
 			break;
@@ -1027,8 +1037,9 @@ ExecHashJoinNewBatch(HashJoinState *hjstate)
 	 * side, but there are exceptions:
 	 *
 	 * 1. In a left/full outer join, we have to process outer batches even if
-	 * the inner batch is empty.  Similarly, in a right/full outer join, we
-	 * have to process inner batches even if the outer batch is empty.
+	 * the inner batch is empty.  Similarly, in a right/right-anti/full outer
+	 * join, we have to process inner batches even if the outer batch is
+	 * empty.
 	 *
 	 * 2. If we have increased nbatch since the initial estimate, we have to
 	 * scan inner batches since they might contain tuples that need to be
@@ -1349,8 +1360,8 @@ ExecReScanHashJoin(HashJoinState *node)
 			/*
 			 * Okay to reuse the hash table; needn't rescan inner, either.
 			 *
-			 * However, if it's a right/full join, we'd better reset the
-			 * inner-tuple match flags contained in the table.
+			 * However, if it's a right/right-anti/full join, we'd better
+			 * reset the inner-tuple match flags contained in the table.
 			 */
 			if (HJ_FILL_INNER(node))
 				ExecHashTableResetMatchFlags(node->hj_HashTable);
author	Tom Lane <tgl@sss.pgh.pa.us>	2023-04-05 16:59:00 -0400
committer	Tom Lane <tgl@sss.pgh.pa.us>	2023-04-05 16:59:09 -0400
commit	16dc2703c5413534d4989e08253e8f4fcb0e2aab (patch)
tree	c25f739d183d93510418b734ebd36fc3e2e9fde9 /src/backend/executor/nodeHashjoin.c
parent	dad50f677c42de207168a3f08982ba23c9fc6720 (diff)
download	postgresql-16dc2703c5413534d4989e08253e8f4fcb0e2aab.tar.gz postgresql-16dc2703c5413534d4989e08253e8f4fcb0e2aab.zip