[AURON #2193] Implement native support for inner residual join conditions on SMJ/SHJ #2197
Open
weimingdiit wants to merge 4 commits into apache:master
Pull request overview
Enables native conversion of Spark SMJ/SHJ inner joins that have an additional residual (non-equi) predicate by keeping the native join keyed-only and evaluating the residual predicate via a native Filter above the join.
Changes:
- Allow native SMJ/SHJ conversion for `InnerLike` joins with a residual `condition`, applying it as a native filter above the join.
- Add query tests covering SMJ/SHJ inner residual conditions (including force-SHJ mode) and a SparkConf test helper.
- Update TPCDS plan-stability golden files to reflect the new `NativeFilter`/`NativeProject` shapes.
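
The plan shape described in the changes above can be sketched with a toy model. This is only an illustration under stated assumptions: `Plan`, `Scan`, `KeyedJoin`, `Filter`, `InnerJoinSpec`, and `convert` are hypothetical names invented here, not Auron or Spark classes, and the predicate is kept as an opaque string.

```scala
// Toy plan model for illustration only; none of these types are Auron or Spark classes.
object ResidualPlanSketch {
  sealed trait Plan
  final case class Scan(table: String) extends Plan
  // A join node that only understands equi-join key pairs (left column, right column).
  final case class KeyedJoin(left: Plan, right: Plan, keys: Seq[(String, String)]) extends Plan
  final case class Filter(predicate: String, child: Plan) extends Plan

  // Planner-side description of an inner join: keys plus an optional residual predicate.
  final case class InnerJoinSpec(
      left: Plan,
      right: Plan,
      keys: Seq[(String, String)],
      residual: Option[String])

  // Keep the join keyed-only; if a residual predicate exists, evaluate it in a filter above the join.
  def convert(spec: InnerJoinSpec): Plan = {
    val join = KeyedJoin(spec.left, spec.right, spec.keys)
    spec.residual.fold[Plan](join)(pred => Filter(pred, join))
  }
}
```

With no residual predicate the sketch returns the bare keyed join, which corresponds to the case the overview says was already convertible.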
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| spark-extension/src/main/scala/org/apache/spark/sql/auron/AuronConverters.scala | Wrap inner residual join conditions with a native filter above native SMJ/SHJ output. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronSQLTestHelper.scala | Adds withSparkConf helper to temporarily set SparkConf/SparkEnv conf in tests. |
| spark-extension-shims-spark/src/test/scala/org/apache/auron/AuronQuerySuite.scala | Adds tests asserting native SMJ/SHJ presence for inner joins with residual predicates. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q95.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q92.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q85.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q81.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q72.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q68.txt | Updates golden plan to include NativeFilter/NativeProject changes. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q65.txt | Updates golden plan to include NativeFilter/NativeProject changes. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q64.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q6.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q48.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q46.txt | Updates golden plan to include NativeFilter/NativeProject changes. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q32.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q30.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q19.txt | Updates golden plan to include NativeFilter/NativeProject changes. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q15.txt | Updates golden plan to include NativeFilter above native join. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q13.txt | Updates golden plan to include NativeFilter/NativeProject changes. |
| dev/auron-it/src/main/resources/tpcds-plan-stability/spark-3.5/q1.txt | Updates golden plan to include NativeFilter above native join. |
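
The table mentions a test helper that temporarily sets configuration values. A minimal sketch of that general pattern (set, run, restore) is shown here, assuming a plain mutable map as a stand-in for the real configuration. The names `ConfHelperSketch`, `conf`, and `withConf` are hypothetical and do not reproduce the PR's `withSparkConf` implementation.

```scala
import scala.collection.mutable

object ConfHelperSketch {
  // Stand-in for a mutable configuration; a real SparkConf is deliberately not used here.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Apply the given key/value pairs, run the body, then restore the previous state
  // even if the body throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    // Remember the prior value (or absence) of every key we are about to overwrite.
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None) => conf.remove(k)
    }
  }
}
```

Restoring in a `finally` block keeps one test's settings from leaking into the next when an assertion fails inside the body.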
Which issue does this PR close?
Closes #2193
Rationale for this change
Auron currently converts SMJ/SHJ joins only when the join condition is empty. As a result, inner joins that have both equi-join keys and a residual predicate fall back to Spark even though the equi-join part is already native-compatible.
The native join plan only models equi-join keys today, so this change keeps the native join focused on the equi-join portion and evaluates the residual predicate with a native filter above the join output.
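
For inner joins, evaluating the residual predicate after a keyed-only join yields the same rows as evaluating it inside the join. The sketch below illustrates that equivalence on in-memory data. The `Order` and `Limit` row types and the predicate `amount > cap` are made up for the example and are not taken from the PR.

```scala
object ResidualJoinEquivalence {
  // Hypothetical row types used only for this illustration.
  final case class Order(custId: Int, amount: Int)
  final case class Limit(custId: Int, cap: Int)

  // Reference semantics: equi-join key and residual predicate evaluated together.
  def joinWithCondition(orders: Seq[Order], limits: Seq[Limit]): Seq[(Order, Limit)] =
    for {
      o <- orders
      l <- limits
      if o.custId == l.custId && o.amount > l.cap
    } yield (o, l)

  // Rewritten form: join on the key only, then apply the residual predicate as a filter.
  def keyedJoinThenFilter(orders: Seq[Order], limits: Seq[Limit]): Seq[(Order, Limit)] = {
    val keyed = for {
      o <- orders
      l <- limits
      if o.custId == l.custId
    } yield (o, l)
    keyed.filter { case (o, l) => o.amount > l.cap }
  }
}
```

The same rewrite would not be safe for outer joins, where a filter above the join would also drop the null-padded rows the join is required to keep; that is presumably why the change is limited to inner joins.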
What changes are included in this PR?
Are there any user-facing changes?
Yes. Inner joins with equi-join keys plus a residual predicate can now remain on the native SMJ/SHJ path instead of falling back entirely to Spark.
How was this patch tested?
CI.