[AURON #2170] Correctness Testing: All Spark Versions - Add Aggregate operator related tests #2213

Draft
ShreyeshArangath wants to merge 6 commits into apache:master from ShreyeshArangath:feat/correctness-aggregate-2170

@ShreyeshArangath
Contributor

Which issue does this PR close?

Closes #2170

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

How was this patch tested?

Introduce empty test modules for Spark 3.1/3.2/3.4/3.5/4.0/4.1 alongside
the existing spark33 module. Each module ships only a Maven pom and an
empty AuronSparkTestSettings stub so that profile activation and the
reflection lookup in common/SparkTestSettings both succeed.
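
As a rough illustration of why an empty stub is enough for a reflection lookup to succeed, the sketch below loads a settings class by name and instantiates it through its no-argument constructor. Everything here is a hypothetical stand-in: `TestSettings`, `EmptyStubSettings`, and `load` are invented for the example and are not taken from Auron's `common/SparkTestSettings`, whose actual lookup logic may differ.

```scala
object SettingsLookupSketch {
  // Hypothetical stand-in for a shared settings contract (not Auron's real trait).
  trait TestSettings {
    def enabledSuites: Seq[String]
  }

  // Stand-in for an empty per-version stub: it enables no suites yet,
  // but it exists on the classpath so a lookup by name can find it.
  class EmptyStubSettings extends TestSettings {
    override def enabledSuites: Seq[String] = Seq.empty
  }

  // Resolve a settings class by its fully qualified name and instantiate it
  // through the no-argument constructor. Throws ClassNotFoundException when
  // the class is missing, which is the failure an empty stub avoids.
  def load(className: String): TestSettings =
    Class.forName(className)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[TestSettings]

  def main(args: Array[String]): Unit = {
    val settings = load(classOf[EmptyStubSettings].getName)
    println(s"enabled suites: ${settings.enabledSuites.size}")
  }
}
```

With no stub class present, the same `load` call would fail before any suite is considered, which is presumably why each new module ships one even when it enables nothing.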

Per-area suites (Aggregate/Sort/Parquet/Functions/Expressions) will land
in separate follow-up PRs tracked under apache#2170 through apache#2174.
…ests across all versions

Wire up a new spark-tests.yml workflow that exercises the
auron-spark-tests module for every supported Spark profile
(3.1/3.2/3.3/3.4/3.5/4.0/4.1) using the JDK+Scala combos
already validated in tpcds.yml.

Build step installs the Auron extension + spark-tests modules
with tests skipped, then a scoped `mvn test` targets only
auron-spark-tests/common + the per-version submodule so the
job does not redundantly re-run every other module's tests.
…/3.4/3.5/4.0/4.1

Mirror the three aggregate suites from spark33 (AuronDataFrameAggregateSuite,
AuronDatasetAggregatorSuite, AuronTypedImperativeAggregateSuite) and wire
them into each per-version AuronSparkTestSettings with the same exclude
list (collect functions prefix, SPARK-19471 overridden locally, SPARK-24788)
so the matrix CI exercises aggregates on every supported Spark profile.
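
The exclude list described above can be pictured with a small self-contained model. This is only a sketch under stated assumptions: `SuiteSettings`, `exclude`, `excludeByPrefix`, and `shouldRun` are hypothetical names for illustration and are not Auron's actual `AuronSparkTestSettings` API. The three entries come from the commit message; matching the two JIRA entries by prefix is an assumption made to avoid guessing full test names.

```scala
object ExcludeListSketch {
  // Hypothetical miniature of a per-suite exclusion list (not Auron's real API).
  final case class SuiteSettings(
      exact: Set[String] = Set.empty,
      prefixes: Set[String] = Set.empty) {

    // Exclude tests whose name matches exactly.
    def exclude(names: String*): SuiteSettings =
      copy(exact = exact ++ names)

    // Exclude every test whose name starts with one of the given prefixes.
    def excludeByPrefix(ps: String*): SuiteSettings =
      copy(prefixes = prefixes ++ ps)

    // A test runs only if no exclusion rule matches its name.
    def shouldRun(testName: String): Boolean =
      !exact.contains(testName) && !prefixes.exists(p => testName.startsWith(p))
  }

  // The same list would be repeated in each per-version settings class.
  val aggregateSuite: SuiteSettings =
    SuiteSettings().excludeByPrefix("collect functions", "SPARK-19471", "SPARK-24788")
}
```

Keeping the list identical across versions means a test that is skipped on one profile is skipped on all of them, so differences in the CI matrix point at version-specific behavior.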
@ShreyeshArangath changed the title from "[AURON #2170]Correctness Testing: All Spark Versions - Add Aggregate operator related tests" to "[AURON #2170] Correctness Testing: All Spark Versions - Add Aggregate operator related tests" on Apr 23, 2026
…ests-common

The common test module compiles against every Spark version we support, but
it called several APIs that were reshaped in Spark 4:

* `Column.apply(Expression)` was removed — the classic module now exposes
  it as `ExpressionUtils.column(expr)`.
* `SparkSession.internalCreateDataFrame` lives on `classic.SparkSession`
  in 4.x and requires the `isStreaming` argument.
* `DataFrame.logicalPlan` is no longer on the `api.Dataset` trait, and the
  `SQLExecution.withSQLConfPropagated` overload now takes a
  `classic.SparkSession` rather than the abstract `SparkSession`.

Wrap the two Spark-4-only calls in `@sparkver` helpers so the right
implementation is emitted under each profile, switch to
`df.queryExecution.logical` / `df.queryExecution.sparkSession` (both
public on `QueryExecution` across every supported version and returning
the concrete session type in 4.x), and pull in the
`spark-version-annotation-macros` dependency the helpers need.
…e aggregates

Three ported DataFrameAggregateSuite tests fail not because of a regression
but because they assert on Spark-specific internals that Auron's native
aggregation deliberately replaces:

* Spark 3.2 SPARK-34837 (`avg` on ANSI intervals) emits invalid Java when
  Spark's HashAggregate codegen consumes values produced by Auron's native
  project; later Spark versions avoid this path.
* Spark 3.5 SPARK-16484 negative tests assert the thrown error implements
  `SparkThrowable`, but `SparkUDAFWrapper` surfaces UDAF failures as
  `RuntimeException`.
* Spark 3.5 SPARK-43876 greps for `public class hashAgg_FastHashMap_0` in
  the WholeStageCodegen output, which never exists when the aggregate runs
  natively.

Exclude these three tests in the relevant per-version
`AuronSparkTestSettings`, matching the existing precedent for the
SPARK-19471 / SPARK-24788 cases.
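
To make the version scoping of these exclusions concrete, here is a minimal sketch that keys the three JIRA IDs from the list above by Spark version. The map layout, the object name, and `isExcluded` are hypothetical and chosen for illustration; in the PR the exclusions live in each per-version `AuronSparkTestSettings`, and the sketch assumes test names begin with their JIRA ID.

```scala
object VersionExclusionsSketch {
  // JIRA IDs taken from the commit message; the map layout is illustrative only.
  val excludedByVersion: Map[String, Set[String]] = Map(
    "3.2" -> Set("SPARK-34837"),
    "3.5" -> Set("SPARK-16484", "SPARK-43876")
  )

  // A test is excluded only on the Spark version where it is known to fail.
  // Assumes test names start with their JIRA ID.
  def isExcluded(sparkVersion: String, testName: String): Boolean =
    excludedByVersion
      .getOrElse(sparkVersion, Set.empty[String])
      .exists(id => testName.startsWith(id))
}
```

Scoping each exclusion to a single version keeps the same test active on the other profiles, where the commit message says the problematic path is not hit.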
@github-actions (bot) added the build label on Apr 24, 2026

Development

Successfully merging this pull request may close these issues.

[Correctness Testing] All Spark Versions - Add Aggregate operator related tests
