[AutoSparkUT]Add Dataset, DataFrameFunctions and ColumnExpression suites#14157
GaryShen2008 merged 3 commits into NVIDIA:main
Signed-off-by: Gary Shen <gashen@nvidia.com>
RapidsDataFrameFunctionsSuite and RapidsColumnExpressionSuite
Signed-off-by: Gary Shen <gashen@nvidia.com>
Pull request overview
This pull request enables three Spark unit test suites for GPU execution: RapidsColumnExpressionSuite, RapidsDataFrameFunctionsSuite, and RapidsDatasetSuite. It adds test suite configuration entries and implements custom test cases for scenarios where GPU behavior differs from CPU (primarily ordering differences).
Changes:
- Added configuration entries in RapidsTestSettings.scala to enable three test suites with appropriate exclusions for known issues and adjusted tests
- Created RapidsColumnExpressionSuite.scala that extends Spark's ColumnExpressionSuite for GPU execution
- Created RapidsDataFrameFunctionsSuite.scala with a custom testRapids implementation for array_intersect that handles non-deterministic ordering
- Created RapidsDatasetSuite.scala with four custom testRapids implementations that sort results to handle non-deterministic ordering
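The sorting approach described above can be sketched in plain Scala. This is a minimal illustration, not the code from this PR: `OrderInsensitiveCheck`, `sameRows`, and `sameArrays` are hypothetical names, and ordinary collections stand in for collected query results so the snippet runs without Spark.

```scala
// Illustrative only: plain Scala collections stand in for collected rows.
// These helper names are hypothetical and do not exist in spark-rapids.
object OrderInsensitiveCheck {
  // Sort both sides before comparing, so row order cannot cause a mismatch.
  // Duplicates still count, because sorting preserves multiplicity.
  def sameRows[T: Ordering](expected: Seq[T], actual: Seq[T]): Boolean =
    expected.sorted == actual.sorted

  // For array-valued results (such as an array_intersect-style output),
  // normalize the element order inside each row, then compare rows in sequence.
  def sameArrays[T: Ordering](expected: Seq[Seq[T]], actual: Seq[Seq[T]]): Boolean =
    expected.map(_.sorted) == actual.map(_.sorted)
}
```

For example, `OrderInsensitiveCheck.sameRows(Seq(1, 2, 3), Seq(3, 1, 2))` is true, while a genuinely different result such as `Seq(1, 2, 4)` still fails the check.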
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| RapidsTestSettings.scala | Added suite configurations with test exclusions for known issues and adjusted tests |
| RapidsColumnExpressionSuite.scala | New test suite extending Spark's ColumnExpressionSuite for GPU validation |
| RapidsDataFrameFunctionsSuite.scala | New test suite with custom array_intersect test handling ordering differences |
| RapidsDatasetSuite.scala | New test suite with four custom tests that sort results for consistent ordering |
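The idea of per-suite exclusions with a recorded reason can be modeled as below. This is a toy sketch under stated assumptions: `ToySettings`, `SuiteConfig`, `KnownIssue`, and `AdjustedTest` are hypothetical names invented for illustration and are not the API of RapidsTestSettings.scala; the test names and the issue link are placeholders.

```scala
// Toy model of suite settings; every name here is hypothetical and is NOT
// the spark-rapids API. It only illustrates "exclude a test, record why".
object ToySettings {
  sealed trait ExcludeReason
  // A test skipped because of a tracked bug (link is a placeholder string).
  final case class KnownIssue(link: String) extends ExcludeReason
  // A test skipped because a modified variant replaces it.
  final case class AdjustedTest(note: String) extends ExcludeReason

  final case class SuiteConfig(
      suite: String,
      excluded: Map[String, ExcludeReason] = Map.empty) {
    // Return a new config with one more excluded test and its reason.
    def exclude(testName: String, reason: ExcludeReason): SuiteConfig =
      copy(excluded = excluded + (testName -> reason))

    // A test runs unless it was explicitly excluded.
    def shouldRun(testName: String): Boolean = !excluded.contains(testName)
  }

  // Placeholder test names; the real exclusion list lives in the PR itself.
  val example: SuiteConfig =
    SuiteConfig("RapidsDatasetSuite")
      .exclude("placeholder ordering-sensitive test", AdjustedTest("replaced by a sorted variant"))
      .exclude("placeholder failing test", KnownIssue("<issue link>"))
}
```

Keeping the reason next to each exclusion makes it possible to tell a test that is skipped for a tracked bug apart from one that is replaced by an adjusted version.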
tests/src/test/spark330/scala/org/apache/spark/sql/rapids/suites/RapidsDatasetSuite.scala
tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala
Greptile Summary
This PR enables three Spark unit test suites for GPU execution:
Key Changes:
Test Results:
All failures are properly documented with GitHub issue links and appropriate exclusion reasons.
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Test as Test Execution
participant Suite as RapidsTestSuite
participant Trait as RapidsSQLTestsTrait
participant GPU as GPU Executor
participant Settings as RapidsTestSettings
Test->>Settings: Load test configuration
Settings->>Settings: enableSuite[RapidsColumnExpressionSuite]
Settings->>Settings: enableSuite[RapidsDataFrameFunctionsSuite]
Settings->>Settings: enableSuite[RapidsDatasetSuite]
Settings->>Settings: Apply exclusions for known issues
Test->>Suite: Execute test suite
Suite->>Trait: Inherit from RapidsSQLTestsTrait
alt Inherited Test
Suite->>Trait: Run original Spark test
Trait->>GPU: Execute on GPU via checkAnswer override
GPU->>Trait: Return GPU results
else testRapids Override
Suite->>Suite: Run custom testRapids implementation
Note over Suite: Sort results for deterministic ordering
Suite->>Trait: Compare results via checkAnswer
Trait->>GPU: Execute on GPU
GPU->>Trait: Return GPU results
end
Trait->>Trait: Compare expected vs actual results
Trait->>Test: Return test result (pass/fail)
Signed-off-by: Gary Shen <gashen@nvidia.com>
build
Enable 3 Spark UT suites:
Summary: