[AutoSparkUT]Enable RapidsCsvExpressionsSuite & RapidsCSVInferSchemaSuite #13911

Merged
GaryShen2008 merged 3 commits into NVIDIA:main from GaryShen2008:add-csv-expression-suite on Dec 2, 2025
@GaryShen2008 (Collaborator) commented on Dec 1, 2025

This PR adds the following suites:

  • RapidsCsvExpressionsSuite
    Rewrote the "unsupported mode" test case to change the expected exception type.
  • RapidsCSVInferSchemaSuite
  • RapidsReadSchemaSuite

Signed-off-by: Gary Shen <gashen@nvidia.com>
@GaryShen2008 changed the title from "Enable RapidsCsvExpressionsSuite" to "[AutoSparkUT]Enable RapidsCsvExpressionsSuite" on Dec 1, 2025
@GaryShen2008 added the "test" (Only impacts tests) label on Dec 1, 2025
@greptile-apps bot (Contributor) commented on Dec 1, 2025

Greptile Overview

Greptile Summary

This PR enables two CSV test suites (RapidsCsvExpressionsSuite and RapidsCSVInferSchemaSuite) and adds 9 schema evolution test suite wrappers for multiple file formats (CSV, JSON, ORC, Parquet) to run on GPU.

Key Changes

  • RapidsCSVInferSchemaSuite: Wrapper that extends Spark's CSVInferSchemaSuite to test CSV schema inference logic on GPU
  • RapidsCsvExpressionsSuite: Wrapper that extends Spark's CsvExpressionsSuite with one custom override for the "unsupported mode" test to expect SparkException instead of TestFailedException
  • RapidsReadSchemaSuite.scala: Adds 9 new test suite wrappers for schema evolution testing across CSV (with/without header), JSON, ORC (non-vectorized, vectorized, merged), and Parquet (non-vectorized, vectorized, merged) formats
  • RapidsTestSettings: Registers all new test suites and excludes the original "unsupported mode" test from RapidsCsvExpressionsSuite with proper documentation
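The registration and exclusion step in the last bullet can be illustrated with a small, self-contained model. This is only a sketch: `MiniTestSettings`, `SuiteSettings`, the `reason` string parameter, and the empty placeholder suite classes are hypothetical stand-ins written for this example, and the real `RapidsTestSettings` API may differ in names and signatures.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

// Hypothetical per-suite settings: remembers which inherited tests to skip and why.
final class SuiteSettings {
  private val excluded = mutable.LinkedHashMap.empty[String, String]

  // Record an exclusion and return this so calls can be chained.
  def exclude(testName: String, reason: String): SuiteSettings = {
    excluded.update(testName, reason)
    this
  }

  def shouldRun(testName: String): Boolean = !excluded.contains(testName)
  def reasonFor(testName: String): Option[String] = excluded.get(testName)
}

// Empty placeholder classes standing in for the real wrapper suites.
class RapidsCSVInferSchemaSuite
class RapidsCsvExpressionsSuite

// Hypothetical registry keyed by suite class name (not the real RapidsTestSettings).
object MiniTestSettings {
  private val suites = mutable.LinkedHashMap.empty[String, SuiteSettings]

  private def key[T](implicit ct: ClassTag[T]): String = ct.runtimeClass.getName

  // Register a suite (or fetch its existing settings).
  def enableSuite[T: ClassTag]: SuiteSettings =
    suites.getOrElseUpdate(key[T], new SuiteSettings)

  // A test runs only if its suite is enabled and the test is not excluded.
  def shouldRun[T: ClassTag](testName: String): Boolean =
    suites.get(key[T]).exists(_.shouldRun(testName))

  enableSuite[RapidsCSVInferSchemaSuite]
  enableSuite[RapidsCsvExpressionsSuite]
    .exclude("unsupported mode", "replaced by an override expecting a different exception type")
}
```

In this model, `MiniTestSettings.shouldRun[RapidsCsvExpressionsSuite]("unsupported mode")` is false while every other test name in an enabled suite is allowed, and storing the reason next to the exclusion keeps the documentation attached to the skip.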

Test Pattern

All suites follow the established pattern of extending Spark's base test suites and mixing in RapidsTestsTrait or RapidsSQLTestsTrait to run tests on GPU. The only custom test override is in RapidsCsvExpressionsSuite where the exception type expectation differs between CPU and GPU execution.
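The extend-and-mix-in pattern described above can be sketched without Spark or ScalaTest on the classpath. Everything here is an assumption made for illustration: `MiniSuite`, `UpstreamCsvSuite`, `GpuTestsTrait`, `GpuCsvSuite`, the `excludedTests` member, and the `"RAPIDS - "` name prefix are hypothetical stand-ins, not the actual `CsvExpressionsSuite` or `RapidsTestsTrait` code.

```scala
import scala.collection.mutable
import scala.util.Try

// Simplified stand-in for a ScalaTest-style suite: `test` registers a named body.
abstract class MiniSuite {
  private val tests = mutable.LinkedHashMap.empty[String, () => Unit]

  def test(name: String)(body: => Unit): Unit = {
    tests.update(name, () => body)
  }

  def testNames: List[String] = tests.keys.toList
  def run(name: String): Try[Unit] = Try(tests(name)())
}

// Stand-in for an upstream Spark suite whose tests are inherited unchanged.
class UpstreamCsvSuite extends MiniSuite {
  test("basic parse") { assert("1,2".split(",").length == 2) }
  test("unsupported mode") { sys.error("asserts the CPU-side exception type") }
}

// Stand-in for the mixed-in trait: filters excluded tests and adds `testRapids`.
trait GpuTestsTrait extends MiniSuite {
  // Declared as a def so it is safe to call while superclass constructors run.
  protected def excludedTests: Set[String]

  // Inherited registrations pass through here; excluded names are dropped.
  override def test(name: String)(body: => Unit): Unit = {
    if (!excludedTests.contains(name)) super.test(name)(body)
  }

  // Replacement tests bypass the filter and get a distinguishing prefix (assumed).
  def testRapids(name: String)(body: => Unit): Unit = {
    super.test("RAPIDS - " + name)(body)
  }
}

// The wrapper: inherit everything, skip one test, and supply a replacement.
class GpuCsvSuite extends UpstreamCsvSuite with GpuTestsTrait {
  protected def excludedTests: Set[String] = Set("unsupported mode")

  testRapids("unsupported mode") {
    // Expect a different exception type than the upstream test does.
    val result = Try(throw new IllegalArgumentException("mode not supported"))
    assert(result.failed.toOption.exists(_.isInstanceOf[IllegalArgumentException]))
  }
}
```

Because the upstream constructor runs first and dispatches `test` to the trait, the excluded upstream test is never registered, and the wrapper body then adds only the replacement. In this sketch the exclusion list lives on the suite itself to keep the example short; the PR keeps it in `RapidsTestSettings` instead.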

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The PR follows established patterns in the codebase for adding GPU test suite wrappers. All changes are test-only additions with proper documentation, correct trait mixins, and appropriate test exclusions. The single test override in RapidsCsvExpressionsSuite has a clear explanation and follows the same pattern seen in other suites like RapidsRandomSuite. No production code is modified.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/suites/RapidsCSVInferSchemaSuite.scala | 5/5 | Added new test suite wrapper for CSV schema inference that extends Spark's CSVInferSchemaSuite with GPU testing via RapidsTestsTrait |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/suites/RapidsCsvExpressionsSuite.scala | 5/5 | Added CSV expressions test suite with custom override of "unsupported mode" test to expect SparkException instead of test failure |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/suites/RapidsReadSchemaSuite.scala | 5/5 | Added multiple schema evolution test suite wrappers for CSV, JSON, ORC, and Parquet formats with vectorized and merged reader variants |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala | 5/5 | Registered new test suites and excluded the original "unsupported mode" test from RapidsCsvExpressionsSuite with explanation |

Sequence Diagram

sequenceDiagram
    participant TestRunner as Test Runner
    participant RapidsSettings as RapidsTestSettings
    participant CSVInferSuite as RapidsCSVInferSchemaSuite
    participant CsvExprSuite as RapidsCsvExpressionsSuite
    participant ReadSchemaSuite as RapidsReadSchemaSuite
    participant RapidsTrait as RapidsTestsTrait/RapidsSQLTestsTrait
    participant SparkSuite as Spark Base Suites
    participant GPU as GPU Execution

    TestRunner->>RapidsSettings: Load test configuration
    RapidsSettings->>RapidsSettings: enableSuite[RapidsCSVInferSchemaSuite]
    RapidsSettings->>RapidsSettings: enableSuite[RapidsCsvExpressionsSuite]
    RapidsSettings->>RapidsSettings: enableSuite[RapidsCSVReadSchemaSuite]
    RapidsSettings->>RapidsSettings: exclude("unsupported mode") for CsvExpressions

    TestRunner->>CSVInferSuite: Run inherited tests
    CSVInferSuite->>SparkSuite: Extend CSVInferSchemaSuite
    CSVInferSuite->>RapidsTrait: Mix in RapidsTestsTrait
    RapidsTrait->>GPU: Execute tests on GPU
    GPU-->>CSVInferSuite: Test results

    TestRunner->>CsvExprSuite: Run inherited + custom tests
    CsvExprSuite->>SparkSuite: Extend CsvExpressionsSuite
    CsvExprSuite->>RapidsTrait: Mix in RapidsTestsTrait
    CsvExprSuite->>CsvExprSuite: testRapids("unsupported mode")
    Note over CsvExprSuite: Override test to expect SparkException
    RapidsTrait->>GPU: Execute tests on GPU
    GPU-->>CsvExprSuite: Test results with SparkException

    TestRunner->>ReadSchemaSuite: Run schema evolution tests
    ReadSchemaSuite->>SparkSuite: Extend multiple suite variants
    Note over ReadSchemaSuite: CSV, JSON, ORC, Parquet<br/>with vectorized/merged variants
    ReadSchemaSuite->>RapidsTrait: Mix in RapidsSQLTestsTrait
    RapidsTrait->>GPU: Execute tests on GPU
    GPU-->>ReadSchemaSuite: Test results

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

Signed-off-by: Gary Shen <gashen@nvidia.com>
@GaryShen2008 changed the title from "[AutoSparkUT]Enable RapidsCsvExpressionsSuite" to "[AutoSparkUT]Enable RapidsCsvExpressionsSuite & RapidsCSVInferSchemaSuite" on Dec 1, 2025

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

Signed-off-by: Gary Shen <gashen@nvidia.com>

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

@GaryShen2008 (Collaborator, Author) commented:

build

@wjxiz1992 (Collaborator) left a comment

LGTM!

@GaryShen2008 GaryShen2008 merged commit 3c38435 into NVIDIA:main Dec 2, 2025
64 checks passed
Labels

test Only impacts tests

3 participants