[AutoSparkUT] Migrate DataFramePivotSuite tests to RAPIDS #13761

Merged
wjxiz1992 merged 2 commits into NVIDIA:main from wjxiz1992:dataframe-pivot-suite-migration
Nov 14, 2025

@wjxiz1992
Collaborator

Description

This PR migrates the DataFramePivotSuite test suite from Apache Spark to RAPIDS Accelerator for Apache Spark.

Test Results

Perfect Migration - 100% Pass Rate!

  • Total Tests: 31
  • Passed: 31 ✅
  • Failed: 0
  • Excluded: 0
  • Pass Rate: 100% 🌟

All tests pass on GPU with no modifications or exclusions.

Test Coverage

The migrated tests cover:

  • Basic pivot operations (courses, year)
  • Multiple aggregations with pivot
  • Type conversions with pivot
  • Optimized pivot plans
  • PivotFirst supported datatypes
  • Nested columns pivoting
  • Array column pivoting
  • Special scenarios (null values, constants, timestamps)
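
For readers unfamiliar with the operation under test, here is a minimal sketch of what a "group by year, pivot on course, sum earnings" query computes. It uses plain Scala collections rather than Spark or the RAPIDS plugin, and the `Sale` type, the `pivotSum` helper, and the sample rows are all hypothetical illustrations, not code or data taken from the suite.

```scala
// Illustrative only: plain Scala collections, no Spark or RAPIDS dependency.
// Sale, pivotSum and the sample rows are hypothetical.
object PivotSketch {
  final case class Sale(course: String, year: Int, earnings: Double)

  val sales: Seq[Sale] = Seq(
    Sale("dotNET", 2012, 10000.0),
    Sale("Java", 2012, 20000.0),
    Sale("dotNET", 2012, 5000.0),
    Sale("dotNET", 2013, 48000.0),
    Sale("Java", 2013, 30000.0)
  )

  // Group by year, pivot on course, aggregate with sum.
  // A (year, course) pair with no rows becomes None, mirroring a null cell.
  def pivotSum(rows: Seq[Sale], pivotValues: Seq[String]): Map[Int, Seq[Option[Double]]] =
    rows.groupBy(_.year).map { case (year, group) =>
      val totals: Map[String, Double] =
        group.groupBy(_.course).map { case (c, rs) => c -> rs.map(_.earnings).sum }
      year -> pivotValues.map(totals.get)
    }

  def main(args: Array[String]): Unit = {
    val pivotValues = Seq("dotNET", "Java")
    val result = pivotSum(sales, pivotValues)
    // One output row per year, one output column per pivot value.
    println(("year" +: pivotValues).mkString("\t"))
    result.toSeq.sortBy(_._1).foreach { case (year, cells) =>
      println((year.toString +: cells.map(_.fold("null")(_.toString))).mkString("\t"))
    }
  }
}
```

Each distinct pivot value becomes an output column, which is why several of the listed tests focus on explicit value lists, nulls, and the data types allowed in the pivot column.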

Implementation Details

  • Created RapidsDataFramePivotSuite extending DataFramePivotSuite with RapidsSQLTestsBaseTrait
  • Registered test suite in RapidsTestSettings.scala
  • No exclusions needed - perfect GPU compatibility
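
The migration pattern described above can be sketched as follows. This is a self-contained model, not the actual change: the classes named `DataFramePivotSuite`, `RapidsSQLTestsBaseTrait`, and `RapidsTestSettings` are stand-ins with hypothetical bodies, `SessionConfig` and the string-based `enableSuite` signature are assumptions made so the sketch compiles on its own, and the only real-world value used is the `spark.plugins=com.nvidia.spark.SQLPlugin` setting that loads the RAPIDS Accelerator.

```scala
// Stand-ins only: the real classes live in Apache Spark and spark-rapids.
// Every body and signature below is hypothetical.
import scala.collection.mutable

object MigrationSketch {
  // Hypothetical shared base exposing the session configuration.
  trait SessionConfig {
    def sessionConf: Map[String, String] = Map.empty
  }

  // Stand-in for the upstream Spark suite: it owns the test cases.
  class DataFramePivotSuite extends SessionConfig {
    def testNames: Seq[String] = Seq("pivot courses", "pivot year")
  }

  // Stand-in for the RAPIDS trait: it only changes how the session is configured.
  trait RapidsSQLTestsBaseTrait extends SessionConfig {
    override def sessionConf: Map[String, String] =
      super.sessionConf + ("spark.plugins" -> "com.nvidia.spark.SQLPlugin")
  }

  // The migrated suite adds no test code: it inherits the tests and mixes in the config.
  class RapidsDataFramePivotSuite extends DataFramePivotSuite with RapidsSQLTestsBaseTrait

  // Stand-in registry: maps an enabled suite to the set of tests it excludes.
  object RapidsTestSettings {
    private val enabled = mutable.LinkedHashMap.empty[String, Set[String]]
    def enableSuite(name: String, excluded: Set[String] = Set.empty): Unit =
      enabled.update(name, excluded)
    def excludedTests(name: String): Option[Set[String]] = enabled.get(name)
  }

  def main(args: Array[String]): Unit = {
    val suite = new RapidsDataFramePivotSuite
    RapidsTestSettings.enableSuite("RapidsDataFramePivotSuite") // no exclusions
    println(s"tests: ${suite.testNames.mkString(", ")}")
    println(s"conf:  ${suite.sessionConf}")
  }
}
```

Because the subclass body is empty, any upstream test that fails on GPU has to be handled through the registry's exclusion list rather than by editing test code; this PR reports that the list stays empty.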

Related Issues

Part of #11297 (Spark unit test migration effort)

Testing

mvn test -pl tests -Dbuildver=330 \
  -DwildcardSuites="org.apache.spark.sql.rapids.suites.RapidsDataFramePivotSuite" \
  -s jenkins/settings.xml -P mirror-apache-to-urm

Result: All tests passed.


Signed-off-by: Allen Xu <wjxiz1992@gmail.com>

@wjxiz1992 wjxiz1992 requested a review from a team as a code owner November 12, 2025 05:59
@wjxiz1992 wjxiz1992 changed the base branch from branch-25.12 to main November 12, 2025 05:59
- Created RapidsDataFramePivotSuite extending DataFramePivotSuite with RapidsSQLTestsBaseTrait
- Registered test suite in RapidsTestSettings.scala
- All 31 tests passing successfully with no exclusions needed:
  * pivot courses
  * pivot year
  * pivot courses with multiple aggregations
  * pivot year with string values (cast)
  * pivot year with int values
  * pivot courses with no values
  * pivot year with no values
  * pivot max values enforced
  * pivot with UnresolvedFunction
  * optimized pivot planned
  * optimized pivot courses with literals
  * optimized pivot year with literals
  * optimized pivot year with string values (cast)
  * optimized pivot DecimalType
  * PivotFirst supported datatypes
  * optimized pivot with multiple aggregations
  * pivot with datatype not supported by PivotFirst
  * pivot with datatype not supported by PivotFirst 2
  * pivot preserves aliases if given
  * pivot with column definition in groupby
  * pivot with null should not throw NPE
  * pivot with null and aggregate type not supported by PivotFirst returns correct result
  * pivot with timestamp and count should not print internal representation
  * SPARK-24722: pivoting nested columns
  * SPARK-24722: references to multiple columns in the pivot column
  * SPARK-24722: pivoting by a constant
  * SPARK-24722: aggregate as the pivot column
  * pivoting column list with values
  * SPARK-26403: pivoting by array column
  * SPARK-35480: percentile_approx should work with pivot
  * SPARK-38133: Grouping by TIMESTAMP_NTZ should not corrupt results
- Perfect compatibility with GPU execution - no issues found

This is an excellent test suite with 100% pass rate!

Signed-off-by: Allen Xu <wjxiz1992@gmail.com>
@wjxiz1992 wjxiz1992 force-pushed the dataframe-pivot-suite-migration branch from 80000e5 to a36fe8c Compare November 12, 2025 06:06
@greptile-apps
Contributor

greptile-apps bot commented Nov 12, 2025

Greptile Overview

Greptile Summary

Migrates Apache Spark's DataFramePivotSuite test suite to run on GPU via the RAPIDS Accelerator plugin.

  • Created RapidsDataFramePivotSuite extending DataFramePivotSuite with RapidsSQLTestsBaseTrait
  • Registered the suite in RapidsTestSettings.scala with proper import placement
  • All 31 tests pass on GPU without exclusions
  • Follows established migration pattern consistently

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Perfect migration following established patterns with 100% test pass rate and no code modifications needed
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/suites/RapidsDataFramePivotSuite.scala | 5/5 | New test suite that extends Apache Spark's DataFramePivotSuite with RapidsSQLTestsBaseTrait for GPU testing |
| tests/src/test/spark330/scala/org/apache/spark/sql/rapids/utils/RapidsTestSettings.scala | 5/5 | Added import and registration for RapidsDataFramePivotSuite in alphabetical order |

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Suite as RapidsDataFramePivotSuite
    participant Base as DataFramePivotSuite
    participant Trait as RapidsSQLTestsBaseTrait
    participant GPU as RAPIDS GPU Engine
    
    Dev->>Suite: Run test suite
    Suite->>Base: Inherit pivot tests
    Suite->>Trait: Mix in GPU configuration
    Trait->>Trait: Configure Spark with RAPIDS plugin
    Trait->>Trait: Enable GPU acceleration settings
    Base->>Suite: Execute inherited tests
    Suite->>GPU: Run tests on GPU
    GPU->>Suite: Return results
    Suite->>Dev: All 31 tests pass

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

@wjxiz1992
Collaborator Author

build

@wjxiz1992 wjxiz1992 merged commit 4415658 into NVIDIA:main Nov 14, 2025
60 checks passed
@sameerz sameerz added the test Only impacts tests label Nov 14, 2025