Skip to content

Validate backend compatibility for 3D apply() inputs#2648

Merged
brendancol merged 5 commits into
mainfrom
issue-2639
May 29, 2026
Merged

Validate backend compatibility for 3D apply() inputs#2648
brendancol merged 5 commits into
mainfrom
issue-2639

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #2639

zonal.apply() only called validate_arrays(zones, values) for 2D values. For 3D values it checked shape but never backend, so mixed dask/numpy inputs slipped through:

  • numpy zones + dask values: the dask backend crashed with AttributeError: 'numpy.ndarray' object has no attribute 'chunks'.
  • dask zones + numpy values: the numpy backend silently returned eager numpy output.

This adds an explicit backend check for the 3D path. validate_arrays can't be reused here because it requires equal full shapes (a 2D zones never equals a 3D values), so the check compares _classify_backend for each input and raises the same backend-mismatch error on a mismatch.

Backend coverage: no backend behavior changed for matched inputs. The change only adds an early error for mismatched-backend inputs across all four backends (numpy / cupy / dask+numpy / dask+cupy).

Test plan:

  • New test_apply_3d_mixed_backend_raises covers both mismatch directions and asserts a clear ValueError.
  • Verified the new test reproduces the original AttributeError without the fix.
  • pytest xrspatial/tests/test_zonal.py -k apply passes (14 passed).

Dedupe duplicate module rows (last-write-wins by last_inspected) and
collapse multi-line notes to single physical lines. The notes had
embedded newlines, which the merge=union .gitattributes strategy splits
record-by-record, corrupting the file into a 156-column phantom row on
parallel-agent appends. One line per record keeps union merges safe.
apply() only ran validate_arrays for 2D values. For 3D values it checked
shape but never backend, so mixed dask/numpy zones and values either
crashed in the dask backend with AttributeError on zones.chunks or
silently returned eager numpy output. Add an explicit backend check for
the 3D path that fails early with the same backend-mismatch message
validate_arrays uses.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Validate backend compatibility for 3D apply() inputs

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • xrspatial/zonal.py:1929-1934: the 3D backend check repeats the message string from validate_arrays (utils.py:377-382), but with "zones"/"values" labels instead of positional indices. That trade is fine here: the labels say more to the caller than "array 0"/"array 1", and validate_arrays can't be reused directly because it requires equal full shapes. A one-line note that the wording mirrors validate_arrays on purpose would keep the two greppable together. Not blocking.

What looks good

  • Minimal fix aimed at the exact gap: 3D values skipped backend validation while 2D went through validate_arrays.
  • The comment explains the non-obvious part, that validate_arrays can't be reused because a 2D zones never equals a 3D values.
  • Test covers both mismatch directions (numpy zones + dask values, dask zones + numpy values), and the PR verified it reproduces the original AttributeError without the fix.
  • _classify_backend distinguishes cupy from numpy too, so the same check catches cupy/numpy mismatches for free.

Checklist

  • Algorithm matches reference/paper: N/A, validation fix
  • Backends produce consistent results: unchanged for matched inputs; mismatched inputs now error early
  • NaN handling: unchanged
  • Edge cases covered by tests: both mismatch directions
  • Dask chunk boundaries: 3D dask path already rechunks layers to zones.chunks internally
  • No premature materialization or copies: none added
  • Benchmark: not needed, validation-only
  • README feature matrix: not applicable, no new function or backend change
  • Docstrings: apply() docstring unchanged and still accurate

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after nit fix)

Re-reviewed after commit 725dad9, which addresses the only finding from the first pass.

Blockers

None.

Suggestions

None.

Nits

None remaining. The earlier nit (note that the backend error wording mirrors validate_arrays) is now addressed by the comment at xrspatial/zonal.py:1933-1934.

Summary

The diff is the same focused validation fix plus a one-line comment. Tests still pass (14 apply tests). No new findings.

@brendancol brendancol merged commit 0ec33b1 into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

zonal.apply() skips backend validation for 3D values, letting mixed dask/numpy inputs through

1 participant