Keep standardize piecewise/categorical on the device for cupy inputs#3159
Merged
Conversation
6 tasks
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Keep standardize piecewise/categorical on the device for cupy inputs
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/mcda/standardize.py:272(_interp_block): the breakpoint/value tables are converted withxp_b.asarrayinside the chunk function, so cupy chunks re-upload the (tiny) tables on every block. Converting once outside the closure, based on the dask array's_meta, removes the per-chunk transfer. Same applies tokeys/valsin_apply_mapping. Minor, since the tables are a few dozen bytes, but it is one line each.
Nits (optional improvements)
-
xrspatial/tests/test_mcda.py(TestStandardizeCupy): therawfixture is 2x4 with (1, 2) chunks, which exercises multiple blocks but not ragged chunking. Not critical for elementwise ops.
What looks good
- The fix picks the array module per block instead of guessing, so dask+numpy keeps its existing behavior byte for byte while dask+cupy stops crashing.
- All three failure modes from #3151 items 1-2 have device-residency assertions (
isinstance(..., cupy.ndarray)), not just value parity, so a silent fallback to host arrays would fail the test.
Checklist
- Algorithm matches reference (np.interp/cupy.interp parity asserted)
- All implemented backends produce consistent results (numpy path untouched; GPU paths verified on a CUDA host)
- NaN handling is correct (NaN cell in the fixture, equal_nan parity)
- Edge cases are covered by tests (unmapped categorical values stay NaN via parity with the numpy reference)
- Dask chunk boundaries handled correctly (elementwise per block)
- No premature materialization or unnecessary copies (apart from the per-chunk table upload noted above)
- Benchmark exists or is not needed (no mcda benchmarks exist; not added here)
- README feature matrix: standardize GPU cells already marked advanced; no tier change
- Docstrings present and accurate
… ragged GPU chunks (#3151)
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review after the second commit:
- Per-chunk table upload: fixed.
_piecewiseand_categoricalnow resolve the array module once fromdata._metaand convert the lookup tables a single time outside the chunk closure. - Ragged-chunk nit: the GPU fixture now chunks (1, 3) on the 2x4 grid, giving an uneven 3+1 split.
No remaining findings. 178 tests pass locally, including the three GPU-gated cases on a CUDA host.
brendancol
added a commit
that referenced
this pull request
Jun 10, 2026
…e-mcda-2026-06-10-01 Conflicts: - .claude/sweep-performance-state.csv: took main's newer geotiff/rasterize/ reproject rows (LF endings) and re-inserted this branch's mcda row. Semantic fixups in xrspatial/tests/test_mcda.py (auto-merged textually): - Removed strict xfail markers that now XPASS: the three owa backend tests fixed by this PR, plus standardize cupy piecewise, standardize dask+cupy categorical (#3159) and constrain attrs (#3154), whose fixes merged before the #3156 xfails landed.
brendancol
added a commit
that referenced
this pull request
Jun 10, 2026
…cda-2026-06-10 Conflicts: .claude/sweep-accuracy-state.csv, xrspatial/mcda/sensitivity.py, xrspatial/mcda/standardize.py, xrspatial/tests/test_mcda.py. sensitivity.py taken from main (#3160 chunk-bounded monte_carlo already handles cupy). standardize.py taken from main (#3159) plus this PR's ascontiguousarray for cupy.interp, which #3159 lacked. test_mcda.py keeps main's #3156 coverage with the strict xfail markers removed for the paths this PR fixes (owa cupy/dask, standardize piecewise/categorical GPU, monte-carlo sensitivity GPU) and for constrain attrs (fixed by #3154); the de-xfail'd monte_carlo assertions are now NaN-aware. Sweep CSV is main's rows plus this PR's mcda row.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes the
standardize()part of #3151 (items 1 and 2). Themonte_carlosensitivity crash from that issue is handled separately together with #3152, since both need the same_monte_carlorework._piecewisehanded numpy lookup tables tocupy.interp, which rejects them (NotImplementedError). The tables are now converted with the data's own array module (a no-op for numpy), so plain-cupy piecewise works and stays on the device._piecewiseand_categoricaldask chunk functions callednp.asarray(block)with a comment claiming it handles cupy chunks. cupy refuses implicit conversion, so dask+cupy raised TypeError. The chunk functions now pick the block's own array module; cupy blocks are interpolated/mapped on the GPU and the result keeps its cupy meta.Backend coverage: numpy and dask+numpy unchanged (same code path,
xpis numpy); cupy piecewise and dask+cupy piecewise/categorical go from crashing to working, verified on a CUDA host with results staying in device memory end to end.Test plan:
pytest xrspatial/tests/test_mcda.py(178 passed)TestStandardizeCupytests gated oncuda_and_cupy_available: cupy piecewise, dask+cupy piecewise, dask+cupy categorical, each asserting parity with the numpy result and that the output is still a cupy array