Fix owa() dask path and batch wpm() validation into one scheduler pass#3158
Merged
Conversation
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Fix owa() dask path and batch wpm() validation into one scheduler pass
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/tests/test_mcda.py(TestOWADask): the dask tests use even 10x10 chunks only. A ragged-chunk case (e.g.chunk({"y": 7, "x": 9})on a 20x20 grid) would catch block-alignment mistakes in the rechunk+map_blocks sort, which is exactly the code this PR adds.
Nits (optional improvements)
-
xrspatial/mcda/combine.py:387(_sort_block_descending): a one-line note that NaN ordering matches the eager-np.sort(-data)path (NaNs sort to the same end) would save the next reader from re-deriving it.
What looks good
- The per-block sort is exact because the criterion axis is rechunked into a single chunk first, and
meta=data._metakeeps dask+cupy blocks typed correctly. - The scheduler-pass regression test pins the actual behavior (one
dask.computecall with all three reductions) instead of timing anything. - The order-weights device copy fixes a second crash (plain-cupy owa) that the issue did not even cover, with parity tests on both GPU backends.
Checklist
- Algorithm matches reference (descending sort + order weights, parity vs eager path)
- All implemented backends produce consistent results (numpy/dask verified in tests; cupy/dask+cupy gated tests ran on a CUDA host)
- NaN handling is correct (NaN fixture in the dask parity test)
- Edge cases are covered by tests (non-positive rejection on dask, uniform order weights == wlc)
- Dask chunk boundaries handled correctly (axis fully contained per block)
- No premature materialization (validation now a single batched compute; sort path stays lazy)
- Benchmark exists or is not needed (no mcda benchmarks exist; not added here)
- README feature matrix: no tier changes needed (owa GPU cells already marked experimental)
- Docstrings present and accurate (helper docstrings added)
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review after b-commit:
- Ragged-chunk suggestion: fixed.
test_owa_dask_matches_numpyis now parametrized over even (10x10) and ragged (7x9) chunkings; both pass. - NaN-ordering nit: fixed with a note in
_sort_block_descendingexplaining parity with the eager-np.sort(-data)expression.
No remaining findings. 183 tests pass locally, including the GPU-gated cases on a CUDA host.
…e-mcda-2026-06-10-01 Conflicts: - .claude/sweep-performance-state.csv: took main's newer geotiff/rasterize/ reproject rows (LF endings) and re-inserted this branch's mcda row. Semantic fixups in xrspatial/tests/test_mcda.py (auto-merged textually): - Removed strict xfail markers that now XPASS: the three owa backend tests fixed by this PR, plus standardize cupy piecewise, standardize dask+cupy categorical (#3159) and constrain attrs (#3154), whose fixes merged before the #3156 xfails landed.
Main's #3157 landed the same owa dask-sort and cupy order-weights fixes this branch carried, so the merge takes main's combine.py for those hunks and drops this branch's _sort_block_descending helper. The wpm single-scheduler-pass validation (#3150) and its tests are unchanged. Test-comment conflicts resolved to main's side since the owa fix credit belongs to #3157.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3150
_sort_descendingcalledda.sort, which does not exist in dask, soowa()raised AttributeError on any dask-backed Dataset. It now rechunks the criterion axis to a single chunk and sorts each block withnp.sortviamap_blocks(exact, since the axis lives in one chunk; memory stays bounded by block size).metais forwarded so dask+cupy blocks keep their cupy meta.owa()now copies the order weights to the device when the data is cupy-backed (plain or dask), so both cupy paths work instead of raising TypeError._check_wpm_positiveranfloat(da.nanmin(arr).compute())once per criterion, i.e. one serialized scheduler pass per layer at call time. The nanmin reductions are now collected and run through a singledask.compute()call.Backend coverage: numpy unchanged; dask+numpy fixed (owa) and faster validation (wpm); cupy and dask+cupy owa now work and were verified on a CUDA host.
Test plan:
pytest xrspatial/tests/test_mcda.py(182 passed, includes 7 new tests)cuda_and_cupy_available): owa cupy and dask+cupy parity, run on a CUDA host