reproject: preserve integer dtype in dask+cupy fast path (#2505) #2509
Merged
_reproject_dask_cupy's eager fast path force-allocated the output as cp.float64 regardless of source dtype, so int16/uint8/etc. inputs were silently promoted to float64. The other three backends (numpy, cupy, dask+numpy) and the chunked dask+cupy fallback already preserve the source integer dtype. Compute out_dtype the same way _reproject_dask does (integer source -> source dtype, float source -> float64), allocate the result buffer with out_dtype, and clamp+cast each chunk before placement -- mirroring the per-band cast in _reproject_chunk_cupy.
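The dtype rule and the clamp+cast step described above can be sketched roughly as follows. This is a minimal NumPy stand-in, not the xrspatial code: the helper names `pick_out_dtype` and `clamp_and_cast` are hypothetical, and rounding to the nearest integer and mapping NaN to nodata before the cast are assumptions made for this illustration.

```python
import numpy as np


def pick_out_dtype(src_dtype):
    """Integer sources keep their dtype; any other source becomes float64."""
    src_dtype = np.dtype(src_dtype)
    if np.issubdtype(src_dtype, np.integer):
        return src_dtype
    return np.dtype(np.float64)


def clamp_and_cast(chunk, out_dtype, nodata):
    """Cast a float chunk to out_dtype, clipping to the integer range first."""
    out_dtype = np.dtype(out_dtype)
    if not np.issubdtype(out_dtype, np.integer):
        return chunk.astype(out_dtype, copy=False)
    info = np.iinfo(out_dtype)
    # Assumption: pixels with no source data arrive as NaN and become nodata.
    filled = np.where(np.isnan(chunk), nodata, chunk)
    # Assumption: round to nearest, then clip so the cast cannot wrap around.
    return np.clip(np.rint(filled), info.min, info.max).astype(out_dtype)


chunk = np.array([-5.0, 300.7, np.nan, 12.4])
out = clamp_and_cast(chunk, pick_out_dtype(np.uint8), nodata=0)
# out holds the values 0, 255, 0, 12 with dtype uint8
```

Clipping before the cast matters because a plain `astype` on out-of-range floats does not saturate, so a value such as 300.7 would not reliably end up as 255 in a uint8 buffer.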
New TestDaskCupyDtypeParity class mirrors TestDaskDtypeParity. The six new tests cover int8 / int16 / uint8 / uint16 / float32 inputs plus a direct dask+numpy vs dask+cupy cross-backend check. Each test fails on pre-fix code because the eager fast path in _reproject_dask_cupy always allocated float64. The tests pass on the fixed code where out_dtype tracks the source dtype. Gated on HAS_DASK and HAS_CUPY since dask+cupy is required to hit the specific code path.
brendancol
commented
May 27, 2026
brendancol
left a comment
Contributor
Author
PR Review: reproject: preserve integer dtype in dask+cupy fast path (#2505)
Blockers
None.
Suggestions
None.
Nits
- The new TestDaskCupyDtypeParity tests check result.dtype only. The sibling TestDaskDtypeParity checks both result.data.dtype and result.compute().dtype. Since _reproject_dask_cupy returns an eager cupy ndarray (not a dask array), result.compute() is a no-op, so the two forms are equivalent here. Fine as-is; add result.data.dtype if you want exact symmetry with the sibling class.
What looks good
- Same pattern as _reproject_dask (lines 1839-1843) and _reproject_chunk_cupy (lines 609-611, 633). No new patterns introduced.
- Clamp+cast on chunk_data runs at the per-chunk write level, so chunks skipped via continue retain the pre-filled out_dtype nodata.
- The nodata value reaching _reproject_dask_cupy is already integer-compatible when the source is integer (resolved by _detect_nodata(..., dtype=raster.dtype) at line 851), so the cp.full(out_shape, nodata, dtype=int_dtype) allocation can't hit a NaN→int conversion.
- New tests fail on pre-fix code and pass on fixed code (verified by temporarily reverting).
- Six tests covering int8 / int16 / uint8 / uint16 / float32 plus a direct dask+numpy vs dask+cupy parity check.
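The point about skipped chunks can be illustrated with a small sketch. It uses NumPy in place of cupy so it runs without a GPU, and the `assemble` function and its chunk layout are hypothetical, chosen only to show why a buffer pre-filled with nodata in out_dtype stays consistent when a chunk is skipped. NaN handling is left out for brevity.

```python
import numpy as np


def assemble(out_shape, chunks, out_dtype, nodata):
    """Place per-chunk results into a buffer pre-filled with nodata.

    `chunks` is a list of ((row_slice, col_slice), data) pairs, where data is
    a float64 array or None when the chunk is skipped.
    """
    # Pre-fill in the final dtype, so untouched regions are already correct.
    result = np.full(out_shape, nodata, dtype=out_dtype)
    is_int = np.issubdtype(result.dtype, np.integer)
    for (rows, cols), chunk_data in chunks:
        if chunk_data is None:
            continue  # skipped chunk keeps the pre-filled nodata value
        if is_int:
            # Clamp at the per-chunk write, right before placement.
            info = np.iinfo(result.dtype)
            chunk_data = np.clip(np.rint(chunk_data), info.min, info.max)
        result[rows, cols] = chunk_data.astype(result.dtype)
    return result


chunks = [
    ((slice(0, 2), slice(0, 2)), np.array([[1.2, 2.6], [40000.0, -40000.0]])),
    ((slice(0, 2), slice(2, 4)), None),  # no source coverage, skipped
]
out = assemble((2, 4), chunks, np.int16, nodata=-9999)
# Left half: [[1, 3], [32767, -32768]]; right half: -9999 everywhere; dtype int16
```

Because the cast happens only where a chunk is written, the skipped region never passes through a float intermediate and needs no separate conversion step.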
Checklist
- Pattern matches _reproject_dask and _reproject_chunk_cupy
- All backends now produce consistent dtypes
- NaN handling unchanged
- Edge cases: signed, unsigned, float, skipped chunks
- Dask chunk boundaries unaffected
- No premature materialization
- Benchmark not needed (correctness fix)
- README feature matrix unaffected
- Inline comments reference #2505
Mirror the assertion pattern used by TestDaskDtypeParity so the new TestDaskCupyDtypeParity class checks both result.dtype and result.data.dtype. The two are equivalent here because the eager fast path returns a plain cupy ndarray, but the symmetry helps future readers.
brendancol
commented
May 27, 2026
brendancol
left a comment
Contributor
Author
Follow-up review (#2505)
Addressed the nit from the previous review: TestDaskCupyDtypeParity now asserts both result.dtype and result.data.dtype, matching the symmetry used in TestDaskDtypeParity.
No outstanding findings. All 6 new tests still pass.
Disposition of original review:
- Nit (assertion symmetry): fixed in bd45b82.
Closes #2505.
Summary
- _reproject_dask_cupy eager fast path force-allocated the output buffer as cp.float64 regardless of source dtype, so int16/uint8/etc. inputs were silently promoted to float64 while every other backend kept the source dtype.
- Compute out_dtype the same way _reproject_dask does (integer source keeps its dtype, float source stays float64), allocate the result with out_dtype, and clamp+cast each chunk before placement -- mirrors the per-band cast in _reproject_chunk_cupy.
- New TestDaskCupyDtypeParity class with six tests covering int8 / int16 / uint8 / uint16 / float32 plus a direct dask+numpy vs dask+cupy cross-backend dtype check.
Backend coverage
Test plan
- pytest xrspatial/tests/test_reproject.py::TestDaskCupyDtypeParity -- 6 passing
- pytest xrspatial/tests/test_reproject.py -- 351 passing (345 baseline + 6 new)