Stop downcasting mean() GPU paths to float32 (#3222) by brendancol · Pull Request #3229 · xarray-contrib/xarray-spatial

brendancol · 2026-06-10T17:43:15Z

_mean_cupy and _mean_dask_cupy cast their input to float32, so a float64 raster came back float32 on GPU while the CPU paths returned float64. On values around 1e7 the GPU result was off by up to 0.5, and the error compounded with passes > 1.
Both paths now promote with _promote_float, which keeps float64 and promotes non-float input, matching what Preserve input float dtype in apply() and focal_stats() (#2769) #2805 did for apply() and focal_stats().
The state CSV commit is from the security sweep that found this.

Backend coverage: numpy and dask+numpy are unchanged (they already computed in float64). cupy and dask+cupy now match them bit-for-bit on the regression input.

Test plan:

test_mean_preserves_float64_gpu_3222 (cupy, runs on GPU)
test_mean_preserves_float64_dask_gpu_3222 (dask+cupy)
Full test_focal.py suite: 235 passed on a CUDA host

brendancol

PR Review: Stop downcasting mean() GPU paths to float32 (#3222)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

xrspatial/tests/test_focal.py:101 - test_mean_transfer_function_dask_gpu still compares with rtol=1e-4, a tolerance that existed to absorb the float32 drift this PR removes. It can be tightened to the default now that all four backends compute in float64. Fine to leave if you want to keep that test's scope unchanged.

Nits (optional improvements)

The new regression tests cover a single pass. A passes=3 case would also lock in that the error no longer compounds, though single-pass parity already implies it since each pass runs the same code.

What looks good

The fix reuses _promote_float, the same helper #2805 used for apply()/focal_stats(), so mean() now follows the module convention instead of a one-off cast.
Casting excludes to data_cu.dtype (focal.py:266) closes a second, subtler mismatch: exclude values were previously compared in float32 on GPU but float64 on CPU.
Both new tests assert dtype and values, ran on real CUDA hardware, and the +1e7 offset makes the old bug fail loudly (abs error 0.5) rather than slipping under a tolerance.
_mean_cupy_boundary's pad-then-trim path inherits the fix since _pad_array preserves dtype.

Checklist

Algorithm matches reference (no algorithm change; dtype only)
All implemented backends produce consistent results (float64 parity verified on GPU)
NaN handling is correct (unchanged; isnan checks are dtype-agnostic)
Edge cases are covered by tests (large-offset values; existing suite covers NaN/boundary)
Dask chunk boundaries handled correctly (depth=(1,1) unchanged; chunked test added)
No premature materialization or unnecessary copies (one astype, same as before)
Benchmark exists or is not needed (no perf-relevant change for float32 inputs; float64 inputs now do real float64 work on GPU, which is the point)
README feature matrix: no change needed
Docstrings present and accurate (no public API change)

…y check (#3222)

brendancol

Follow-up review after 62016a0:

Suggestion (rtol=1e-4 in test_mean_transfer_function_dask_gpu): fixed. The comparison now uses the default tolerance with a comment explaining why the float32 allowance is gone.
Nit (multi-pass coverage): fixed. test_mean_preserves_float64_gpu_3222 now also compares passes=3 output between numpy and cupy.

All 28 mean-related tests pass on a CUDA host. Nothing further from me.

…ocal-2026-06-10-01

Main's #3221 (issue #3214) landed the same float64-preservation fix for the mean() GPU paths that this branch made for #3222, so the merge takes main's focal.py wholesale. The branch's regression tests survive: multi-pass GPU parity and dask+cupy large-offset parity are not covered by #3221's tests, and the old rtol=1e-4 allowance on the dask+gpu transfer test can tighten now that every backend computes in float64.

brendancol · 2026-06-11T00:22:06Z

Closing as superseded by #3221, which merged first with the same _promote_float change in _mean_cupy/_mean_dask_cupy plus the mean() entry-point fix, excludes dtype handling, and broader dtype tests. The only bits here not on main are a passes=3 compounding test and a dask+cupy large-offset parity test; those can ride along with a future focal test PR if wanted.

brendancol added 2 commits June 10, 2026 10:41

Update security sweep state for focal (issues #3222, #3223)

9dc0a04

Stop downcasting mean() GPU paths to float32 (#3222)

c57431d

github-actions Bot added the performance PR touches performance-sensitive code label Jun 10, 2026

brendancol commented Jun 10, 2026

View reviewed changes

Address review: tighten dask+gpu mean tolerance, add multi-pass parit…

62016a0

…y check (#3222)

brendancol commented Jun 10, 2026

View reviewed changes

brendancol added 2 commits June 10, 2026 10:50

Merge remote-tracking branch 'origin/main' into deep-sweep-security-f…

000eee8

…ocal-2026-06-10-01

brendancol closed this Jun 11, 2026

brendancol mentioned this pull request Jun 11, 2026

focal mean() computes in float32 on GPU but float64 on CPU #3222

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop downcasting mean() GPU paths to float32 (#3222)#3229

Stop downcasting mean() GPU paths to float32 (#3222)#3229
brendancol wants to merge 5 commits into
mainfrom
deep-sweep-security-focal-2026-06-10-01

brendancol commented Jun 10, 2026

Uh oh!

brendancol left a comment

Uh oh!

brendancol left a comment

Uh oh!

brendancol commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

brendancol commented Jun 10, 2026

Uh oh!

brendancol left a comment

Choose a reason for hiding this comment

PR Review: Stop downcasting mean() GPU paths to float32 (#3222)

Blockers (must fix before merge)

Suggestions (should fix, not blocking)

Nits (optional improvements)

What looks good

Checklist

Uh oh!

brendancol left a comment

Choose a reason for hiding this comment

Uh oh!

brendancol commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant