Make focal output dtype consistent across backends by brendancol · Pull Request #3226 · xarray-contrib/xarray-spatial

brendancol · 2026-06-10T17:41:16Z

mean(): drop the hardcoded float32 cast in _mean_cupy and _mean_dask_cupy. The function already casts input to float64 before dispatch, so all four backends now return float64 and the GPU result matches the CPU result exactly (the float32 cast cost ~1e-4 relative error).
apply() / focal_stats() / mean() dask paths: pass a typed meta (dtype=data.dtype) to every map_overlap call, so the lazy DataArray advertises the dtype the chunks actually compute. Before, float32 and integer input advertised float64 but computed float32. Same fix as #2682 (aspect() planar dask backends report float64 dtype but compute float32) and #2723 (proximity/allocation/direction: output dtype and .name differ across backends).
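A minimal sketch of the advertised-versus-computed mismatch and the typed-meta fix, assuming only numpy and dask are installed. The mean_3x3 function and the 4x4 input are hypothetical stand-ins for illustration, not the xrspatial kernels; only the meta argument mirrors what the description above says the PR changes.

```python
import numpy as np
import dask.array as da


def mean_3x3(block):
    """Hypothetical 3x3 mean built from shifted copies; keeps the block's dtype."""
    total = sum(
        np.roll(np.roll(block, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )
    # Divide by a scalar of the same dtype so float32 stays float32.
    return total / block.dtype.type(9)


data = da.from_array(
    np.arange(16, dtype=np.float32).reshape(4, 4), chunks=(2, 2)
)

# Untyped meta: np.array(()) is float64, so the lazy array is expected to
# advertise float64 even though each chunk computes float32.
untyped = da.map_overlap(
    mean_3x3, data, depth=1, boundary="nearest", meta=np.array(())
)

# Typed meta: the lazy array advertises the dtype the chunks produce.
typed = da.map_overlap(
    mean_3x3, data, depth=1, boundary="nearest",
    meta=np.array((), dtype=data.dtype),
)

print("untyped:", untyped.dtype, "->", untyped.compute().dtype)
print("typed:  ", typed.dtype, "->", typed.compute().dtype)
```

The wrap-around that np.roll introduces only touches the outer ring of each padded block, which map_overlap trims away, so the sketch stays a valid neighbourhood operation.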

Backend coverage: numpy, cupy, dask+numpy, dask+cupy all verified live on this host (CUDA available).

Test plan:

New parametrized tests: test_mean_dtype_consistent_across_backends_3217 (4 backends x 3 input dtypes), test_apply_dask_advertised_dtype_matches_computed_3217, test_focal_stats_dask_advertised_dtype_matches_computed_3217 (dask backends x 3 input dtypes), and test_mean_gpu_matches_cpu_float64_3217 (exact CPU/GPU value parity).
Full test_focal.py suite: 258 passed.

mean() cast to float32 on the cupy and dask+cupy paths while the CPU paths returned float64, losing precision on the GPU. The dask paths of mean, apply, and focal_stats also passed an untyped meta to map_overlap, so the lazy dtype advertised float64 while compute() returned float32 for float32/int input. Keep the dispatched dtype on the GPU mean paths and type every map_overlap meta with data.dtype. Adds parametrized regression tests over all four backends.
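To make the precision point concrete, here is a small numpy-only sketch of what a forced float32 cast does to a float64 mean. It shows only the rounding of a single value and does not attempt to reproduce the ~1e-4 figure quoted in the description; the 3x3 window is a hypothetical input chosen so the exact mean is 1/9.

```python
import numpy as np

# Hypothetical 3x3 window whose values sum to 1, so the exact mean is 1/9.
window = np.zeros((3, 3), dtype=np.float64)
window[1, 1] = 1.0

mean64 = window.mean()        # float64 result, as on the CPU paths
mean32 = np.float32(mean64)   # what a hardcoded float32 cast would keep

# Relative error introduced by the cast alone.
rel_err = abs(float(mean32) - float(mean64)) / float(mean64)

print(mean64.dtype, mean32.dtype)
print("exact CPU/GPU-style equality:", float(mean32) == float(mean64))
print("relative error from the cast:", rel_err)
```

Because 1/9 has no finite binary expansion, the float32 value cannot equal the float64 one, which is why an exact-equality test across backends only passes once every path stays in the same dtype.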

brendancol

PR Review: Make focal output dtype consistent across backends

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

xrspatial/focal.py:385-402: the mean() docstring's cupy example still shows output computed under the old float32 cast (array([[0.47928995, ...]]) printed at float32 precision) and an outdated cupy.core.core.ndarray class path. With this PR the GPU path computes in float64, so the example no longer matches what a user would see. Worth refreshing while you are touching this contract.

Nits (optional improvements)

xrspatial/tests/test_focal.py (new _computed_dtype helper): the helper is defined mid-file between test sections. Fine as is, but the general_checks.py module is where the other cross-backend helpers live if it ever gets a second user.

What looks good

The fix targets exactly the five sites that produce or advertise the wrong dtype, and nothing else. hotspots() meta sites were already typed and are untouched.
_mean_cupy keeps the device array and inherits the dispatched dtype instead of forcing float32; the new test asserts exact (not approximate) CPU/GPU equality in float64, which holds because both loops accumulate sequentially.
Tests cover 4 backends x 3 input dtypes for mean() and both dask backends x 3 dtypes for apply()/focal_stats(), asserting advertised dtype == computed dtype, which is the actual regression.
4x4 input with (2, 2) chunks exercises multi-chunk map_overlap, so the typed meta is checked against real chunk boundaries.
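The exact-equality point above depends on accumulation order, because floating-point addition is not associative. The pure-Python sketch below is illustrative only (sequential_sum is a hypothetical helper, not project code): summing the same values in the same order is bit-for-bit reproducible, while a different order can change the last bits.

```python
def sequential_sum(values):
    """Accumulate left to right, one value at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

same_order_a = sequential_sum(values)
same_order_b = sequential_sum(values)
reordered = sequential_sum(reversed(values))

print(same_order_a == same_order_b)  # identical order, identical bits
print(same_order_a == reordered)     # different order, result may differ
print(same_order_a, reordered)
```

This is why an exact assertion is reasonable when both implementations accumulate in the same sequential order, and why a parallel reduction would typically call for a tolerance instead.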

Checklist

Algorithm matches reference (no algorithm change; dtype plumbing only)
All implemented backends produce consistent results (verified live on CUDA host)
NaN handling unchanged and covered by existing suite (258 passed)
Edge cases covered (float64/float32/int32 inputs)
Dask chunk boundaries handled correctly (typed meta, depth unchanged)
No premature materialization (meta typing is graph-construction only)
Benchmark not needed (bug fix, no new function); note GPU mean now runs in float64, so a throughput drop on large GPU rasters is expected and intentional
README feature matrix not applicable
Docstrings: see suggestion about the stale cupy example

brendancol

Follow-up review (after `1f20224`)

The suggestion from the first pass is addressed: the mean() cupy docstring example now shows the float64 output (0.47928994, verified by running the example on a CUDA host) and the current cupy.ndarray class path.

Disposition of first-pass findings:

Suggestion (stale cupy docstring example): fixed in 1f20224.
Nit (_computed_dtype helper location): left in test_focal.py. It has a single consumer; moving a 4-line helper to general_checks.py before a second user exists adds indirection for no benefit.

No new findings. Full focal suite still passes (258).

…ocal-2026-06-10-01

Conflicts:
- xrspatial/focal.py: main's #3221 (issue #3214, a duplicate of #3217) changed mean() to the _promote_float contract; kept main's GPU/dask cast lines, and this branch keeps the typed map_overlap metas.
- xrspatial/tests/test_focal.py: aligned the 3217 mean dtype test with the _promote_float contract (float32 in -> float32 out).
- .claude/sweep-metadata-state.csv: restored LF line endings, kept main's geotiff row and this branch's focal row (notes updated).

github-actions bot added the "performance" label (PR touches performance-sensitive code) on Jun 10, 2026

Refresh mean() cupy docstring example for float64 output (#3217)

1f20224

brendancol added 4 commits June 10, 2026 10:43

Merge remote-tracking branch 'origin/main' into deep-sweep-metadata-f…

7ae0113

…ocal-2026-06-10-01

Update metadata sweep state for focal (#3217)

47f160a

Merge branch 'main' into deep-sweep-metadata-focal-2026-06-10-01

b7285c6

brendancol merged commit aace125 into main Jun 11, 2026
7 checks passed

brendancol merged 6 commits into main from deep-sweep-metadata-focal-2026-06-10-01
