Describe the bug
On the bounded-distance dask path of proximity/allocation/direction (finite `max_distance`, dask+numpy backend), in-range cells turn into NaN when the raster has irregularly spaced coordinates and is chunked along the column axis.
The cause is in `_process_dask` (`xrspatial/proximity.py`). It captures `data = raster.data` before sizing the halo, then calls `_fit_halo_to_chunks`, which can fold an axis into a single chunk and drop that axis's halo depth to zero. The folded result gets assigned back to `raster.data`, `xs`, and `ys` instead of to the local `data`. Two things break:
- `map_overlap` is still handed the stale, un-folded `data` while `xs`/`ys` are the folded versions, so the data and coordinate grids disagree on chunking. Pixels whose nearest target sits in a neighbouring column chunk lose that target and come out NaN even though it is well inside `max_distance`.
- Assigning to `raster.data` mutates the caller's input DataArray. That is the mutation that issue #2847 (proximity: dask unbounded fallback mutates the caller's input chunks) was meant to prevent, and the comment right above the line says not to do it.
Expected behavior
The bounded dask result matches the numpy reference for irregularly spaced coordinates, and the caller's DataArray chunking is left unchanged.
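The two failure modes and the expected behavior can be illustrated with plain-Python stand-ins. This is only a sketch of the pattern, not the actual xrspatial code: `ChunkedArray`, `Raster`, `fold_columns`, `buggy`, and `fixed` are all hypothetical names, and `fold_columns` merely imitates the "fold an axis into a single chunk" step attributed to `_fit_halo_to_chunks`.

```python
from dataclasses import dataclass


@dataclass
class ChunkedArray:
    """Hypothetical stand-in for a dask array; only tracks its chunks."""
    chunks: tuple


@dataclass
class Raster:
    """Hypothetical stand-in for the caller's DataArray."""
    data: ChunkedArray


def fold_columns(arr):
    """Hypothetical stand-in for the fold step: merge all column chunks."""
    rows, cols = arr.chunks
    return ChunkedArray(chunks=(rows, (sum(cols),)))


def buggy(raster, xs):
    data = raster.data                       # captured before the fold
    raster.data = fold_columns(raster.data)  # mutates the caller's object
    xs = fold_columns(xs)                    # coordinate grid is folded
    return data.chunks, xs.chunks            # stale data vs folded coords


def fixed(raster, xs):
    data = fold_columns(raster.data)         # rebind the local only
    xs = fold_columns(xs)
    return data.chunks, xs.chunks            # both folded, caller untouched


original = ((4, 4), (2, 2, 2, 2))

r = Raster(ChunkedArray(original))
buggy_data_chunks, buggy_xs_chunks = buggy(r, ChunkedArray(original))
buggy_mutated_caller = r.data.chunks != original

r = Raster(ChunkedArray(original))
fixed_data_chunks, fixed_xs_chunks = fixed(r, ChunkedArray(original))
fixed_mutated_caller = r.data.chunks != original
```

In the buggy variant the data and coordinate chunks disagree and the caller's chunking changes; in the fixed variant they agree and the caller's object is left alone.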
How to reproduce
`coords = [0, 100, 101, 102, 103, 104, 105, 106]` on both axes, a single target at `data[3, 3]`, `max_distance=2.5`, dask `chunks=(4, 2)`. The numpy result keeps the in-range targets; the dask result drops the ones in adjacent column chunks.
A regression test (`test_bounded_dask_irregular_coords_matches_numpy`) already exists and fails for all three of proximity, allocation, and direction.
Additional context
The EUCLIDEAN and MANHATTAN bounded paths are affected; the great-circle and unbounded paths run different code. The dask+cupy path (`_process_dask_cupy`) shares the same fold logic, so its handling should be checked too.
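For reference, the cells that should stay finite in the reproduction case can be worked out by brute force with numpy alone. This is a minimal sketch that does not call xrspatial; it assumes the EUCLIDEAN metric and that a cell is in range when its distance is at most `max_distance` (no distance in this grid equals 2.5, so the choice between `<` and `<=` does not change the result).

```python
import numpy as np

coords = np.array([0, 100, 101, 102, 103, 104, 105, 106], dtype=float)
max_distance = 2.5
target_row, target_col = 3, 3
col_chunk_size = 2  # column chunk size from chunks=(4, 2)

# Coordinate grids: xs varies along columns, ys along rows.
xs, ys = np.meshgrid(coords, coords)

# Euclidean distance from every cell to the single target cell.
dist = np.hypot(xs - coords[target_col], ys - coords[target_row])

# Brute-force expectation: finite distance in range, NaN elsewhere.
expected = np.where(dist <= max_distance, dist, np.nan)
n_in_range = int(np.count_nonzero(~np.isnan(expected)))

# In-range columns that live in a different column chunk than the target.
col_chunk = np.arange(coords.size) // col_chunk_size
in_range_cols = np.unique(np.nonzero(~np.isnan(expected))[1])
cols_in_other_chunks = [
    int(c) for c in in_range_cols if col_chunk[c] != col_chunk[target_col]
]

print(n_in_range)            # 21
print(cols_in_other_chunks)  # [1, 4, 5]
```

Columns 1, 4, and 5 hold in-range cells but sit in column chunks next to the target's chunk, so they are the ones the description says the dask path turns into NaN.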