Make focal memory guard backend-aware for dask input (#3218) by brendancol · Pull Request #3228 · xarray-contrib/xarray-spatial

brendancol · 2026-06-10T17:42:35Z

_check_kernel_vs_raster_memory() now accepts the dask .chunks tuple and budgets the largest chunk plus the kernel halo instead of the full padded raster. map_overlap only materializes one padded chunk per task, so the full-raster term was a false positive that blocked any dask raster bigger than ~half host RAM from running apply(), focal_stats(), or hotspots().
numpy and cupy input keep the existing full-raster budget; those paths really do allocate full-size arrays.
The MemoryError message now says "chunk" or "raster" depending on which footprint was charged.
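
The budgeting rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in xrspatial/focal.py: the function name `check_memory_budget`, its parameters, the exact byte formula, and the error wording are all assumptions made for the example. Only the overall shape (largest chunk plus halo for dask, full padded raster otherwise, compared against half of available memory) follows the description.

```python
def check_memory_budget(shape, kernel_shape, itemsize, available_bytes, chunks=None):
    """Hypothetical stand-in for the focal memory guard (not the xrspatial code).

    Returns the unit that was charged ("chunk" or "raster") or raises MemoryError.
    """
    # Halo width on each side is half the kernel extent.
    pad_y, pad_x = kernel_shape[0] // 2, kernel_shape[1] // 2

    if chunks is not None:
        # Chunked (dask) input: charge only the largest chunk along each spatial axis.
        rows, cols = max(chunks[-2]), max(chunks[-1])
        unit = "chunk"
    else:
        # In-memory input: the whole padded raster is allocated.
        rows, cols = shape[-2], shape[-1]
        unit = "raster"

    padded_bytes = (rows + 2 * pad_y) * (cols + 2 * pad_x) * itemsize
    kernel_bytes = kernel_shape[0] * kernel_shape[1] * itemsize
    budget = 0.5 * available_bytes  # the 0.5-of-available headroom

    needed = padded_bytes + kernel_bytes
    if needed > budget:
        # Illustrative wording only; the real message may differ.
        raise MemoryError(
            f"kernel plus padded {unit} needs {needed} bytes, budget is {int(budget)}"
        )
    return unit


# Same 8192x8192 float32 raster, 3x3 kernel, 64 MiB reported as available.
chunks = ((2048,) * 4, (2048,) * 4)
charged = check_memory_budget((8192, 8192), (3, 3), 4, 64 * 2**20, chunks=chunks)
```

With the chunks tuple supplied the call succeeds and reports "chunk"; the same call without `chunks` raises MemoryError because the full padded raster exceeds the 32 MiB budget.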

Verified: a 200000x200000 float32 lazy dask raster (160 GB) with a 3x3 kernel now builds graphs for all three entry points. Before the fix, all three raised MemoryError at graph construction while mean() (no guard) worked.
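
For scale, the arithmetic behind the 160 GB figure and the per-chunk footprint can be worked through directly. The 4096x4096 chunk size below is an assumption chosen for illustration; the PR does not state which chunking was used.

```python
rows = cols = 200_000
itemsize = 4  # float32

# Full raster footprint, as charged by the old guard.
full_bytes = rows * cols * itemsize  # 160_000_000_000 bytes = 160 GB

# Per-task footprint under the new guard.
chunk = 4096  # hypothetical chunk edge length, not taken from the PR
pad = 1       # a 3x3 kernel needs a 1-cell halo on each side
padded_chunk_bytes = (chunk + 2 * pad) ** 2 * itemsize  # 67_174_416 bytes, about 67 MB

ratio = full_bytes / padded_chunk_bytes
```

Under that assumed chunking, the quantity the guard compares against the budget drops from 160 GB to about 67 MB.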

Backend coverage: dask+numpy and dask+cupy get the per-chunk budget; numpy and cupy are unchanged.

Test plan:

New tests: all 3 entry points accept a large dask raster under a patched 1 MB memory probe; numpy input is still rejected; an oversized kernel on dask is still rejected and the message reports the chunk
Full xrspatial/tests/test_focal.py: 238 passed
GPU sanity run of apply/focal_stats/hotspots on cupy and dask+cupy backends
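
The "patched 1 MB memory probe" approach in the test plan can be sketched with `unittest.mock`. Everything named here (`guard`, `available_memory_bytes`, `fits`) is a hypothetical stand-in; the real probe and test names in xrspatial are not shown in this PR text.

```python
from types import SimpleNamespace
from unittest import mock


def _available_memory_bytes():
    # Stand-in probe; a real one would query the host or device.
    return 16 * 2**30


# Hypothetical namespace playing the role of the module that owns the probe.
guard = SimpleNamespace(available_memory_bytes=_available_memory_bytes)


def fits(nbytes):
    """Return True if nbytes is within half of the probed available memory."""
    return nbytes <= 0.5 * guard.available_memory_bytes()


def run_checks():
    # Pretend only 1 MiB is available while the checks run.
    with mock.patch.object(guard, "available_memory_bytes", return_value=2**20):
        chunk_ok = fits(256 * 256 * 4)     # one small float32 chunk
        raster_ok = fits(4096 * 4096 * 4)  # a full in-memory raster
    return chunk_ok, raster_ok
```

Patching the probe rather than allocating real arrays keeps such tests independent of the machine they run on, which matches the note that the tests never call `.compute()`.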

brendancol

PR Review: Make focal memory guard backend-aware for dask input (#3218)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

xrspatial/focal.py:96-99: the per-chunk budget charges one padded chunk, but the threaded scheduler materializes one per worker concurrently, so true peak is roughly num_workers * padded_chunk. The 0.5-of-available headroom covers this for sane chunk sizes, and tightening it would risk reintroducing false rejections, so leaving it as is seems right. Worth a one-line comment if it ever bites.
xrspatial/focal.py:97: dask arrays with unknown chunk sizes (NaN chunks after boolean indexing) make max(chunks[-2]) NaN, and every comparison downstream is False, so the guard silently passes. dask's own map_overlap error fires later, so no crash; just noting the behavior is fall-through rather than explicit.
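
Both nits can be reproduced with plain Python values. The sketch below uses made-up chunk tuples and a made-up worker count; it shows why a NaN chunk size makes a `>` comparison fall through, and how the concurrent peak scales with the number of workers.

```python
import math

# Unknown chunk sizes along the row axis are reported as NaN.
nan_chunks = ((float("nan"), float("nan")), (512, 512))
largest_rows = max(nan_chunks[-2])

# Any ordering comparison involving NaN is False, so a guard written as
# "if needed > budget: raise" does not fire.
needed = largest_rows * 512 * 4
exceeds = needed > 1024

# An explicit check would have to test for NaN directly.
unknown = math.isnan(largest_rows)

# Concurrency nit: with a threaded scheduler, several padded chunks may be
# resident at once. num_workers is a hypothetical value.
num_workers = 8
padded_chunk_bytes = (2048 + 2) * (2048 + 2) * 4
approx_peak_bytes = num_workers * padded_chunk_bytes
```

Here `exceeds` is False even though the size is unknown, which is the silent fall-through the second nit describes.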

What looks good

Correct fix scope: numpy/cupy keep the full-raster budget (they really allocate full-size padded arrays), only chunked input switches to the per-task footprint. Mirrors the merge() guard fix from #3048.
chunks[-2] / chunks[-1] indexing is safe: 3D input recurses through _apply_per_band before the guard runs, and the negative indices would handle a 3D chunks tuple anyway.
The error message now names the unit it budgeted ("chunk" vs "raster"), and a test pins that wording.
Tests cover all three entry points accepting a large dask raster under a patched memory probe, the numpy rejection still firing, and the oversized-kernel-on-dask rejection. No .compute() anywhere, so the tests stay fast.
The existing #1284 tests (patched return_value=1) still pass, confirming the kernel-bytes term alone keeps rejecting absurd kernels on every backend.
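
The point about negative indexing can be illustrated with plain tuples shaped like a dask `.chunks` attribute. The helper name `largest_spatial_chunk` and the chunk values are hypothetical.

```python
def largest_spatial_chunk(chunks):
    """Largest chunk extent along the last two (row, column) axes."""
    return max(chunks[-2]), max(chunks[-1])


# 2D raster: (row chunks, column chunks)
chunks_2d = ((1024, 1024, 512), (2048, 2048))

# 3D raster: a leading band axis is added; the spatial axes stay last.
chunks_3d = ((1, 1, 1),) + chunks_2d

same = largest_spatial_chunk(chunks_2d) == largest_spatial_chunk(chunks_3d)
```

Because the spatial axes are always the trailing two, the same indexing works whether or not a band axis is present.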

Checklist

Algorithm matches reference (per-task footprint = largest chunk + 2*pad halo, which is what map_overlap allocates)
All implemented backends produce consistent results (guard change only; GPU sanity run on cupy and dask+cupy)
NaN handling is correct (no numeric path touched)
Edge cases are covered by tests (accept, reject-numpy, reject-oversized-kernel)
Dask chunk boundaries handled correctly
No premature materialization or unnecessary copies
Benchmark exists or is not needed (guard-only change, no compute path touched)
README feature matrix updated (n/a, no API change)
Docstrings present and accurate (helper docstring documents the chunks parameter)

brendancol

Follow-up review after 89a07c2: both nits from the first pass are addressed by the new comment block in _check_kernel_vs_raster_memory (concurrency headroom rationale and the NaN-chunk fall-through behavior). The commit is comment-only; guard tests (3218 + 1284 set) still pass. No new findings.

…ocal-2026-06-10-02 Conflicts: xrspatial/focal.py, xrspatial/tests/test_focal.py. Combined the dtype-aware itemsize budget (#3223) with main's chunk-aware budgeting (#3228); kept both sides' new tests and bumped the _promote_float spy count in the #3231 test by one for the guard's dtype-only call.

Budget per-chunk footprint in focal memory guard for dask input (#3218)

4fb6b9e

github-actions Bot added the performance PR touches performance-sensitive code label Jun 10, 2026

Address review nits: document concurrency headroom and NaN-chunk fall-through (#3218)

89a07c2

Merge remote-tracking branch 'origin/main' into deep-sweep-performance-focal-2026-06-10-01

acddfb9

brendancol merged commit 3836bac into main Jun 10, 2026
7 checks passed

brendancol merged 3 commits into main from deep-sweep-performance-focal-2026-06-10-01
