Make focal memory guard backend-aware for dask input (#3218)#3228
Merged
Conversation
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Make focal memory guard backend-aware for dask input (#3218)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
-
xrspatial/focal.py:96-99: the per-chunk budget charges one padded chunk, but the threaded scheduler materializes one per worker concurrently, so true peak is roughlynum_workers * padded_chunk. The 0.5-of-available headroom covers this for sane chunk sizes, and tightening it would risk reintroducing false rejections, so leaving it as is seems right. Worth a one-line comment if it ever bites. -
xrspatial/focal.py:97: dask arrays with unknown chunk sizes (NaN chunks after boolean indexing) makemax(chunks[-2])NaN, and every comparison downstream is False, so the guard silently passes. dask's own map_overlap error fires later, so no crash; just noting the behavior is fall-through rather than explicit.
What looks good
- Correct fix scope: numpy/cupy keep the full-raster budget (they really allocate full-size padded arrays), only chunked input switches to the per-task footprint. Mirrors the merge() guard fix from #3048.
chunks[-2]/chunks[-1]indexing is safe: 3D input recurses through_apply_per_bandbefore the guard runs, and the negative indices would handle a 3D chunks tuple anyway.- The error message now names the unit it budgeted ("chunk" vs "raster"), and a test pins that wording.
- Tests cover all three entry points accepting a large dask raster under a patched memory probe, the numpy rejection still firing, and the oversized-kernel-on-dask rejection. No
.compute()anywhere, so the tests stay fast. - The existing #1284 tests (patched
return_value=1) still pass, confirming the kernel-bytes term alone keeps rejecting absurd kernels on every backend.
Checklist
- Algorithm matches reference (per-task footprint = largest chunk + 2*pad halo, which is what map_overlap allocates)
- All implemented backends produce consistent results (guard change only; GPU sanity run on cupy and dask+cupy)
- NaN handling is correct (no numeric path touched)
- Edge cases are covered by tests (accept, reject-numpy, reject-oversized-kernel)
- Dask chunk boundaries handled correctly
- No premature materialization or unnecessary copies
- Benchmark exists or is not needed (guard-only change, no compute path touched)
- README feature matrix updated (n/a, no API change)
- Docstrings present and accurate (helper docstring documents the chunks parameter)
brendancol
commented
Jun 10, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review after 89a07c2: both nits from the first pass are addressed by the new comment block in _check_kernel_vs_raster_memory (concurrency headroom rationale and the NaN-chunk fall-through behavior). The commit is comment-only; guard tests (3218 + 1284 set) still pass. No new findings.
…e-focal-2026-06-10-01
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3218
_check_kernel_vs_raster_memory()now accepts the dask.chunkstuple and budgets the largest chunk plus the kernel halo instead of the full padded raster.map_overlaponly materializes one padded chunk per task, so the full-raster term was a false positive that blocked any dask raster bigger than ~half host RAM from runningapply(),focal_stats(), orhotspots().Verified: a 200000x200000 float32 lazy dask raster (160 GB) with a 3x3 kernel now builds graphs for all three entry points. Before the fix, all three raised MemoryError at graph construction while
mean()(no guard) worked.Backend coverage: dask+numpy and dask+cupy get the per-chunk budget; numpy and cupy are unchanged.
Test plan:
xrspatial/tests/test_focal.py: 238 passed