Make focal kernel memory guard dtype-aware (#3223) #3232
Merged
brendancol
commented
Jun 10, 2026
Contributor
Author
PR Review: Make focal kernel memory guard dtype-aware (#3223)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- A companion test asserting that hotspots() with float64 input is still budgeted at 4 bytes/cell (i.e. not over-rejected) would pin down the asymmetry the comment at focal.py:1758 describes. The existing #1284 hotspots test plus the comment already make the intent clear, so this is optional.
What looks good
- The itemsize is derived from _promote_float(agg.dtype), the same function the internals use to pick their compute dtype, so the budget and the actual allocations cannot drift apart again the way the hardcoded 4 did after #2805.
- hotspots() keeps the default 4 with a comment explaining why (it computes in float32 on every backend); passing the promoted itemsize there would have over-rejected float64 input.
- The new tests are deterministic (patched _available_memory_bytes, fixed 1 MB budget) and test both sides: float64 rejected at exactly the sizes that used to slip through, float32 still allowed.
- The guard runs at the public entry points before backend dispatch, so the 3D per-band recursion and all four backends inherit the corrected budget.
- np.dtype(...) works for numpy, cupy, and dask-backed DataArrays alike since .dtype is a numpy dtype in all cases.
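A minimal sketch of the itemsize derivation praised above, for readers who want to see the arithmetic. promote_float_sketch is a hypothetical stand-in written for this illustration; it only encodes the behaviour stated in this thread (float64 stays float64, everything else computes in float32) and is not the project's _promote_float.

```python
import numpy as np


def promote_float_sketch(dtype):
    """Hypothetical stand-in for _promote_float (assumption, not the real code).

    float64 input keeps float64; every other dtype is computed in float32.
    """
    dtype = np.dtype(dtype)
    if dtype == np.float64:
        return np.dtype(np.float64)
    return np.dtype(np.float32)


def guard_itemsize(dtype):
    """Bytes per cell the guard would budget for an input of this dtype."""
    # np.dtype(...) normalises whatever the promotion helper returns,
    # so .itemsize is always available.
    return np.dtype(promote_float_sketch(dtype)).itemsize


for name in ("float64", "float32", "int16"):
    print(name, guard_itemsize(name))
```

Because the budget is computed from the same promotion rule as the compute dtype, changing the rule in one place would change both together.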
Checklist
- Algorithm matches reference (budget formula unchanged; only bytes/cell corrected)
- All implemented backends produce consistent results (guard is backend-independent)
- NaN handling is correct (not touched)
- Edge cases are covered by tests (pass/reject boundary on both dtypes)
- Dask chunk boundaries handled correctly (not touched)
- No premature materialization or unnecessary copies (dtype lookup only, no data access)
- Benchmark exists or is not needed (validation-path change, no kernel work)
- README feature matrix: no change needed
- Docstrings present and accurate (guard docstring updated with the #2805 history)
brendancol
commented
Jun 10, 2026
Contributor
Author
Follow-up after the nit: added test_hotspots_float64_keeps_float32_budget_3223, which confirms float64 input to hotspots() stays on the 4 bytes/cell budget and is not over-rejected. Full focal suite: 236 passed. Nothing further from me.
…ocal-2026-06-10-02
Closes #3223
_check_kernel_vs_raster_memory budgeted a flat 4 bytes per cell ("focal internals cast to float32"). That stopped being true when #2805 (Preserve input float dtype in apply() and focal_stats() (#2769)) made apply() and focal_stats() preserve float64, so the guard underestimated float64 allocations by 2x. A kernel + raster combo could pass the check and then use up to ~100% of available memory, which is the OOM the guard exists to stop (#1284: focal apply()/focal_stats()/hotspots() accept unbounded user kernels).
The guard now takes an itemsize argument. apply() and focal_stats() pass np.dtype(_promote_float(agg.dtype)).itemsize (8 for float64 input, 4 otherwise). hotspots() keeps the default 4 since it computes in float32 on every backend.
Backend coverage: the guard runs at the public entry points before dispatch, so all four backends get the same check. No backend code changed.
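To make the 2x underestimate concrete, here is a rough sketch of a guard with an itemsize parameter. The function name, its signature, the 50% budget fraction, and the "cells times itemsize" estimate are all assumptions made for illustration; how the real guard derives its cell count from the kernel and raster shapes is not shown in this description.

```python
def check_kernel_vs_raster_memory_sketch(n_cells, available_bytes,
                                         itemsize=4, budget_fraction=0.5):
    """Raise MemoryError when the estimated working set exceeds the budget.

    Hypothetical simplification: n_cells is taken as given, and the budget
    is an assumed fraction of available memory.
    """
    required = n_cells * itemsize
    budget = budget_fraction * available_bytes
    if required > budget:
        raise MemoryError(
            f"estimated {required} bytes exceeds budget of {int(budget)} bytes"
        )
    return required


# 90,000 cells against 1,000,000 available bytes (budget: 500,000 bytes).
# Flat 4 bytes/cell: 360,000 bytes, passes.
print(check_kernel_vs_raster_memory_sketch(90_000, 1_000_000))

# Same cells at the true float64 cost of 8 bytes/cell: 720,000 bytes, rejected.
try:
    check_kernel_vs_raster_memory_sketch(90_000, 1_000_000, itemsize=8)
except MemoryError as exc:
    print("rejected:", exc)
```

Under these assumed numbers, the flat 4-byte estimate lets through a float64 workload that actually needs more than the budget allows, which is the gap the itemsize argument closes.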
Test plan:
- test_apply_oversize_kernel_accounts_for_float64_3223: float64 combo sized to pass the old 4-byte budget now raises MemoryError; the same combo as float32 still runs
- test_focal_stats_oversize_kernel_accounts_for_float64_3223
- test_focal.py suite: 235 passed
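The test pattern described in the plan (patch the available-memory probe, fix a small budget, check both dtypes) might look roughly like this. FocalSketch and everything inside it are hypothetical and exist only so the example runs on its own; the 50% budget and the "cells times itemsize" estimate are assumptions, not the project's formula.

```python
from unittest import mock

import numpy as np


class FocalSketch:
    """Hypothetical miniature of a guarded entry point (not the project's code)."""

    @staticmethod
    def available_memory_bytes():
        # Placeholder value; a real implementation would query the system.
        return 8 * 1024**3

    @classmethod
    def apply(cls, agg):
        # Budget 8 bytes/cell for float64 input, 4 otherwise.
        itemsize = 8 if agg.dtype == np.float64 else 4
        required = agg.size * itemsize
        # Assumed policy: refuse anything above half of available memory.
        if required > 0.5 * cls.available_memory_bytes():
            raise MemoryError(f"need {required} bytes")
        return agg  # the focal computation itself is omitted


def allowed_under_fixed_budget(dtype):
    """Return True if apply() is allowed when available memory is patched."""
    agg = np.zeros((300, 300), dtype=dtype)  # 90,000 cells
    with mock.patch.object(FocalSketch, "available_memory_bytes",
                           return_value=1_000_000):
        try:
            FocalSketch.apply(agg)
        except MemoryError:
            return False
    return True


print(allowed_under_fixed_budget(np.float32))  # 360,000 bytes vs 500,000 budget
print(allowed_under_fixed_budget(np.float64))  # 720,000 bytes vs 500,000 budget
```

Patching the memory probe keeps the outcome independent of the machine running the suite, which is what makes the pass/reject boundary testable on both dtypes.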