Description
_check_kernel_vs_raster_memory() in xrspatial/focal.py budgets 4 bytes per cell, with a comment saying "focal internals cast to float32". That was true when the guard was added, but #2805 changed apply() and focal_stats() to preserve float64 input, so for float64 rasters the padded raster and kernel scratch arrays now take 8 bytes per cell. The guard underestimates the real allocation by 2x.
The guard rejects combos above 50% of available memory. With the 2x underestimate, a float64 raster + kernel combo can pass the guard and then allocate up to ~100% of available memory, which is the OOM scenario the guard was added to stop (#1284).
Repro on a host with ~45 GB available:
import math
import numpy as np
from xrspatial.focal import _check_kernel_vs_raster_memory
from xrspatial.convolution import _available_memory_bytes
avail = _available_memory_bytes()
# odd kernel side whose float64 footprint alone is ~30% of available memory
k = math.isqrt(int(0.30 * avail / 8)) | 1  # 41173 here
_check_kernel_vs_raster_memory(np.ones((k, k)), 101, 101, 'apply')
# passes; the actual float64 footprint is ~60% of available,
# and a kernel sized near the guard's threshold reaches ~100%
Expected behavior
The guard budgets bytes per cell from the dtype the focal internals will actually use (_promote_float of the input dtype): 8 bytes for float64 input, 4 otherwise.
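A minimal sketch of the dtype-aware budget, to make the arithmetic concrete. The names `bytes_per_cell` and `estimated_fraction` are hypothetical and not part of xrspatial; the real fix would presumably go through `_promote_float`, whose signature is not shown here. The memory model (kernel scratch array plus raster padded by half the kernel on each side) is an assumption standing in for the guard's actual accounting, and the 45 GB figure is illustrative.

```python
import numpy as np


def bytes_per_cell(dtype):
    """Hypothetical helper: 8 bytes for float64 input, 4 otherwise."""
    return 8 if np.dtype(dtype) == np.float64 else 4


def estimated_fraction(kernel_shape, rows, cols, dtype, available_bytes):
    """Share of available memory taken by the kernel scratch array plus
    the padded raster.

    Assumption: the raster is padded by half the kernel on each side, so
    an odd kernel of side k adds k - 1 cells per dimension.
    """
    k_rows, k_cols = kernel_shape
    padded_cells = (rows + k_rows - 1) * (cols + k_cols - 1)
    kernel_cells = k_rows * k_cols
    return (padded_cells + kernel_cells) * bytes_per_cell(dtype) / available_bytes


avail = 45 * 10**9  # illustrative stand-in for the repro host
k = 41173           # kernel side from the repro above

# Same shapes, budgeted at 4 bytes per cell versus 8 bytes per cell.
f32 = estimated_fraction((k, k), 101, 101, np.float32, avail)
f64 = estimated_fraction((k, k), 101, 101, np.float64, avail)
print(f"4-byte budget: {f32:.0%} of available, 8-byte budget: {f64:.0%}")
```

Under these assumptions the 4-byte budget lands under the 50% limit while the 8-byte budget for the same shapes lands over it, so a dtype-aware guard would reject the repro case.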