resample: clamp dask interp overlap depth per axis (#2547)#2599
Merged
Conversation
The cubic dask path passed a fixed depth of 16 to dask.overlap, which rejects any axis whose total size is below the depth. Inputs like an Nx1 column or a 4x4 array crashed with `ValueError: The overlapping depth 16 is larger than your array N`. Clamp the per-axis depth to `min(depth, max(0, axis_total - 1))` in both `_run_dask_numpy` and `_run_dask_cupy` and thread the per-axis values through `_interp_block_np` / `_interp_block_cupy`. Eager backends are untouched. On axes large enough to absorb depth=16 the behaviour is unchanged; on smaller axes the IIR boundary transient is larger than float32 epsilon, which matches the eager kernels that do not pad at all. Removes the `pytest.skip` at test_resample_coverage_2026_05_27.py:75 and adds TestDaskCubicSmallInput covering Nx1, 1xN, single-chunk 4x4, multi-chunk 4x4, and 8x8 inputs across dask+numpy and dask+cupy. Closes #2547
brendancol
commented
May 28, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: resample: clamp dask interp overlap depth per axis (#2547)
Blockers (must fix before merge)
- None.
Suggestions (should fix, not blocking)
- None.
Nits (optional improvements)
xrspatial/resample.py:34-40: the_INTERP_DEPTHblock-comment still claims the cubic dask path uses depth=16 unconditionally ("Depth 16 puts the residual at ~7e-10"). After this PR the dask path may clamp depth below 16 on small inputs. A one-line addendum that points at the new clamp in_run_dask_numpywould keep the comment honest. The constant itself is unchanged so the comment is not strictly wrong.xrspatial/tests/test_resample_coverage_2026_05_27.py((4, 4), (2, 2))case: after_ensure_min_chunksizeruns withmin_size=7, a 4x4 input is rechunked into a single 4x4 chunk, so this parametrization effectively duplicates the((4, 4), (4, 4))single-chunk case rather than exercising the multi-chunk path. Not a defect; just noting the test does not cover what its name implies. A larger multi-chunk shape (for example((16, 16), (4, 4))) would actually keep multiple chunks after_ensure_min_chunksize.
What looks good
- The clamp formula
min(depth, max(0, axis_total - 1))is correct: it leaves depth unchanged whenever the axis is large enough and only reduces it for small inputs. Dask'soverlap()acceptsdepth <= sum(chunks), so staying one below total is conservative without being lossy in the common case. min_size = 2 * max(depth_y, depth_x) + 1matches the prior contract: at least one chunk wide enough to hold the kernel's reach on either side. Memory cost is the same or lower than before.- The per-axis depth is threaded through
_interp_block_np/_interp_block_cupycorrectly. Local coordinate translation (iy - (cum_in_y[yi] - depth_y)) uses the same axis-specific depth that was passed todask.overlap, so coordinates stay consistent. - The
if depth_y > 0 or depth_x > 0guard preserves the existing fast-path fornearest/bilinear(depth=1 always positive, so unchanged) and still skips the overlap call when both axes degenerate to depth 0. - Test coverage is appropriate: Nx1, 1xN, single-chunk 4x4, and 8x8 each exercise a distinct edge case (one-axis clamp, both-axes clamp, no overlap needed on one axis).
- The skip removal at
test_resample_coverage_2026_05_27.py:75correctly enables all 8 method x 4 backend cases for the Nx1 column, which now pass.
Checklist
- Algorithm matches reference (cubic prefilter behaviour unchanged on large inputs)
- All implemented backends produce consistent results (numpy reference matches dask within float32 tolerance on small inputs; bit-for-bit on 8x1)
- NaN handling is correct (unchanged code paths)
- Edge cases are covered by tests (Nx1, 1xN, single-chunk, multi-chunk, depth<chunk)
- Dask chunk boundaries handled correctly (per-axis depth matches overlap padding)
- No premature materialization or unnecessary copies
- Benchmark exists or is not needed (pure bug fix)
- README feature matrix updated (n/a, no API change)
- Docstrings present and accurate (no public API change)
…t case (#2547) - Note in the _INTERP_DEPTH comment that the dask drivers clamp depth per axis on small inputs, so readers do not assume depth=16 is the active value in every dask call. - Drop the ((4, 4), (2, 2)) parametrization in TestDaskCubicSmallInput. _ensure_min_chunksize collapses the clamped axis to a single chunk, so that case was a duplicate of the ((4, 4), (4, 4)) single-chunk case. Add a comment explaining why user-supplied multi-chunk layouts on small axes fold to a single chunk.
brendancol
commented
May 28, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review
Both nits from the previous review are addressed in fc00423:
_INTERP_DEPTHcomment now mentions the per-axis clamp so readers do not assume depth=16 is the active value in every dask call.- Dropped the redundant
((4, 4), (2, 2))parametrization._ensure_min_chunksizecollapses any clamped axis to a single chunk (itsmin_sizeis2 * (axis_total - 1) + 1 > axis_total), so a multi-chunk user layout on a small axis is indistinguishable from the single-chunk case. Added a comment explaining this so the parametrization is honest about what it tests.
No new findings. The test suite still passes (259 tests).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_run_dask_numpyand_run_dask_cupyso dask'soverlap()no longer rejects arrays smaller than depth=16.depth_y,depth_x) through_interp_block_np/_interp_block_cupyso local coordinate translation matches the actual padding added on each axis.pytest.skipthat documented this hole and addTestDaskCubicSmallInputcovering Nx1, 1xN, single-chunk 4x4, multi-chunk 4x4, and 8x8 inputs across both dask backends.Backend coverage
Notes
On axes large enough to absorb the full depth=16, behaviour is byte-for-byte identical to before. On clamped axes the IIR boundary transient is larger than float32 epsilon, but this matches the eager kernels, which use no overlap padding at all.
Test plan
pytest xrspatial/tests/test_resample.py xrspatial/tests/test_resample_signature_annot_2544.py xrspatial/tests/test_resample_coverage_2026_05_27.py(261 passed)TestDaskCubicSmallInputcovers Nx1, 1xN, single-chunk 4x4, multi-chunk 4x4, and 8x8 inputsCloses #2547