Skip to content

resample: clamp dask interp overlap depth per axis (#2547)#2599

Merged
brendancol merged 2 commits into
mainfrom
issue-2547
May 29, 2026
Merged

resample: clamp dask interp overlap depth per axis (#2547)#2599
brendancol merged 2 commits into
mainfrom
issue-2547

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Summary

  • Clamp the cubic prefilter overlap depth per axis in _run_dask_numpy and _run_dask_cupy so dask's overlap() no longer rejects arrays smaller than depth=16.
  • Thread per-axis depths (depth_y, depth_x) through _interp_block_np / _interp_block_cupy so local coordinate translation matches the actual padding added on each axis.
  • Remove the pytest.skip that documented this hole and add TestDaskCubicSmallInput covering Nx1, 1xN, single-chunk 4x4, multi-chunk 4x4, and 8x8 inputs across both dask backends.

Backend coverage

  • numpy: unchanged (eager path)
  • cupy: unchanged (eager path)
  • dask+numpy: fixed (clamp + per-axis depth)
  • dask+cupy: fixed (clamp + per-axis depth)

Notes

On axes large enough to absorb the full depth=16, behaviour is byte-for-byte identical to before. On clamped axes the IIR boundary transient is larger than float32 epsilon, but this matches the eager kernels, which use no overlap padding at all.

Test plan

  • pytest xrspatial/tests/test_resample.py xrspatial/tests/test_resample_signature_annot_2544.py xrspatial/tests/test_resample_coverage_2026_05_27.py (261 passed)
  • Original repro from issue resample: cubic on dask backends fails for arrays smaller than depth=16 #2547 returns the same values as the eager numpy backend
  • TestDaskCubicSmallInput covers Nx1, 1xN, single-chunk 4x4, multi-chunk 4x4, and 8x8 inputs

Closes #2547

The cubic dask path passed a fixed depth of 16 to dask.overlap, which
rejects any axis whose total size is below the depth. Inputs like an
Nx1 column or a 4x4 array crashed with `ValueError: The overlapping
depth 16 is larger than your array N`.

Clamp the per-axis depth to `min(depth, max(0, axis_total - 1))` in
both `_run_dask_numpy` and `_run_dask_cupy` and thread the per-axis
values through `_interp_block_np` / `_interp_block_cupy`. Eager
backends are untouched. On axes large enough to absorb depth=16 the
behaviour is unchanged; on smaller axes the IIR boundary transient
is larger than float32 epsilon, which matches the eager kernels that
do not pad at all.

Removes the `pytest.skip` at test_resample_coverage_2026_05_27.py:75
and adds TestDaskCubicSmallInput covering Nx1, 1xN, single-chunk 4x4,
multi-chunk 4x4, and 8x8 inputs across dask+numpy and dask+cupy.

Closes #2547

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: resample: clamp dask interp overlap depth per axis (#2547)

Blockers (must fix before merge)

  • None.

Suggestions (should fix, not blocking)

  • None.

Nits (optional improvements)

  • xrspatial/resample.py:34-40: the _INTERP_DEPTH block-comment still claims the cubic dask path uses depth=16 unconditionally ("Depth 16 puts the residual at ~7e-10"). After this PR the dask path may clamp depth below 16 on small inputs. A one-line addendum that points at the new clamp in _run_dask_numpy would keep the comment honest. The constant itself is unchanged so the comment is not strictly wrong.
  • xrspatial/tests/test_resample_coverage_2026_05_27.py ((4, 4), (2, 2)) case: after _ensure_min_chunksize runs with min_size=7, a 4x4 input is rechunked into a single 4x4 chunk, so this parametrization effectively duplicates the ((4, 4), (4, 4)) single-chunk case rather than exercising the multi-chunk path. Not a defect; just noting the test does not cover what its name implies. A larger multi-chunk shape (for example ((16, 16), (4, 4))) would actually keep multiple chunks after _ensure_min_chunksize.

What looks good

  • The clamp formula min(depth, max(0, axis_total - 1)) is correct: it leaves depth unchanged whenever the axis is large enough and only reduces it for small inputs. Dask's overlap() accepts depth <= sum(chunks), so staying one below total is conservative without being lossy in the common case.
  • min_size = 2 * max(depth_y, depth_x) + 1 matches the prior contract: at least one chunk wide enough to hold the kernel's reach on either side. Memory cost is the same or lower than before.
  • The per-axis depth is threaded through _interp_block_np / _interp_block_cupy correctly. Local coordinate translation (iy - (cum_in_y[yi] - depth_y)) uses the same axis-specific depth that was passed to dask.overlap, so coordinates stay consistent.
  • The if depth_y > 0 or depth_x > 0 guard preserves the existing fast-path for nearest/bilinear (depth=1 always positive, so unchanged) and still skips the overlap call when both axes degenerate to depth 0.
  • Test coverage is appropriate: Nx1, 1xN, single-chunk 4x4, and 8x8 each exercise a distinct edge case (one-axis clamp, both-axes clamp, no overlap needed on one axis).
  • The skip removal at test_resample_coverage_2026_05_27.py:75 correctly enables all 8 method x 4 backend cases for the Nx1 column, which now pass.

Checklist

  • Algorithm matches reference (cubic prefilter behaviour unchanged on large inputs)
  • All implemented backends produce consistent results (numpy reference matches dask within float32 tolerance on small inputs; bit-for-bit on 8x1)
  • NaN handling is correct (unchanged code paths)
  • Edge cases are covered by tests (Nx1, 1xN, single-chunk, multi-chunk, depth<chunk)
  • Dask chunk boundaries handled correctly (per-axis depth matches overlap padding)
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (pure bug fix)
  • README feature matrix updated (n/a, no API change)
  • Docstrings present and accurate (no public API change)

…t case (#2547)

- Note in the _INTERP_DEPTH comment that the dask drivers clamp depth
  per axis on small inputs, so readers do not assume depth=16 is the
  active value in every dask call.
- Drop the ((4, 4), (2, 2)) parametrization in TestDaskCubicSmallInput.
  _ensure_min_chunksize collapses the clamped axis to a single chunk,
  so that case was a duplicate of the ((4, 4), (4, 4)) single-chunk
  case. Add a comment explaining why user-supplied multi-chunk layouts
  on small axes fold to a single chunk.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review

Both nits from the previous review are addressed in fc00423:

  • _INTERP_DEPTH comment now mentions the per-axis clamp so readers do not assume depth=16 is the active value in every dask call.
  • Dropped the redundant ((4, 4), (2, 2)) parametrization. _ensure_min_chunksize collapses any clamped axis to a single chunk (its min_size is 2 * (axis_total - 1) + 1 > axis_total), so a multi-chunk user layout on a small axis is indistinguishable from the single-chunk case. Added a comment explaining this so the parametrization is honest about what it tests.

No new findings. The test suite still passes (259 tests).

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 28, 2026
@brendancol brendancol merged commit 5fc9dfd into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

resample: cubic on dask backends fails for arrays smaller than depth=16

1 participant