Skip to content

Reject non-binary kernels in apply() and focal_stats()#2862

Merged
brendancol merged 4 commits into
mainfrom
issue-2848
Jun 3, 2026
Merged

Reject non-binary kernels in apply() and focal_stats()#2862
brendancol merged 4 commits into
mainfrom
issue-2848

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #2848

apply() and focal_stats() document the kernel as a binary membership mask ("values of 1 indicate the kernel"), but the backends disagreed on what a non-binary value meant:

  • CPU _apply_numpy only kept cells where kernel == 1, so a weight of 2 was dropped.
  • GPU _focal_sum_cuda / _focal_mean_cuda weighted every nonzero cell by its value.

Same kernel, different numbers depending on the backend.

This rejects non-binary kernels in both APIs with a clear ValueError, so all four backends agree. Per-cell weighting still belongs inside the user's func, or in convolve_2d / hotspots, which handle weighted kernels directly and are untouched. The docstrings now state the binary contract.

Backend coverage: numpy, cupy, dask+numpy, dask+cupy.

Test plan:

  • Non-binary kernels raise ValueError on every backend (cross-backend tests).
  • Binary 0/1 kernels still work on every backend.
  • Regression test pins that apply(mean) and focal_stats(mean) agree on a binary kernel.
  • Full test_focal.py passes (214 tests).

The kernel for apply() and focal_stats() is documented as a binary
membership mask ("values of 1 indicate the kernel"), but the backends
disagreed on what a non-binary value meant. The CPU path kept only cells
equal to 1 and dropped a weight of 2 entirely, while the GPU sum/mean
kernels weighted every nonzero cell by its value. The same non-binary
kernel could therefore produce different results depending on the
backend.

Validate the kernel is strictly 0/1 in both APIs and raise a clear
ValueError otherwise. Weighting belongs inside the user's func or in
convolve_2d/hotspots, which handle weighted kernels directly and are
unaffected. Update the docstrings to state the binary contract.

Add cross-backend tests covering rejection of non-binary kernels and
continued acceptance of binary kernels on numpy, cupy, dask+numpy, and
dask+cupy.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 2, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Reject non-binary kernels in apply() and focal_stats()

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • focal.py: a kernel containing NaN also gets rejected by this guard, since np.nan == 0 and np.nan == 1 are both False. That is the right call (NaN is not a valid mask value), but the error message only mentions "0 and 1 values" and does not call out NaN. Not worth a code change; flagging it so the behaviour is on record.

What looks good

  • The fix follows the documented contract ("values of 1 indicate the kernel") instead of inventing weighted semantics, so results are unchanged for any kernel that was already valid. The existing 214 tests pass as-is.
  • Validation sits in one helper called from both apply and focal_stats right after custom_kernel, so all four backends are gated at the same point. convolve_2d and hotspots, which genuinely use weighted kernels, are left alone.
  • np.all((kernel == 0) | (kernel == 1)) handles int, float, and bool kernels. circle_kernel / annulus_kernel emit float 0.0/1.0 and still pass.
  • 3D rasters route through _apply_per_band, which recurses into the public function per band, so the guard fires there too.
  • Tests exercise both APIs on numpy, cupy, dask+numpy, and dask+cupy: non-binary kernels raise, binary kernels still work, and the apply/focal_stats agreement from the issue is pinned.

Checklist

  • Algorithm matches contract: yes (binary mask)
  • All backends consistent: yes (all reject non-binary)
  • NaN handling: NaN kernel cells rejected, consistent with mask semantics
  • Edge cases covered: non-binary, mixed, and fractional kernels tested
  • Dask boundaries: unaffected (validation runs before dispatch)
  • No premature materialization: validation is a single np.all on the kernel
  • Benchmark: not needed (validation-only change)
  • README matrix: not applicable (no new function, no backend change)
  • Docstrings: updated to state the binary contract

The guard already rejects NaN kernel cells (NaN is neither 0 nor 1), but
the error message only mentioned 0 and 1. Mention NaN explicitly and add
a test for the NaN-kernel rejection path.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review

The one nit from the first pass is resolved: the binary-kernel error message now calls out NaN explicitly, and there is a new test (test_apply_rejects_nan_kernel) covering the NaN-kernel rejection path.

No new findings. No blockers or suggestions outstanding. The full focal suite still passes locally.

TestFocalLazy.test_apply and test_focal_stats passed np.ones((3,3))/9, a
weighted kernel that is now rejected by the binary-kernel guard. These
tests only assert laziness, not numerics, so switch them to a binary
np.ones((3,3)) mask. The hotspots laziness tests keep the weighted kernel
since hotspots accepts weighted kernels.
@brendancol brendancol merged commit 3f2538b into main Jun 3, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

apply() and focal_stats() give backend-dependent results on non-binary kernels

1 participant