Skip to content

rasterize: validate resolution= shape and element type (#2576)#2586

Merged
brendancol merged 4 commits into
mainfrom
issue-2576
May 29, 2026
Merged

rasterize: validate resolution= shape and element type (#2576)#2586
brendancol merged 4 commits into
mainfrom
issue-2576

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Summary

  • Tighten resolution= validation in rasterize() so the parameter only accepts a scalar number or a length-2 sequence of numbers.
  • Anything else (length-0/1/3+ sequences, strings, dicts, bools, non-numeric elements) now raises a clean ValueError naming the offending input.
  • Preserves the existing finite/positive checks.

Before this fix, (1, 2, 3) was silently truncated, (1,) crashed with IndexError, 'foo' leaked a raw float() conversion error, and {'x': 1, 'y': 1} raised KeyError: 0.

Backend coverage: pure input validation that runs before any backend dispatch, so it applies uniformly across numpy, cupy, dask+numpy, dask+cupy.

Test plan

  • New tests in xrspatial/tests/test_rasterize_resolution_validation_2576.py cover the happy paths (scalar int/float, length-2 tuple/list) and the rejection paths (3-tuple, 1-tuple, empty tuple, string, dict, bool, (1.0, None), (1.0, "bar"), plus an error-message-content check).
  • Existing resolution coverage tests in test_rasterize_coverage_2026_05_17.py and test_rasterize_coverage_2026_05_21.py still pass.

Closes #2576

Before this change, the `resolution=` argument to `rasterize()` was
unpacked without checking shape or element types:

- `resolution=(1, 2, 3)` was silently truncated to its first two
  elements (incorrect output, no warning).
- `resolution=(1,)` crashed with `IndexError` from `resolution[1]`.
- `resolution='foo'` crashed with `ValueError: could not convert string
  to float: 'f'` because the string iterates character-by-character into
  `resolution[0]` / `resolution[1]`.
- `resolution={'x': 1, 'y': 1}` crashed with `KeyError: 0`.

Add an explicit guard that accepts a scalar number or a length-2
sequence of numbers, and raises a single `ValueError` naming the
offending input for everything else (length-0/1/3+ sequences, strings,
dicts, bools, non-numeric elements). The existing finite/positive
check is preserved unchanged.

Tests in `test_rasterize_resolution_validation_2576.py` pin the
clean-`ValueError` contract for the rejection paths plus a handful of
happy-path sanity checks. Bumped CHANGELOG.

Closes #2576
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 28, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: rasterize: validate resolution= shape and element type (#2576)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/rasterize.py:3217-3221: the isinstance(resolution, (int, float, tuple, list)) whitelist rejects numpy scalars (np.float32, np.int32, np.int64) and 1-D numpy arrays. Before this PR, resolution=np.array([2.0, 1.0]) and resolution=np.int32(2) worked via duck typing through float(). Numpy-typed scalars from geospatial pipelines are a realistic input. Consider widening the accepted types: add np.number to the scalar check and np.ndarray (with a 1-D / size-2 guard) to the sequence check, or fall back to a numbers.Real-based check.

  • xrspatial/tests/test_rasterize_resolution_validation_2576.py:15: import numpy as np is unused. Drop the import. Once the numpy-scalar acceptance above is applied, add tests for resolution=np.array([2.0, 1.0]) and resolution=np.float32(2.0) to pin that behaviour.

Nits (optional improvements)

  • xrspatial/rasterize.py:3215: the comment says "leaking IndexError (length-1 sequences)". The IndexError actually comes from resolution[1] on a length-1 input. Minor wording polish -- could read "length-1 sequences would crash at resolution[1]" to be precise.

What looks good

  • Validation runs after the explicit-width/height branch, so it does not slow that common path.
  • Error messages name the offending input via {resolution!r}, which is what the finding asked for.
  • bool is rejected before the int check, so resolution=True does not silently produce a 1x1 raster.
  • The finite/positive check is preserved unchanged.
  • Tests cover all six rejection cases from the finding plus happy paths and the error-message-content check.
  • CHANGELOG entry is clear and references the issue.

Checklist

  • Algorithm matches reference/paper (input-validation patch, n/a)
  • All implemented backends produce consistent results (validation runs before dispatch)
  • NaN handling is correct (preserved unchanged)
  • Edge cases are covered by tests
  • Dask chunk boundaries handled correctly (n/a)
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (n/a)
  • README feature matrix updated (n/a, no API change)
  • Docstrings present and accurate

Address review on PR #2586. The initial validator whitelisted only
`int`, `float`, `tuple`, and `list`, which rejected numpy scalars
(`np.float32`, `np.int32`, `np.int64`) and 1-D numpy arrays. Those
inputs worked via duck typing through `float()` before the validation
patch, and numpy-typed scalars are common in geospatial pipelines, so
the whitelist regressed a realistic usage pattern.

Widen the scalar check to `(int, float, np.number)` (excluding `bool`
and `np.bool_`), and widen the sequence check to also accept
`np.ndarray`. A 1-D guard rejects 2-D arrays so e.g. a `(2, 2)` array
does not slip past as a length-2 sequence.

Also tightens the comment on the `IndexError` path to specify that the
crash was at `resolution[1]` on a length-1 input (review nit).

Tests in `test_rasterize_resolution_validation_2576.py`:
- Add happy-path tests for `np.float32(2.0)`, `np.int32(2)`, and
  `np.array([2.0, 1.0])`.
- Add a 2-D array rejection test.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review pass

All findings from the previous review have been addressed in 0d5e827.

Dispositions

  • Suggestion 1 (numpy scalar / ndarray support): Fixed. The scalar check now accepts (int, float, np.number) (excluding bool and np.bool_), and the sequence check accepts (tuple, list, np.ndarray) with a 1-D guard so a 2-D array does not slip past as a length-2 sequence.
  • Suggestion 2 (unused numpy import + numpy-scalar tests): Fixed. numpy is now used by the new tests covering np.float32(2.0), np.int32(2), np.array([2.0, 1.0]), and the 2-D rejection path.
  • Nit (comment wording): Fixed. The comment now says "length-1 sequences would crash at resolution[1]".

Verification

  • 16/16 tests in test_rasterize_resolution_validation_2576.py pass.
  • 7/7 existing resolution=-related tests in test_rasterize_coverage_2026_05_17.py and test_rasterize_coverage_2026_05_21.py still pass.

No new findings on this pass.

@brendancol brendancol merged commit 93e43c8 into main May 29, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rasterize: tighten resolution= shape and type validation

1 participant