Skip to content

reproject: unit-aware bounds_policy="auto" blow-up detection#2600

Merged
brendancol merged 3 commits into
mainfrom
issue-2582
May 28, 2026
Merged

reproject: unit-aware bounds_policy="auto" blow-up detection#2600
brendancol merged 3 commits into
mainfrom
issue-2582

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Summary

Closes #2582. The bounds_policy="auto" heuristic compared source span (in source CRS units) to output span (in target CRS units), so the > 50x ratio fired on almost every geographic-to-projected reprojection and silently cropped valid edge data via the 2/98 percentile fallback. For a (-10, -10, 10, 10) EPSG:4326 bbox going to EPSG:3857, that was 70-106 km per side.

Replaced the unit-mismatched ratio with two unit-agnostic signals:

  • Magnitude ratio: max absolute projected coordinate / median absolute projected coordinate. Benign reprojections stay near 1-2. Real singularities push past 10 (Mercator near the poles after clamp) or far higher (polar-stereographic on the opposite pole projects to ~1e23).
  • Non-finite fraction: share of raw (unclamped) edge samples that produce inf/nan in the target CRS. Catches singularities that the auto clamp would otherwise hide.

Verification

The reproduction case from the issue:

Before After
auto vs raw x crop 106 km 0 m
auto vs raw y crop 70 km 0 m

Global EPSG:4326 -> EPSG:3857 still trips the percentile fallback. The pathological EPSG:4326 -> EPSG:3413 case (south pole into a north polar stereographic) still trips.

Backend coverage

  • numpy: yes
  • cupy: yes (heuristic runs on a small numpy probe, no GPU-specific path)
  • dask + numpy: yes (regression test added)
  • dask + cupy: yes (same as cupy)

Test plan

  • Regression test: 4326 -> 3857 on (-10,-10,10,10) under auto matches raw within one pixel.
  • Benign 4326 -> 3857 emits no bounds_policy warning under auto.
  • Pathological 4326 -> 3413 still trips the auto fallback and warning.
  • Existing 13 TestBoundsPolicy tests pass.
  • Full reproject suite (368 tests) passes.

The old span-ratio heuristic compared source span in source CRS units
(degrees for EPSG:4326) against output span in target CRS units
(metres for EPSG:3857). The unit mismatch made the > 50x threshold
fire on ordinary geographic-to-projected reprojections, silently
trimming 70-106 km per side on a (-10, -10, 10, 10) bbox.

Replace it with a unit-agnostic check: ratio of max absolute
projected coordinate to median absolute projected coordinate.
Benign reprojections stay near 1-2; real singularities push the
ratio past 10 (Mercator at the poles after clamp) or higher
(polar-stereographic opposite pole projects to ~1e23). Also flag
the non-finite fraction of raw, unclamped edge samples so the auto
clamp branch does not mask true singularities.

Adds four regression tests under TestBoundsPolicy covering the bug
case (numpy and dask backends), warning silence on benign input,
and that the polar-stereographic pathological case still trips.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review

Blockers

None.

Suggestions

  • xrspatial/reproject/_grid.py:271-294. The non-finite probe duplicates the edge-sampling pattern from lines 215-226, just with the unclamped source coordinates. A small helper would keep the two samplers in sync if n_edge is ever tuned. Optional given the surgical-changes rule, but flagging it.

Nits

  • xrspatial/reproject/_grid.py:299. The nonfinite_frac > 0.05 threshold is reasonable but unexplained. A one-line comment ("5% means edges hit a singularity; benign cases yield 0%") would help future maintainers understand the choice.
  • xrspatial/reproject/_grid.py:295-298. The magnitude-threshold comment mentions "ratio ~1e16" for the opposite-pole case. Worth also noting that the post-clamp Mercator-global case has a ratio around 5 (caught only by the non-finite probe), so the reader sees why both signals are needed.

What looks good

  • The bug is reproducible before the fix: 106 km x crop, 70 km y crop on (-10, -10, 10, 10) into EPSG:3857.
  • Two-signal design cleanly separates the pathological cases. The magnitude ratio catches finite-but-huge singularities like polar-stereographic on the opposite pole. The non-finite probe catches Mercator at the poles after clamp would otherwise have hidden the issue.
  • Tests cover the regression case, the silent-no-warning case, the polar-stereographic pathological case, and a dask-backed regression.
  • CHANGELOG and docs both updated.
  • All 13 existing TestBoundsPolicy tests pass; full reproject suite of 354 tests passes.
  • The non-finite probe is gated on bounds_policy == "auto" and source_crs.is_geographic so the extra pyproj call only fires when needed.

Checklist

  • Algorithm matches the issue's suggested approach
  • Backends consistent
  • NaN handling correct
  • Edge cases covered
  • Dask test added
  • No premature materialization
  • Benchmark not required
  • README not applicable
  • Docstrings updated

…2582)

- Extract _edge_samples() helper. The non-finite probe was duplicating
  the same 4-edge sampling pattern as the main edge+interior sampler;
  consolidating keeps the two callers in sync if n_edge is ever tuned.
- Add a comment explaining why the auto_blowup signal combines two
  heuristics and why the magnitude case alone is insufficient (global
  4326 -> 3857 has ratio ~5 after the clamp, well under the 10x
  threshold; the non-finite probe is what trips that case).
- Add a comment explaining the 5% non-finite threshold: benign cases
  yield 0%, real singularities yield 15%+, the floor tolerates
  one or two stray inf points from numerical noise.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review

Addressed all three findings from the previous review.

Fixed

  • Suggestion (_grid.py:271-294): extracted _edge_samples() helper; both the main edge sampler and the non-finite probe now share the same edge construction.
  • Nit (_grid.py:299): added a comment explaining why 5% is the non-finite threshold (benign cases yield 0%, real singularities yield 15%+, the floor absorbs numerical noise).
  • Nit (_grid.py:295-298): expanded the magnitude-threshold comment so the reader sees why both signals are necessary. The post-clamp global Mercator case has ratio ~5, well under 10x; the non-finite probe is what catches that case.

All 17 TestBoundsPolicy tests still pass; full reproject suite of 358 tests passes.

@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 28, 2026
@brendancol brendancol merged commit 41a6130 into main May 28, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reproject(bounds_policy="auto") crops ordinary geographic-to-projected reprojections

1 participant