Fix bounded dask GREAT_CIRCLE proximity missing antimeridian targets (#3108) #3130

Merged: brendancol merged 4 commits into main from deep-sweep-accuracy-proximity-2026-06-09 on Jun 10, 2026

@brendancol (Contributor)

Closes #3108.

Bounded proximity/allocation/direction with distance_metric='GREAT_CIRCLE' returned NaN on dask for targets across the +/-180 seam, while numpy and cupy found them. The map_overlap halo was sized as a linear sum of per-column parallel-arc steps, but great-circle distance is periodic in longitude and its chords shorten toward the poles, so array-space adjacency is not a lower bound on spherical distance.

  • _halo_depth now derives the GREAT_CIRCLE column halo from the chord bound dist >= 2R asin(cos(lat_max) |sin(dlon/2)|), which holds for every pair of grid points.
  • When the seam gap (or the 180-degree chord at the worst-case latitude) is within max_distance, no array-space halo can cover the wrap; the returned depth exceeds the axis length so the existing _fit_halo_to_chunks folds the x axis into a single chunk. This also covers over-pole shortcuts.
  • The row halo stays linear: great-circle distance is never smaller than the meridian separation, so the per-row step sum remains a valid lower bound.
  • Regional rasters away from the seam and poles keep a small finite halo (no needless fold).
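
The two lower bounds named in the list above can be checked numerically. The sketch below is not code from this PR: `haversine`, `chord_bound`, `meridian_bound`, and `bounds_hold` are illustrative names, the radius is assumed to be the 6378137.0 literal the review quotes from the module, and lat_max = 60 is an arbitrary choice for the demonstration.

```python
import math
import random

R = 6378137.0  # metres; assumed to match the literal used in proximity.py


def haversine(lon1, lat1, lon2, lat2, radius=R):
    """Great-circle distance in metres between two lon/lat points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def chord_bound(dlon_deg, lat_max_deg, radius=R):
    """Column bound from the PR text: 2R asin(cos(lat_max) |sin(dlon/2)|)."""
    s = math.cos(math.radians(lat_max_deg)) * abs(math.sin(math.radians(dlon_deg) / 2))
    return 2 * radius * math.asin(s)


def meridian_bound(dlat_deg, radius=R):
    """Row bound: distance is never smaller than the meridian separation."""
    return radius * abs(math.radians(dlat_deg))


def bounds_hold(n=2000, lat_max=60.0, seed=0, tol=1e-6):
    """Check both lower bounds on n random pairs with |lat| <= lat_max."""
    rng = random.Random(seed)
    for _ in range(n):
        lon1, lon2 = rng.uniform(-180, 180), rng.uniform(-180, 180)
        lat1, lat2 = rng.uniform(-lat_max, lat_max), rng.uniform(-lat_max, lat_max)
        d = haversine(lon1, lat1, lon2, lat2)
        if d + tol < chord_bound(lon2 - lon1, lat_max):
            return False
        if d + tol < meridian_bound(lat2 - lat1):
            return False
    return True
```

Because the bound depends on |sin(dlon/2)|, a longitude difference of 359 degrees gives the same bound as 1 degree. That periodicity is what a linear sum of per-column steps cannot express.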

Backend coverage: the fix is in _halo_depth, shared by the bounded dask+numpy and dask+cupy paths. numpy and cupy were already correct (whole-raster brute force).

Test plan:

  • New regression test: global 1-degree raster spanning the antimeridian, all three ops, dask+numpy and dask+cupy match numpy (wrap pixel finite at ~111319.49 m, previously NaN).
  • New unit test: _halo_depth folds on wrap-reachable and pole-adjacent rasters, returns a finite halo for a regional raster.
  • Full test_proximity.py suite: 417 passed on a CUDA host (cupy and dask+cupy parametrizations executed).
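
The ~111319.49 m figure in the regression test is consistent with one degree of arc on the equator at a radius of 6378137.0 m. The check below is a sketch under assumed geometry (a pixel at longitude 179.5 and a target at -179.5, both on the equator); the PR text does not state the test's actual coordinates, and `equatorial_arc` is a hypothetical helper.

```python
import math

R = 6378137.0  # metres; assumed Earth radius


def equatorial_arc(dlon_deg, radius=R):
    """Distance between two equatorial points, taking the short way around."""
    d = abs(dlon_deg) % 360.0
    d = min(d, 360.0 - d)  # wrap across the +/-180 seam if that is shorter
    return radius * math.radians(d)


# 359 degrees apart in array space, 1 degree apart on the sphere
wrap_distance = equatorial_arc(-179.5 - 179.5)  # about 111319.49 m
```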

…3108)

The map_overlap halo was sized as a linear sum of per-column parallel-arc
steps, but great-circle distance is periodic in longitude and its chords
shorten toward the poles, so array-space adjacency is not a lower bound on
spherical distance. Derive the column halo from the chord bound
2R asin(cos(lat_max) |sin(dlon/2)|) and fold the x axis into a single chunk
when the +/-180 seam or the 180-degree chord at worst-case latitude is
within max_distance. Covers dask+numpy and dask+cupy (shared _halo_depth).

@brendancol (Contributor, Author) left a comment

PR Review: Fix bounded dask GREAT_CIRCLE proximity missing antimeridian targets (#3108)

Blockers (must fix before merge)

  • None found. The chord bound is valid for every grid-point pair (both latitudes are bounded by lat_max, and haversine's |sin(dlon/2)| takes the short way around), and the non-fold branch is sound: when seam_gap > dlon_max, every wrap-side separation is at least 360 - span = seam_gap, so only direct separations matter and ceil(dlon_max / min_step) covers them.
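
The fold and non-fold branches described above can be sketched as follows. This is a reconstruction from the review text, not the PR's `_great_circle_col_halo`: the function name, argument order, and use of plain lists are assumptions, and the `width + 1` return value follows the fold convention mentioned later in this review.

```python
import math

R = 6378137.0  # metres; assumed Earth radius


def col_halo_sketch(lon, lat, max_distance, radius=R):
    """Hypothetical column halo in pixels; len(lon) + 1 requests a fold."""
    width = len(lon)
    if width < 2:
        return 0
    span = abs(lon[-1] - lon[0])  # direction-independent
    min_step = min(abs(b - a) for a, b in zip(lon, lon[1:]))
    seam_gap = 360.0 - span  # smallest possible wrap-side separation
    cos_lat = math.cos(math.radians(max(abs(v) for v in lat)))

    # The chord bound peaks at dlon = 180 degrees. If max_distance reaches
    # that peak, no longitude difference can be ruled out, so fold.
    if max_distance >= 2 * radius * math.asin(cos_lat):
        return width + 1

    # Invert the bound: dist <= max_distance implies
    # |sin(dlon/2)| <= sin(max_distance / 2R) / cos(lat_max).
    s = math.sin(max_distance / (2 * radius)) / cos_lat
    dlon_max = math.degrees(2 * math.asin(s))

    # Wrap-side pairs are at least seam_gap apart; if that is reachable, fold.
    if seam_gap <= dlon_max:
        return width + 1

    # Only direct separations matter: cover dlon_max with the smallest step.
    return math.ceil(dlon_max / min_step)
```

For a regional raster (for example 0 to 20 degrees east at 0.5-degree spacing, latitudes 30 to 40, max_distance of 100 km) this yields a small finite halo, while a global 1-degree raster folds.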

Suggestions (should fix, not blocking)

  • xrspatial/tests/test_proximity.py (new test_great_circle_halo_folds_on_wrap_and_pole): all three cases use ascending lon coords. _great_circle_col_halo relies on abs() for the span and np.abs(np.diff(...)) for the step, which should make descending coords work, but nothing pins that. Add a descending-lon case (the module explicitly supports descending monotonic coords; see test_descending_coords_allowed).
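
The direction-independence argument can be illustrated without the library. The sketch below mirrors the abs() span and absolute-difference step in plain Python rather than NumPy; `span_and_min_step` is a hypothetical helper, not a function in the module.

```python
def span_and_min_step(coords):
    """Span and smallest step of a monotonic coordinate list, in either direction."""
    span = abs(coords[-1] - coords[0])
    min_step = min(abs(b - a) for a, b in zip(coords, coords[1:]))
    return span, min_step


ascending = [100.0 + 0.25 * i for i in range(9)]  # 100.0 .. 102.0
descending = ascending[::-1]                      # 102.0 .. 100.0

# Both orderings give the same inputs to the halo computation.
same = span_and_min_step(ascending) == span_and_min_step(descending)
```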

Nits (optional improvements)

  • xrspatial/tests/test_proximity.py: _antimeridian_raster returns (raster, data) but both callers only use raster (the dask test re-reads raster.data). Return just the raster.
  • xrspatial/proximity.py:313: radius = 6378137.0 is the fourth copy of this constant in the module (great_circle_distance default, a comment, and the GPU device function). Consistent with existing style, so optional, but a module-level constant would keep them from drifting.

What looks good

  • The fix lands in _halo_depth, which both bounded dask paths share, so dask+numpy and dask+cupy are covered by one change.
  • Fold is signalled as width + 1, reusing _fit_halo_to_chunks instead of adding a second fold mechanism; a single-chunk axis folds cleanly to depth 0 rather than padding a full chunk width of NaN.
  • The row halo is left linear with a comment explaining why that remains a valid lower bound (meridian separation), so the change is scoped to where the old bound was actually wrong.
  • The regression test asserts the wrap pixel is finite on numpy before comparing, so it cannot pass with both backends returning NaN.
  • 417 tests pass on a CUDA host with the cupy and dask+cupy parametrizations executed.
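
The point about asserting finiteness before comparing can be shown with a small example. It assumes the backend comparison treats NaN as equal to NaN, as array-testing helpers commonly do; `close_equal_nan` is a hypothetical stand-in for that comparison and the values are made up for illustration.

```python
import math


def close_equal_nan(a, b, rel_tol=1e-9):
    """Elementwise comparison in which NaN matches NaN."""
    return all(
        (math.isnan(x) and math.isnan(y)) or math.isclose(x, y, rel_tol=rel_tol)
        for x, y in zip(a, b)
    )


# Hypothetical wrap-pixel values if both backends shared the bug.
reference = [math.nan]
candidate = [math.nan]

passes_without_guard = close_equal_nan(reference, candidate)  # True
guard_ok = all(math.isfinite(v) for v in reference)           # False
```

Without the finiteness guard the comparison alone would succeed even though neither backend found the target.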

Checklist

  • Algorithm matches reference (chord lower bound derived and checked)
  • All implemented backends produce consistent results (verified on CUDA host)
  • NaN handling is correct (unchanged; boundary NaN padding still excluded via isfinite)
  • Edge cases covered (wrap, pole-adjacent, regional no-fold; max_distance=0 yields pad 0 via dlon_max=0)
  • Dask chunk boundaries handled correctly (fold guarantees full-axis visibility)
  • No premature materialization (halo math uses only 1D coord arrays; graph stays lazy)
  • Benchmark exists (benchmarks/benchmarks/proximity.py); no new function added
  • README feature matrix: not applicable (no new function, no backend change)
  • Docstrings present and accurate (_great_circle_col_halo, updated _halo_depth)

@brendancol (Contributor, Author) left a comment

Follow-up review (after 90c9cca)

Delta-only re-review of the follow-up commit:

  • Descending-coordinate coverage added to test_great_circle_halo_folds_on_wrap_and_pole: regional case asserts pad equality with the ascending raster, and a descending wrap raster still folds. Verified _great_circle_col_halo is direction-independent (span via abs(), step via np.abs(np.diff(...))), so the assertions pin real behavior.
  • _antimeridian_raster now returns just the raster; both callers updated.

Disposition of the original findings:

  • Suggestion (descending-lon coverage): fixed.
  • Nit (unused helper return): fixed.
  • Nit (shared Earth-radius constant): dismissed. Unifying the constant would mean editing the public great_circle_distance signature and the CUDA device function, code this PR does not otherwise touch; the new code matches the module's existing style of a local literal.

No new findings. 48 great-circle/halo tests pass locally; flake8 clean on the changed files.

The github-actions bot added the performance label (PR touches performance-sensitive code) on Jun 10, 2026
…roximity-2026-06-09

# Conflicts:
#	.claude/sweep-accuracy-state.csv
@brendancol merged commit f1535ee into main on Jun 10, 2026 (6 of 7 checks passed)

Linked issue: Bounded dask GREAT_CIRCLE proximity misses targets across the antimeridian