reproject: account for datum shifts in output bounds estimation#2655
Merged
Conversation
Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
transform_points() ran its Numba projection kernels in WGS84 and skipped datum shifts, but the per-pixel data path applies a Helmert shift for the same CRS pairs. Output bounds estimated without the shift disagreed with the reprojected data by the shift magnitude -- tens to over a hundred metres for NAD27 (EPSG:4267) in CONUS, which is many pixels on a high-resolution raster. Detect datum-shift pairs via _get_datum_params and bail to pyproj, which applies the shift correctly. The fast path is unchanged for WGS84/NAD83 pairs that need no shift. Add TestDatumShiftBounds covering the fast-path bail, the no-shift fast path, boundary agreement with pyproj, and end-to-end grid bounds for a high-resolution NAD27 raster.
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: reproject: account for datum shifts in output bounds estimation
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
xrspatial/reproject/_projections.py:2128: the bail also fires when source and target share the same datum-shift datum (e.g. NAD27 to a NAD27-based projected CRS), where the net shift cancels and the fast path would have been correct anyway. A few extra pairs go through pyproj for no accuracy gain. This is the safe default and the affected pairs are rare, so leaving it is fine.
What looks good
- Root cause is correct. The fast path skipped the datum shift while the per-pixel data path (
try_numba_transform) applies it, so the bounds and the data disagreed. The fix keys on the same_get_datum_paramspredicate the data path uses, so the two now agree by construction. - The second bounds caller at
__init__.py:1796already has a pyproj fallback right after thetransform_pointscall, so it picks up the fix without changes. - Tests cover the bail, the preserved no-shift fast path, boundary agreement with pyproj, and end-to-end grid bounds for a high-resolution NAD27 raster, with a guard against regressing to the unshifted bounds.
Checklist
- Algorithm matches reference (pyproj is the reference; fast path now defers to it for datum shifts)
- Backend-independent change; no per-backend code paths
- NaN handling unchanged (pyproj fallback already filters non-finite)
- Edge cases covered (both directions, no-shift pairs, high-resolution grid)
- No dask chunk boundary concerns (bounds estimation, not per-pixel)
- No premature materialization or extra copies
- Benchmark not needed (bounds estimation runs on a handful of points)
- README feature matrix not applicable (no new function, no backend change)
- Docstring updated to explain the datum-shift bail
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review (after 89be27b)
The one nit from the previous pass is addressed: added a comment at xrspatial/reproject/_projections.py:2137 noting the bail is intentionally conservative for same-datum pairs (where the shift cancels), trading a rare fast-path miss for guaranteed correctness. Comment-only change, no logic touched.
No new findings. Datum-shift tests still pass (4/4). Nothing blocking.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2649
What
transform_points()ran its Numba projection kernels in WGS84 and skipped datum shifts. The per-pixel data path (try_numba_transform) applies a Helmert shift for the same CRS pairs, so output bounds estimated without the shift disagreed with the reprojected data by the shift magnitude. For NAD27 (EPSG:4267) that is tens to over a hundred metres in CONUS, which is many pixels on a high-resolution raster._get_datum_paramsand bail to pyproj, which applies the shift. The fast path is unchanged for WGS84/NAD83 pairs that need no shift.Backend coverage
transform_pointsis the shared CRS-pair dispatcher used by the bounds estimator across all backends (numpy / cupy / dask+numpy / dask+cupy). The change is backend-independent: it routes datum-shift pairs through pyproj before any per-backend array work, so all four paths get correct bounds.Test plan
transform_pointsreturnsNone(bails to pyproj) for NAD27 <-> Web Mercator, both directions_transform_boundarymatches pyproj for NAD27 (was off by ~45 m in x before)_compute_output_gridbounds for a high-resolution NAD27 raster sit on the datum-shifted corners, not the unshifted ones