polygonize: derive transform from x/y coords so geopandas CRS matches geometry (#2607)#2630
Merged
Merged
Conversation
) polygonize() auto-detected attrs['transform'] and rio.transform() but never the raster's own x/y coordinate values, the standard xarray / xrspatial georeferencing channel. A coords-georeferenced raster with a crs attr produced a geopandas result whose .crs claimed projected space while the geometries stayed in pixel space -- the same misalignment #2536 fixed for the transform attr. _detect_raster_transform() now falls back to deriving a rasterio-ordered 6-tuple from evenly-spaced 1-D x/y coords (origin at the first pixel corner = first_center - 0.5*spacing). The rio identity-affine case now falls through to coords instead of returning None, so coords are used when rioxarray finds no spatial dims. Explicit transform=, attrs[ 'transform'], rio.transform(), and the _xrspatial_no_georef marker keep their precedence. Detection runs before backend dispatch, so all four backends (numpy, cupy, dask+numpy, dask+cupy) behave identically. Adds TestPolygonizeCoordsTransform covering numpy/geopandas output, precedence, irregular/length-1/no-coords fallback, non-y/x dim names, the no-georef marker, and all four backends.
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: polygonize derive transform from x/y coords
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
- The identity-affine fall-through changes behavior for rasters with integer-index coords that used to get pixel-space output. With rioxarray installed, a raster carrying
coords={'y': np.arange(3), 'x': np.arange(3)}and no transform now resolves to a transform with origin -0.5 (the pixel-center to corner convention), so output spans -0.5..2.5 instead of 0..3. That is geometrically correct, but the existingtest_no_transform_info_yields_pixel_coords(xrspatial/tests/test_polygonize.py:1458) now has a misleading name and docstring: arange coords are transform info after this change, and the test only passes because its assertion uses<= 3.0with slack. Worth a one-line note in that test's docstring, or tighten its bounds to document the new -0.5 offset.
Nits (optional improvements)
- The
polygonize()docstring line at xrspatial/polygonize.py:2301 wraps mid-sentence ("This applies to all return") at 88 chars. The wrapping is pre-existing and just shifted by the edit, so it is not a new defect, but it reads awkwardly.
What looks good
- The corner-origin math is right: pixel centers map to cell corners via
first_center - 0.5*spacing, matching the existingattrs['transform']convention. - The
_coord_spacingguard rejects irregular spacing, length-1 dims, and zero-step coords, returning None instead of emitting a wrong affine from the first two points. - Precedence is preserved and tested: explicit
transform=, thenattrs['transform'], thenrio.transform(), then coords. - Uses
dims[-1]/dims[-2]for x/y instead of hardcoding'x'/'y', matchingget_dataarray_resolution; a lat/lon test covers it. - All four backends are tested (numpy, cupy, dask+numpy, dask+cupy). Detection runs before dispatch, so parity is structural.
- Descending y coords (negative spacing) are handled correctly.
Checklist
- Algorithm matches reference/convention (corner-origin transform)
- All implemented backends produce consistent results
- NaN handling unchanged (not touched by this change)
- Edge cases covered by tests (irregular, length-1, no coords, no-georef marker)
- Dask chunk boundaries handled correctly (detection is pre-dispatch)
- No premature materialization or unnecessary copies
- Benchmark not needed (detection is O(n) over a 1-D coord array)
- README feature matrix not applicable (no new function, no backend change)
- Docstrings present and accurate
- test_no_transform_info_yields_pixel_coords now builds a coordless raster instead of one with arange coords, so it exercises the genuinely-no-transform path; arange coords are a transform source since this change. - Reflow the polygonize() docstring line that wrapped mid-sentence.
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review (after fixes)
Both findings from the previous review are addressed.
- Suggestion (test name/docstring):
test_no_transform_info_yields_pixel_coordsnow builds a coordless raster instead of one withnp.arange(3)coords, so it actually exercises the no-transform path. The docstring explains why, since arange coords are a transform source after this change. - Nit (docstring wrap): the
polygonize()docstring paragraph is reflowed; no more mid-sentence break.
Full polygonize suite passes locally: 172 passed, 13 skipped (the 13 skips are optional-dependency / GPU-gated cases not present in CI's matrix slot). No remaining open items.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
polygonize()auto-detected an affine transform fromattrs['transform']andrio.transform()(added in #2536) but never from the raster's own x/y coordinate values, which is the usual way an xarray raster carries its georeferencing. A raster georeferenced through coords alone, plus anattrs['crs'], produced areturn_type='geopandas'result whose.crsclaimed projected space while the geometries stayed in pixel space. It's the same CRS/coordinate mismatch as #2536, just reached through coords instead of the transform attr._detect_raster_transform()now falls back to deriving a rasterio-ordered transform from evenly-spaced 1-D x/y coords. The origin sits at the first pixel corner (first_center - 0.5 * spacing) so the pixel-corner indices from_followmap to cell corners.rio.transform()identity-affine case now falls through to coords instead of returning None, so coords get used when rioxarray does not recognize the dims as spatial (for examplelat/lon).transform=, thenattrs['transform'], thenrio.transform(), then coords. The_xrspatial_no_georefmarker still suppresses auto-detection.Backend coverage
Transform detection runs in the public
polygonize()before backend dispatch, so all four backends (numpy, cupy, dask+numpy, dask+cupy) behave the same way. Tests cover all four.Test plan
.crsagrees with geometry coordinatesattrs['transform']and explicittransform=still take precedencey/xdim names (lat/lon) georeferenced from coords_xrspatial_no_georefmarker suppressesCloses #2607