polygonize: propagate raster CRS to GeoDataFrame output (#2149) #2154
Merged
polygonize(raster, return_type="geopandas") returned a GeoDataFrame with crs=None even when the input DataArray carried CRS info via attrs["crs"], attrs["crs_wkt"], or rioxarray's rio.crs. Downstream spatial joins, overlays, and file writes silently lost georeferencing.

A new _detect_raster_crs helper mirrors the resolution order in reproject._crs_utils._detect_source_crs (attrs["crs"] first, then attrs["crs_wkt"], then rio.crs) and returns the raw attribute so that GeoDataFrame.set_crs handles parsing. The CRS is detected at the public API level, before backend dispatch, so all four backends (numpy / cupy / dask+numpy / dask+cupy) emit the same CRS. An unparseable CRS attribute is caught so the call never crashes; in that case the GeoDataFrame is returned without a CRS.

spatialpandas does not expose a CRS slot and GeoJSON RFC 7946 is WGS84-only, so propagation lives only on the geopandas path.

8 new tests in TestPolygonizeCRSPropagation cover EPSG string and int attrs, crs_wkt, no-CRS, unparseable CRS, attrs-vs-rioxarray preference, rioxarray-only detection, and interaction with simplify_tolerance.

Also updates .claude/sweep-metadata-state.csv with the 2026-05-19 polygonize audit notes.
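The resolution order described in this commit message can be sketched roughly as follows. This is an illustration only, not the merged _detect_raster_crs code: the function name detect_raster_crs and the SimpleNamespace stand-ins for a DataArray are assumptions, chosen so the snippet runs without xarray or rioxarray installed.

```python
from types import SimpleNamespace


def detect_raster_crs(raster):
    """Return the first raw CRS value found on `raster`, or None.

    Hypothetical sketch of the order attrs["crs"] -> attrs["crs_wkt"] -> rio.crs.
    The value is returned unparsed, so no pyproj import is needed here.
    """
    attrs = getattr(raster, "attrs", None) or {}
    for key in ("crs", "crs_wkt"):
        value = attrs.get(key)
        if value is not None:
            return value
    # The `rio` accessor is only present when rioxarray has been imported,
    # so fall back to None if it is missing.
    rio = getattr(raster, "rio", None)
    return getattr(rio, "crs", None)


# Stand-ins for DataArray objects (assumed shapes, for illustration only).
both = SimpleNamespace(attrs={"crs": "EPSG:4326"},
                       rio=SimpleNamespace(crs="EPSG:32633"))
wkt_only = SimpleNamespace(attrs={"crs_wkt": "GEOGCS[...]"})
rio_only = SimpleNamespace(attrs={}, rio=SimpleNamespace(crs="EPSG:32633"))
no_crs = SimpleNamespace(attrs={})

print(detect_raster_crs(both))      # attrs["crs"] is preferred over rio.crs
print(detect_raster_crs(rio_only))  # falls through to the rio accessor
```

Returning the raw value, rather than a parsed CRS object, keeps the helper free of a hard pyproj dependency and leaves parsing to the consumer.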
Adds a Notes paragraph to the public polygonize() docstring describing the new GeoDataFrame CRS propagation (resolution order, what happens when the value is unparseable, why spatialpandas/geojson return types do not carry CRS). Also corrects the comment in test_crs_prefers_attrs_over_rio: rio.write_crs stores the CRS on a spatial_ref coord, not in attrs['crs']. Review feedback on #2154.
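The unparseable-value behaviour mentioned in the Notes paragraph could look roughly like the guard below. It is a sketch under stated assumptions, not the code from this PR: set_crs_or_skip and FakeFrame are hypothetical names, and FakeFrame only imitates the part of GeoDataFrame.set_crs that matters here (return a new frame, or raise on a bad value).

```python
def set_crs_or_skip(frame, crs):
    """Attach `crs` to `frame`; return `frame` unchanged if that fails.

    Hypothetical helper: `frame` is anything with a GeoDataFrame-like
    `set_crs(crs)` method that returns a new object.
    """
    if crs is None:
        return frame
    try:
        return frame.set_crs(crs)
    except Exception:
        # Deliberately broad in this sketch: a bad CRS attribute should
        # never make the polygonize call itself fail.
        return frame


class FakeFrame:
    """Minimal stand-in for a GeoDataFrame (illustration only)."""

    def __init__(self, crs=None):
        self.crs = crs

    def set_crs(self, crs):
        # Accept only "EPSG:<digits>" strings; reject everything else.
        ok = isinstance(crs, str) and crs.startswith("EPSG:") and crs[5:].isdigit()
        if not ok:
            raise ValueError(f"cannot parse CRS: {crs!r}")
        return FakeFrame(crs=crs)


good = set_crs_or_skip(FakeFrame(), "EPSG:4326")
bad = set_crs_or_skip(FakeFrame(), "not-a-crs")
print(good.crs, bad.crs)
```

With a real GeoDataFrame the failing call would typically raise a pyproj error instead of ValueError; catching narrowly is preferable where the exception type is known.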
Summary
- polygonize(raster, return_type="geopandas") now propagates the raster's CRS so GeoDataFrame.crs is set when the input DataArray carries CRS via attrs["crs"], attrs["crs_wkt"], or rioxarray's rio.crs. Before this change the output GeoDataFrame always had crs=None, silently breaking spatial joins, overlays, reprojections, and file writes.
- _detect_raster_crs helper mirrors the resolution order in reproject._crs_utils._detect_source_crs (no pyproj hard dep).
- CRS detection runs at the public API level so all four backends (numpy / cupy / dask+numpy / dask+cupy) emit identical CRS metadata.
- Unparseable CRS values are swallowed so the call never crashes.

Test plan
- TestPolygonizeCRSPropagation class (8 tests) covers EPSG string, EPSG int, crs_wkt, no-CRS, unparseable CRS, attrs-vs-rioxarray preference, rioxarray-only path, and simplify interaction.
- test_polygonize.py suite passes: 130 passed, 13 skipped (no regressions).

Closes #2149.
Dispatched by /deep-sweep (sweep-metadata, agent worktree agent-ad1070530d37a4fdf).