Merged
2 changes: 1 addition & 1 deletion .claude/sweep-api-consistency-state.csv
@@ -3,7 +3,7 @@ focal,2026-05-29,2689,HIGH,1;2;3;4,"Sweep 2026-05-29 (deep-sweep-api-consistency
geotiff,2026-05-18,2106,MEDIUM,3,"Sweep 2026-05-18 (deep-sweep-api-consistency-geotiff-2026-05-18-1779164255). 1 MEDIUM Cat 3 finding fixed in this branch: open_geotiff(max_cloud_bytes=...) was the only kwarg on the public reader/writer surface without a Python type annotation. Docstring already declared ``int or None``; the surface and the docs disagreed. Fix adds ``int | None`` to the annotation; default stays the module-internal _MAX_CLOUD_BYTES_SENTINEL. Regression test in test_open_geotiff_max_cloud_bytes_annot_2106.py pins the immediate gap and parametrises over every public reader/writer to catch future ungenerated annotations. Prior sweep findings (#1922/#1935 kwarg ordering, #2052 mask_nodata parity, #2097 GPU MinIsWhite, #2095 zero-band 3D writes, #1946 write_vrt path/vrt_path shim) all confirmed fixed. Cross-sibling return-type drift (Cat 2): write_vrt returns str while to_geotiff and write_geotiff_gpu return path which is str | BinaryIO -- inspected and still LOW (callers do not substitute writers; the return-type drift is documented in each writer's docstring). Cross-cutting cross-module drift (chunk_size in reproject vs chunks in geotiff; target_crs vs crs) documented but not filed per sweep template (cross-cutting). cuda-validated."
hydro-d8,2026-05-29,2709,HIGH,1;5,"Sweep 2026-05-29 (deep-sweep-api-consistency-hydro-d8-2026-05-29). Scope = the 13 D8-variant files only; dinf/mfd read for reference but not modified. 1 HIGH Cat 1 + 1 MEDIUM Cat 5 fixed in this branch (#2709, PR #2716). HIGH Cat 1: stream_order_d8 named its strahler/shreve selector `ordering` while sibling stream_order_dinf/stream_order_mfd use `method`; both names live in the public API and the __init__.py _StreamOrderDispatch special-cases the drift (translates ordering->method for non-d8). Fix adds `method` as an accepted alias on stream_order_d8 (case-insensitive; takes precedence; conflicting ordering+method raises ValueError), keeping `ordering` working so the out-of-scope dispatcher (passes ordering=) and existing callers are unaffected. Full rename to `method` deferred because deprecating `ordering` would warn on every stream_order(routing='d8') call via the dispatcher I cannot touch in this scope. MEDIUM Cat 5: basins_d8 (watershed_d8.py) is a backward-compat wrapper whose docstring said 'use basin instead' but emitted no warning; added DeprecationWarning(stacklevel=2). Tests added for alias parity/precedence/conflict/case-insensitivity and for the basins_d8 warning. Findings documented but NOT filed per template: (LOW Cat 1 cross-module, out of scope) dinf siblings name the first arg `flow_dir_dinf` (stream_link/flow_path/hand/watershed_dinf) while all D8 funcs use the cleaner `flow_dir`; D8 is the better convention so no D8 change -- the drift lives in the dinf files. (LOW Cat 4 defensive-validation drift) hand_d8 validates np.isfinite(threshold) but stream_link_d8/stream_order_d8 (same threshold: float = 100 param) do not; not user-facing signature surprise, document only. No Cat 2 return drift (every D8 public fn returns xr.DataArray with coords/dims/attrs preserved; Dataset in -> Dataset out via @supports_dataset). No Cat 3 missing-hints beyond fill_d8 z_limit (optional, no hint) which mirrors its sibling style. 
All 13 D8 funcs are re-exported in xrspatial/hydro/__init__.py (no orphan API). cuda-validated: CUDA_AVAILABLE=True on this host; method-alias parity smoke-tested on a cupy DataArray. CI: ubuntu/windows/3.12 GitHub Actions green; macOS-3.14 + ReadTheDocs slow but no failures. NOTE: the /review-pr review comment could not be posted to GitHub (auto-mode permission denial on gh pr review); review findings were applied to code instead (case-insensitive conflict check + str|None hint, commit f8467320)."
polygonize,2026-05-19,2148,HIGH,1;3,"Sweep 2026-05-19 (deep-sweep-api-consistency-polygonize-2026-05-19). 1 MEDIUM Cat 3 finding fixed in this branch (#2148): polygonize() was the only public vector/raster conversion function without a return type annotation. Sieve/contours/rasterize/clip_polygon all declare one. Fix adds a Union return annotation (numpy tuple | awkward tuple | geopandas GeoDataFrame | spatialpandas GeoDataFrame | geojson dict) using TYPE_CHECKING forward refs for optional deps, and expands the docstring Returns section to enumerate the per-return_type shapes. 1 HIGH Cat 1 finding NOT fixed in this PR -- cross-module rename: polygonize uses `connectivity` (int 4|8) while sieve uses `neighborhood` (int 4|8) for the identical rook/queen pixel-connectivity concept. Industry convention (GDAL, rasterio.features.sieve) favours `connectivity`; the deprecation shim belongs in sieve.py, not polygonize, so this is out of scope for the polygonize-scoped sweep branch. Documented here for the next sieve sweep pass. 1 LOW Cat 1 cross-cutting: polygonize/sieve/clip_polygon use `raster` while contours and many older modules use `agg` for the input DataArray -- library-wide drift, not filed per-module per sweep template. Cat 2 return-shape: polygonize returns tuple/GeoDataFrame/dict by return_type; consistent with contours' tuple/GeoDataFrame dispatch. No Cat 4 (no mutable defaults; connectivity=4 default matches sieve neighborhood=4 default). No Cat 5 (polygonize re-exported in xrspatial/__init__.py; no orphan API; no __all__ but consistent with module convention). cuda-validated: cupy backend accepts identical kwargs, smoke-tested with cupy DataArray on host with CUDA_AVAILABLE."
rasterize,2026-05-21,2250,MEDIUM,3,"Sweep 2026-05-21 (deep-sweep-api-consistency-rasterize-2026-05-21). 1 MEDIUM Cat 3 finding fixed in this branch (#2250): rasterize() was missing type annotations on geometries, columns, and merge (3 of 16 public params); the other 13 plus the return type were annotated. The docstring already declared the intended types so this was a doc-vs-signature drift. Fix annotates geometries: Any (because the accepted GeoDataFrame / dask_geopandas / iterable union spans optional deps), columns: Optional[Sequence[str]], merge: Union[str, Callable]. Regression test in test_rasterize_signature_annot_2250.py pins every param + the return annotation so a future contributor can't silently drop annotations again. Cross-module drift documented but not filed per template: clip_polygon(nodata) vs rasterize(fill) same concept different name; clip_polygon(name: Optional[str]=None) vs rasterize(name: str='rasterize') default convention; polygonize(column_name) vs rasterize(column) column selector. No Cat 1 in-module rename, no Cat 2 return drift (returns xr.DataArray as documented), no Cat 4 mutable defaults, no Cat 5 orphan API (rasterize is the only public symbol from the module and is re-exported in __init__). cuda-validated: cupy backend accepts identical kwargs, smoke-tested with use_cuda=True on host with CUDA_AVAILABLE."
rasterize,2026-06-09,3089,HIGH,1,"Sweep 2026-06-09 (deep-sweep-api-consistency-rasterize-2026-06-09). 1 HIGH Cat 1 fixed in this branch (#3089): rasterize(use_cuda=) vs open_geotiff(gpu=) named the identical GPU-backend opt-in differently; these are the only two public entry points with an explicit GPU boolean (no input array to dispatch on; both pair it with chunks= for dask) and both names were live in the public API at once. Fix renames the positional param to gpu (same slot, positional callers unaffected) and appends use_cuda=None as a deprecated alias: DeprecationWarning on use, TypeError when combined with gpu=True. Docstring, GPU merge warning text, CuPy ImportError text, and polygon_clip.py's internal dask+cupy caller updated (guarded so a legacy use_cuda in rasterize_kw does not collide with the new default); all rasterize test call sites migrated to gpu=; regression tests in test_rasterize_gpu_alias_3089.py pin slot position, warning, TypeError, backend parity, and the warning-free clip_polygon path. Re-inspection after the 2026-05-21 pass (#2250); prior cross-module notes (clip_polygon nodata vs fill, name default drift, polygonize column_name vs column) still documented-only. Docstring/signature parity verified programmatically (17/17 params, order matches). New params since last pass (check_crs, max_pixels) consistent with geotiff naming (max_pixels matches geotiff's). No Cat 2/4/5 findings. LOW noted, not fixed (other module's docs): docs/source/user_guide/focal.ipynb claims convolve_2d takes use_cuda, which it does not. cuda-validated: CUDA_AVAILABLE=True; numpy/cupy/dask+numpy/dask+cupy smoke-tested with identical kwargs, values equal."
reproject,2026-05-29,2613,MEDIUM,1,"Sweep 2026-05-29 (deep-sweep-api-consistency-reproject-2026-05-29). 1 MEDIUM Cat 1 finding fixed in this branch (#2613, PR #2626): reproject() spelled the source/target concept two ways in one signature -- source_crs/target_crs (full words) for horizontal CRS but src_vertical_crs/tgt_vertical_crs (abbreviated) for the vertical datum. Renamed the vertical kwargs to source_vertical_crs/target_vertical_crs with a deprecation shim: old names still accepted, emit DeprecationWarning, and passing both old+new for one side raises TypeError. Docstring updated; existing vertical-shift tests migrated to new names; added back-compat + conflict tests. Verified on numpy AND cupy entry points (shared signature; backend dispatch is internal). Other findings documented but NOT filed per template: (LOW Cat 1) itrf_transform(src=/tgt=) uses abbreviated keyword-only names for ITRF frame names vs source_crs/target_crs elsewhere -- separate function family (frames, not CRS), left as-is. (LOW cross-cutting Cat 1) first-arg `raster` (reproject)/`rasters` (merge) vs `agg` in terrain modules -- library-wide drift, not per-module. Prior #1570 vertical_crs EPSG-int collision confirmed still fixed. No Cat 2 return drift (reproject/merge both return DataArray as documented; geoid_height scalar/array and itrf_transform tuple are distinct families). No Cat 4 default drift (resampling/transform_precision/chunk_size/bounds_policy/model defaults consistent across siblings). No Cat 5 orphan API (itrf_frames is list_frames aliased in __all__; vertical/itrf funcs namespaced under xrspatial.reproject like geotiff's funcs). cuda-validated: CUDA_AVAILABLE=True on this host."
resample,2026-05-27,2544,MEDIUM,3,"Sweep 2026-05-27 (deep-sweep-api-consistency-resample-2026-05-27). 1 MEDIUM Cat 3 finding fixed in this branch (#2544): resample() was the only public symbol in xrspatial.resample without type annotations on any parameter or return; siblings slope/aspect/hillshade/curvature all annotate `agg: xr.DataArray` and `-> xr.DataArray`. Fix adds annotations matching the docstring (agg: xr.DataArray; scale_factor / target_resolution: float | tuple[float, float] | None; method: str; nodata: float | None; name: str) and a `-> xr.DataArray` return type, plus a docstring note that the @supports_dataset decorator accepts Dataset too. Regression test test_resample_signature_annot_2544.py pins every param and the return annotation. Other findings documented but not filed per template: (MEDIUM Cat 1 cross-module) `method` (resample) vs `resampling` (reproject/merge) -- same conceptual parameter, different name, cross-cutting rename, needs design issue. (LOW Cat 1 cross-cutting) first-arg `agg` (resample/slope/aspect/...) vs `raster` (reproject/rasterize/polygonize/sieve) -- library-wide drift, not per-module. (LOW Cat 5) ALL_METHODS imported by tests but not in __all__ (module has no __all__); borderline orphan but used for test parametrisation only. No Cat 2 (returns xr.DataArray as documented). No Cat 4 mutable defaults. resample is exported in xrspatial/__init__.py. cuda-validated: cupy backend smoke-tested with nearest, bilinear, and average on host with CUDA_AVAILABLE=True."
slope,2026-05-29,2681,MEDIUM,3,"Sweep 2026-05-29 (deep-sweep-api-consistency-slope-2026-05-29). 1 MEDIUM Cat 3 finding fixed in this branch (#2681, PR #2687): slope() annotated name as `str` while every terrain-family sibling (aspect/northness/eastness in aspect.py, curvature in curvature.py) uses Optional[str]. name flows into xr.DataArray(name=name) which accepts None, so slope(agg, name=None) already worked at runtime -- the annotation was just wrong and inconsistent. Fix widens to Optional[str] and imports Optional (module previously imported only Union). Non-breaking (type-hint widening), no deprecation shim. Added test_name_annotation_matches_terrain_family (pins parity vs the 4 siblings via get_type_hints, unwrapping @supports_dataset) and test_name_none_accepted (slope(agg, name=None).name is None). Full test_slope.py passes (43). No backend logic touched -- numpy/cupy/dask+numpy/dask+cupy paths unchanged; public signature is shared across backends via ArrayTypeFunctionMapping. Other categories: no Cat 1 in-module rename (slope/aspect share identical public param names agg/name/method/z_unit/boundary); no Cat 2 return drift (returns xr.DataArray/Dataset via @supports_dataset, same coords/dims/attrs convention as siblings); no Cat 4 default drift (name/method='planar'/z_unit='meter'/boundary='nan' match across the family); no Cat 5 orphan API (slope re-exported in __init__.py, documented, no __all__ but consistent with module convention). Cross-cutting (documented, not filed per template): first-arg `agg` (slope/aspect/curvature) vs `raster` (reproject/rasterize/polygonize) is library-wide drift. cuda-validated: CUDA_AVAILABLE=True on this host; cupy slope smoke-tested (planar) and signature parity confirmed between numpy and cupy entry points."
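Several rows above mention regression tests that pin type annotations on a public signature. A minimal sketch of that kind of check follows, using only the standard library; `unannotated_params`, `fully_annotated`, and `partly_annotated` are hypothetical names for illustration and are not taken from the xrspatial test suite.

```python
import inspect
from typing import Optional, get_type_hints


def unannotated_params(func):
    """Return the names of parameters that carry no type annotation."""
    sig = inspect.signature(func)
    skipped_kinds = (inspect.Parameter.VAR_POSITIONAL,
                     inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in sig.parameters.items()
        if param.annotation is inspect.Parameter.empty
        and param.kind not in skipped_kinds
    ]


# Hypothetical stand-ins, NOT the real xrspatial functions.
def fully_annotated(agg: list, name: Optional[str] = None) -> list:
    return agg


def partly_annotated(agg: list, columns=None, merge="last") -> list:
    return agg


print(unannotated_params(fully_annotated))   # []
print(unannotated_params(partly_annotated))  # ['columns', 'merge']

# get_type_hints resolves annotations, so sibling functions can be compared.
assert get_type_hints(fully_annotated)["name"] == Optional[str]
```

A test built this way fails as soon as a parameter loses its annotation, which is the drift the Cat 3 findings describe.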
24 changes: 12 additions & 12 deletions benchmarks/benchmarks/rasterize.py
@@ -97,11 +97,11 @@ def setup(self, nx, backend):
         self.bounds = (-180, -90, 180, 90)
         self.width = nx
         self.height = ny
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_polygons(self, nx, backend):
         rasterize(self.pairs, width=self.width, height=self.height,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)


class RasterizeComplexPolygons:
@@ -118,11 +118,11 @@ def setup(self, nx, backend):
         self.bounds = (-180, -90, 180, 90)
         self.width = nx
         self.height = ny
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_complex_polygons(self, nx, backend):
         rasterize(self.pairs, width=self.width, height=self.height,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)


# -------------------------------------------------------------------------
@@ -142,11 +142,11 @@ def setup(self, nx, backend):
         self.bounds = (-180, -90, 180, 90)
         self.width = nx
         self.height = ny
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_lines(self, nx, backend):
         rasterize(self.pairs, width=self.width, height=self.height,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)


# -------------------------------------------------------------------------
@@ -166,11 +166,11 @@ def setup(self, nx, backend):
         self.bounds = (-180, -90, 180, 90)
         self.width = nx
         self.height = ny
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_points(self, nx, backend):
         rasterize(self.pairs, width=self.width, height=self.height,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)


# -------------------------------------------------------------------------
@@ -195,11 +195,11 @@ def setup(self, nx, backend):
         self.bounds = (-180, -90, 180, 90)
         self.width = nx
         self.height = ny
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_mixed(self, nx, backend):
         rasterize(self.pairs, width=self.width, height=self.height,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)


# -------------------------------------------------------------------------
@@ -248,8 +248,8 @@ def setup(self, n_polys, backend):
             vals.append(float(i + 1))
         self.pairs = list(zip(geoms, vals))
         self.bounds = (-180, -90, 180, 90)
-        self.use_cuda = (backend == "cupy")
+        self.gpu = (backend == "cupy")

     def time_rasterize_scaling(self, n_polys, backend):
         rasterize(self.pairs, width=1000, height=500,
-                  bounds=self.bounds, fill=0, use_cuda=self.use_cuda)
+                  bounds=self.bounds, fill=0, gpu=self.gpu)
6 changes: 5 additions & 1 deletion xrspatial/polygon_clip.py
@@ -213,7 +213,11 @@ def clip_polygon(
         rc, cc = raster.data.chunks[-2], raster.data.chunks[-1]
         kw.setdefault('chunks', (rc[0], cc[0]))
         if has_cuda_and_cupy() and is_dask_cupy(raster):
-            kw.setdefault('use_cuda', True)
+            # Respect a legacy ``use_cuda`` passed via rasterize_kw --
+            # defaulting ``gpu`` as well would make rasterize() see both
+            # names and raise.
+            if 'use_cuda' not in kw:
+                kw.setdefault('gpu', True)

     mask = rasterize(geom_pairs, **kw)
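The interaction between the `gpu`/`use_cuda` alias and the guard in this hunk can be sketched as below. This is an illustration under stated assumptions, not the xrspatial implementation: `rasterize_stub` and `default_gpu` are hypothetical names, and the stub only models the behaviour described in the sweep notes (a `DeprecationWarning` when `use_cuda` is used, a `TypeError` when it is combined with `gpu=True`).

```python
import warnings


def rasterize_stub(geoms, gpu=False, use_cuda=None):
    """Hypothetical stand-in for rasterize(); returns the resolved gpu flag."""
    if use_cuda is not None:
        if gpu:
            # Both the new name and the deprecated alias were supplied.
            raise TypeError(
                "pass either 'gpu' or the deprecated 'use_cuda', not both")
        warnings.warn("'use_cuda' is deprecated; use 'gpu' instead",
                      DeprecationWarning, stacklevel=2)
        gpu = bool(use_cuda)
    return bool(gpu)


def default_gpu(kw):
    """Default gpu=True only when the caller passed no legacy use_cuda."""
    if 'use_cuda' not in kw:
        kw.setdefault('gpu', True)
    return kw


# No legacy key: gpu is defaulted and no warning is raised.
print(rasterize_stub([], **default_gpu({})))  # True

# Legacy key present: the guard leaves gpu unset, so only the
# deprecation warning fires instead of a TypeError.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = rasterize_stub([], **default_gpu({'use_cuda': True}))
print(result, [w.category.__name__ for w in caught])
# True ['DeprecationWarning']
```

Without the `'use_cuda' not in kw` check, the second call would receive both `gpu=True` and `use_cuda=True` and raise, which is the collision the comment in the hunk describes.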
