Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .claude/sweep-api-consistency-state.csv
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ geotiff,2026-05-18,2106,MEDIUM,3,"Sweep 2026-05-18 (deep-sweep-api-consistency-g
hydro-d8,2026-05-29,2709,HIGH,1;5,"Sweep 2026-05-29 (deep-sweep-api-consistency-hydro-d8-2026-05-29). Scope = the 13 D8-variant files only; dinf/mfd read for reference but not modified. 1 HIGH Cat 1 + 1 MEDIUM Cat 5 fixed in this branch (#2709, PR #2716). HIGH Cat 1: stream_order_d8 named its strahler/shreve selector `ordering` while sibling stream_order_dinf/stream_order_mfd use `method`; both names live in the public API and the __init__.py _StreamOrderDispatch special-cases the drift (translates ordering->method for non-d8). Fix adds `method` as an accepted alias on stream_order_d8 (case-insensitive; takes precedence; conflicting ordering+method raises ValueError), keeping `ordering` working so the out-of-scope dispatcher (passes ordering=) and existing callers are unaffected. Full rename to `method` deferred because deprecating `ordering` would warn on every stream_order(routing='d8') call via the dispatcher I cannot touch in this scope. MEDIUM Cat 5: basins_d8 (watershed_d8.py) is a backward-compat wrapper whose docstring said 'use basin instead' but emitted no warning; added DeprecationWarning(stacklevel=2). Tests added for alias parity/precedence/conflict/case-insensitivity and for the basins_d8 warning. Findings documented but NOT filed per template: (LOW Cat 1 cross-module, out of scope) dinf siblings name the first arg `flow_dir_dinf` (stream_link/flow_path/hand/watershed_dinf) while all D8 funcs use the cleaner `flow_dir`; D8 is the better convention so no D8 change -- the drift lives in the dinf files. (LOW Cat 4 defensive-validation drift) hand_d8 validates np.isfinite(threshold) but stream_link_d8/stream_order_d8 (same threshold: float = 100 param) do not; not user-facing signature surprise, document only. No Cat 2 return drift (every D8 public fn returns xr.DataArray with coords/dims/attrs preserved; Dataset in -> Dataset out via @supports_dataset). No Cat 3 missing-hints beyond fill_d8 z_limit (optional, no hint) which mirrors its sibling style. All 13 D8 funcs are re-exported in xrspatial/hydro/__init__.py (no orphan API). cuda-validated: CUDA_AVAILABLE=True on this host; method-alias parity smoke-tested on a cupy DataArray. CI: ubuntu/windows/3.12 GitHub Actions green; macOS-3.14 + ReadTheDocs slow but no failures. NOTE: the /review-pr review comment could not be posted to GitHub (auto-mode permission denial on gh pr review); review findings were applied to code instead (case-insensitive conflict check + str|None hint, commit f8467320)."
polygonize,2026-05-19,2148,HIGH,1;3,"Sweep 2026-05-19 (deep-sweep-api-consistency-polygonize-2026-05-19). 1 MEDIUM Cat 3 finding fixed in this branch (#2148): polygonize() was the only public vector/raster conversion function without a return type annotation. Sieve/contours/rasterize/clip_polygon all declare one. Fix adds a Union return annotation (numpy tuple | awkward tuple | geopandas GeoDataFrame | spatialpandas GeoDataFrame | geojson dict) using TYPE_CHECKING forward refs for optional deps, and expands the docstring Returns section to enumerate the per-return_type shapes. 1 HIGH Cat 1 finding NOT fixed in this PR -- cross-module rename: polygonize uses `connectivity` (int 4|8) while sieve uses `neighborhood` (int 4|8) for the identical rook/queen pixel-connectivity concept. Industry convention (GDAL, rasterio.features.sieve) favours `connectivity`; the deprecation shim belongs in sieve.py, not polygonize, so this is out of scope for the polygonize-scoped sweep branch. Documented here for the next sieve sweep pass. 1 LOW Cat 1 cross-cutting: polygonize/sieve/clip_polygon use `raster` while contours and many older modules use `agg` for the input DataArray -- library-wide drift, not filed per-module per sweep template. Cat 2 return-shape: polygonize returns tuple/GeoDataFrame/dict by return_type; consistent with contours' tuple/GeoDataFrame dispatch. No Cat 4 (no mutable defaults; connectivity=4 default matches sieve neighborhood=4 default). No Cat 5 (polygonize re-exported in xrspatial/__init__.py; no orphan API; no __all__ but consistent with module convention). cuda-validated: cupy backend accepts identical kwargs, smoke-tested with cupy DataArray on host with CUDA_AVAILABLE."
rasterize,2026-05-21,2250,MEDIUM,3,"Sweep 2026-05-21 (deep-sweep-api-consistency-rasterize-2026-05-21). 1 MEDIUM Cat 3 finding fixed in this branch (#2250): rasterize() was missing type annotations on geometries, columns, and merge (3 of 16 public params); the other 13 plus the return type were annotated. The docstring already declared the intended types so this was a doc-vs-signature drift. Fix annotates geometries: Any (because the accepted GeoDataFrame / dask_geopandas / iterable union spans optional deps), columns: Optional[Sequence[str]], merge: Union[str, Callable]. Regression test in test_rasterize_signature_annot_2250.py pins every param + the return annotation so a future contributor can't silently drop annotations again. Cross-module drift documented but not filed per template: clip_polygon(nodata) vs rasterize(fill) same concept different name; clip_polygon(name: Optional[str]=None) vs rasterize(name: str='rasterize') default convention; polygonize(column_name) vs rasterize(column) column selector. No Cat 1 in-module rename, no Cat 2 return drift (returns xr.DataArray as documented), no Cat 4 mutable defaults, no Cat 5 orphan API (rasterize is the only public symbol from the module and is re-exported in __init__). cuda-validated: cupy backend accepts identical kwargs, smoke-tested with use_cuda=True on host with CUDA_AVAILABLE."
reproject,2026-05-29,2613,MEDIUM,1,"Sweep 2026-05-29 (deep-sweep-api-consistency-reproject-2026-05-29). 1 MEDIUM Cat 1 finding fixed in this branch (#2613, PR #2626): reproject() spelled the source/target concept two ways in one signature -- source_crs/target_crs (full words) for horizontal CRS but src_vertical_crs/tgt_vertical_crs (abbreviated) for the vertical datum. Renamed the vertical kwargs to source_vertical_crs/target_vertical_crs with a deprecation shim: old names still accepted, emit DeprecationWarning, and passing both old+new for one side raises TypeError. Docstring updated; existing vertical-shift tests migrated to new names; added back-compat + conflict tests. Verified on numpy AND cupy entry points (shared signature; backend dispatch is internal). Other findings documented but NOT filed per template: (LOW Cat 1) itrf_transform(src=/tgt=) uses abbreviated keyword-only names for ITRF frame names vs source_crs/target_crs elsewhere -- separate function family (frames, not CRS), left as-is. (LOW cross-cutting Cat 1) first-arg `raster` (reproject)/`rasters` (merge) vs `agg` in terrain modules -- library-wide drift, not per-module. Prior #1570 vertical_crs EPSG-int collision confirmed still fixed. No Cat 2 return drift (reproject/merge both return DataArray as documented; geoid_height scalar/array and itrf_transform tuple are distinct families). No Cat 4 default drift (resampling/transform_precision/chunk_size/bounds_policy/model defaults consistent across siblings). No Cat 5 orphan API (itrf_frames is list_frames aliased in __all__; vertical/itrf funcs namespaced under xrspatial.reproject like geotiff's funcs). cuda-validated: CUDA_AVAILABLE=True on this host."
reproject,2026-06-09,3095;3097,HIGH,1;2;3,"Sweep 2026-06-09 (deep-sweep-api-consistency-reproject-2026-06-09). 2 findings filed and fixed: #3095 -> PR #3125, #3097 -> PR #3134 (branches -01/-02 off this one). (HIGH Cat 2, #3095) merge() raises TypeError ('Implicit conversion to a NumPy array is not allowed') on cupy-backed inputs while sibling reproject() supports numpy/cupy/dask+numpy/dask+cupy; crash site _merge_inmemory info['raster'].values (__init__.py:2572); dask-of-cupy fails the same way at compute via _merge_block_adapter -> _reproject_chunk_numpy/np.asarray. _merge.py has a complete _merge_arrays_cupy that is imported in __init__.py:38 but never called (dead GPU plumbing; the unused import alone is lint issue #3083 from the style sweep). Fix: host round-trip on entry (same pattern as _apply_vertical_shift), GPU result out, docstring documents backend handling. (MEDIUM Cat 3, #3097) _vertical.py Returns docstrings claim 'same type as input/height' but geoid_height(DataArray) returns np.ndarray (verified empirically) and the four conversion wrappers return np.float64/np.ndarray; geoid_height converts scalars to Python float but the wrappers do not (sibling scalar-return drift). Docs-only fix. Documented but NOT fixed: (LOW Cat 1) itrf_transform(src=/tgt=) abbreviations vs source_/target_ elsewhere -- prior 2026-05-29 sweep already weighed this and left it as-is (frames, not CRSes); filed #3099 before noticing the prior disposition, then closed it as not-planned to avoid churn. (LOW Cat 5) module docstring 'Public API' section lists only reproject/merge while __all__ exports 10 names (vertical+itrf funcs invisible in help() header; docs/source/reference/reproject.rst autosummary likewise lists only reproject/merge). Cross-cutting, notes only per template: raster/rasters (reproject) vs agg (terrain family) vs source (geotiff); chunk_size (reproject/merge) vs chunks (open_geotiff); resampling+resolution (reproject/merge/accessor) vs method+target_resolution (resample.py -- resample is the outlier, belongs to a resample-module pass, already in resample row's notes). No Cat 4 default drift (resampling='bilinear'/transform_precision=16/chunk_size=None/bounds_policy='auto'/model='EGM96' consistent across siblings). reproject()/merge() kwarg parity smoke-tested on numpy AND cupy DataArrays (merge cupy crash found exactly there). cuda-validated: CUDA_AVAILABLE=True on this host. CI: all GitHub Actions checks green on both PRs; RTD flapped (pending on #3125, fail on #3134 -- repo-wide backlog, change not docs-rendered); PRs left BLOCKED on REVIEW_REQUIRED for the user to merge."
resample,2026-05-27,2544,MEDIUM,3,"Sweep 2026-05-27 (deep-sweep-api-consistency-resample-2026-05-27). 1 MEDIUM Cat 3 finding fixed in this branch (#2544): resample() was the only public symbol in xrspatial.resample without type annotations on any parameter or return; siblings slope/aspect/hillshade/curvature all annotate `agg: xr.DataArray` and `-> xr.DataArray`. Fix adds annotations matching the docstring (agg: xr.DataArray; scale_factor / target_resolution: float | tuple[float, float] | None; method: str; nodata: float | None; name: str) and a `-> xr.DataArray` return type, plus a docstring note that the @supports_dataset decorator accepts Dataset too. Regression test test_resample_signature_annot_2544.py pins every param and the return annotation. Other findings documented but not filed per template: (MEDIUM Cat 1 cross-module) `method` (resample) vs `resampling` (reproject/merge) -- same conceptual parameter, different name, cross-cutting rename, needs design issue. (LOW Cat 1 cross-cutting) first-arg `agg` (resample/slope/aspect/...) vs `raster` (reproject/rasterize/polygonize/sieve) -- library-wide drift, not per-module. (LOW Cat 5) ALL_METHODS imported by tests but not in __all__ (module has no __all__); borderline orphan but used for test parametrisation only. No Cat 2 (returns xr.DataArray as documented). No Cat 4 mutable defaults. resample is exported in xrspatial/__init__.py. cuda-validated: cupy backend smoke-tested with nearest, bilinear, and average on host with CUDA_AVAILABLE=True."
slope,2026-05-29,2681,MEDIUM,3,"Sweep 2026-05-29 (deep-sweep-api-consistency-slope-2026-05-29). 1 MEDIUM Cat 3 finding fixed in this branch (#2681, PR #2687): slope() annotated name as `str` while every terrain-family sibling (aspect/northness/eastness in aspect.py, curvature in curvature.py) uses Optional[str]. name flows into xr.DataArray(name=name) which accepts None, so slope(agg, name=None) already worked at runtime -- the annotation was just wrong and inconsistent. Fix widens to Optional[str] and imports Optional (module previously imported only Union). Non-breaking (type-hint widening), no deprecation shim. Added test_name_annotation_matches_terrain_family (pins parity vs the 4 siblings via get_type_hints, unwrapping @supports_dataset) and test_name_none_accepted (slope(agg, name=None).name is None). Full test_slope.py passes (43). No backend logic touched -- numpy/cupy/dask+numpy/dask+cupy paths unchanged; public signature is shared across backends via ArrayTypeFunctionMapping. Other categories: no Cat 1 in-module rename (slope/aspect share identical public param names agg/name/method/z_unit/boundary); no Cat 2 return drift (returns xr.DataArray/Dataset via @supports_dataset, same coords/dims/attrs convention as siblings); no Cat 4 default drift (name/method='planar'/z_unit='meter'/boundary='nan' match across the family); no Cat 5 orphan API (slope re-exported in __init__.py, documented, no __all__ but consistent with module convention). Cross-cutting (documented, not filed per template): first-arg `agg` (slope/aspect/curvature) vs `raster` (reproject/rasterize/polygonize) is library-wide drift. cuda-validated: CUDA_AVAILABLE=True on this host; cupy slope smoke-tested (planar) and signature parity confirmed between numpy and cupy entry points."
zonal,2026-05-27,2521,HIGH,1;3;5,"Sweep 2026-05-27 (deep-sweep-api-consistency-zonal-2026-05-27). 1 HIGH Cat 1 finding fixed in this branch (#2521): crop() used zones_ids while stats/crosstab use zone_ids -- pure typo creating a TypeError trap when switching between sibling zonal functions. Fix accepts both, deprecates zones_ids with DeprecationWarning, raises if both supplied, raises if neither. All call sites in tests migrated to canonical zone_ids; legacy zones_ids paths covered by new regression tests. Other findings not fixed in this PR: (HIGH Cat 1+4) nodata vs nodata_values drift across stats/crosstab (nodata_values=None) vs apply/hypsometric_integral (nodata=0) -- different name AND different default, breaks substitutability; cross-function scope, needs a design issue. (MEDIUM Cat 3) crosstab docstring says 'layer: int, default=0' but signature is 'Optional[int] = None'. (MEDIUM Cat 3) hypsometric_integral lacks all type annotations; apply and crop lack return type annotations (siblings have them). (MEDIUM Cat 5) get_full_extent has public-style docstring with 'from xrspatial.zonal import get_full_extent' example but is not in __init__.py -- borderline orphan, but minor utility. (LOW Cat 3) apply() docstring mixes 'values' parameter name with 'agg' prose; example returns np.array shape (not DataArray) while function actually returns a DataArray. Cross-cutting: zones/raster as first-arg name varies (zonal.stats uses zones; zonal.regions/trim use raster). Regions/trim are single-array operations on the zone raster itself, so the rename arguably matches the role. Documented, not filed. cuda-validated: CUDA_AVAILABLE=True on this host."
34 changes: 24 additions & 10 deletions xrspatial/reproject/_vertical.py
Original file line number Diff line number Diff line change
Expand Up @@ -212,9 +212,13 @@ def geoid_height(lon, lat, model='EGM96'):

Returns
-------
N : same type as input
N : float or numpy.ndarray
Geoid undulation in metres. Positive means the geoid is above
the ellipsoid.
the ellipsoid. A Python ``float`` when both *lon* and *lat* are
scalars; otherwise a ``numpy.ndarray`` with the same shape as
the inputs. Array-like and ``xr.DataArray`` inputs both come
back as a plain ndarray (coords and attrs are not carried
through).

Examples
--------
Expand Down Expand Up @@ -330,8 +334,10 @@ def ellipsoidal_to_orthometric(height, lon, lat, model='EGM96'):

Returns
-------
H : same type as height
Orthometric height in metres.
H : numpy.ndarray or numpy scalar
Orthometric height in metres. The input is passed through
``np.asarray``, so scalar input returns a numpy scalar and
array-like or ``xr.DataArray`` input returns a plain ndarray.
"""
N = geoid_height(lon, lat, model)
return np.asarray(height) - N
Expand All @@ -353,8 +359,10 @@ def orthometric_to_ellipsoidal(height, lon, lat, model='EGM96'):

Returns
-------
h : same type as height
Ellipsoidal height in metres.
h : numpy.ndarray or numpy scalar
Ellipsoidal height in metres. The input is passed through
``np.asarray``, so scalar input returns a numpy scalar and
array-like or ``xr.DataArray`` input returns a plain ndarray.
"""
N = geoid_height(lon, lat, model)
return np.asarray(height) + N
Expand All @@ -378,8 +386,11 @@ def depth_to_ellipsoidal(depth, lon, lat, model='EGM96'):

Returns
-------
h : same type as depth
Ellipsoidal height in metres (negative below ellipsoid).
h : numpy.ndarray or numpy scalar
Ellipsoidal height in metres (negative below ellipsoid). The
input is passed through ``np.asarray``, so scalar input returns
a numpy scalar and array-like or ``xr.DataArray`` input returns
a plain ndarray.
"""
N = geoid_height(lon, lat, model)
return -np.asarray(depth) + N
Expand All @@ -403,8 +414,11 @@ def ellipsoidal_to_depth(height, lon, lat, model='EGM96'):

Returns
-------
depth : same type as height
Depth below chart datum in metres (positive downward).
depth : numpy.ndarray or numpy scalar
Depth below chart datum in metres (positive downward). The
input is passed through ``np.asarray``, so scalar input returns
a numpy scalar and array-like or ``xr.DataArray`` input returns
a plain ndarray.
"""
N = geoid_height(lon, lat, model)
return N - np.asarray(height)
45 changes: 45 additions & 0 deletions xrspatial/tests/test_reproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -6925,3 +6925,48 @@ def test_source_coords_match_pyproj_for_osgb36(self):
# Guard against the old corruption: coords must be metres, not degrees.
assert np.all(np.abs(src_x) > 1000.0)
assert np.all(np.abs(src_y) > 1000.0)


class TestVerticalReturnTypes:
"""Pin the return types the _vertical.py docstrings describe (#3097).

The Returns sections used to claim "same type as input", which was
wrong for DataArray input (plain ndarray comes back) and for scalar
input to the conversion wrappers (numpy scalar, not Python float).
These tests pin the actual behaviour the docs now state.
"""

def test_geoid_height_scalar_returns_python_float(self):
from xrspatial.reproject import geoid_height
out = geoid_height(-74.0, 40.7)
assert type(out) is float

def test_geoid_height_array_returns_ndarray(self):
from xrspatial.reproject import geoid_height
out = geoid_height(np.array([-74.0, 0.0]), np.array([40.7, 0.0]))
assert type(out) is np.ndarray
assert out.shape == (2,)

def test_geoid_height_dataarray_returns_ndarray(self):
from xrspatial.reproject import geoid_height
lon = xr.DataArray(np.array([-74.0, 0.0]))
lat = xr.DataArray(np.array([40.7, 0.0]))
out = geoid_height(lon, lat)
# Documented: DataArray input comes back as a plain ndarray.
assert type(out) is np.ndarray

def test_conversion_wrappers_return_numpy_types(self):
from xrspatial.reproject import (
depth_to_ellipsoidal,
ellipsoidal_to_depth,
ellipsoidal_to_orthometric,
orthometric_to_ellipsoidal,
)
for func in (ellipsoidal_to_orthometric, orthometric_to_ellipsoidal,
depth_to_ellipsoidal, ellipsoidal_to_depth):
scalar_out = func(100.0, -74.0, 40.7)
assert isinstance(scalar_out, np.floating), func.__name__
arr_out = func(np.array([100.0, 50.0]),
np.array([-74.0, 0.0]), np.array([40.7, 0.0]))
assert type(arr_out) is np.ndarray, func.__name__
assert arr_out.shape == (2,), func.__name__
Loading