Reason or Problem
ds.xrs.open_geotiff(source) exists on the Dataset accessor (xrspatial/accessor.py:1068) and reads only the pixel window covered by the Dataset's y/x coords. Three gaps:
- There is no DataArray counterpart.
da.xrs.to_geotiff(...) is defined but da.xrs.open_geotiff(...) is not, which is inconsistent and forces a Dataset wrap to use the windowed-read shortcut.
- Neither accessor adapts the read to the caller's backend. A dask+cupy DataArray that calls the accessor receives an eager numpy DataArray back, which then has to be rechunked and moved to GPU manually.
- The windowing path silently assumes the caller's coords are in the file's CRS. If they are not, the computed window is wrong and the read returns the wrong region with no error.
Proposal
Add da.xrs.open_geotiff(source, **kwargs) to the DataArray accessor and enhance both accessors to:
- Infer the caller's backend (numpy / cupy / dask+numpy / dask+cupy) via
_classify_backend and pass matching gpu= / chunks= to xrspatial.geotiff.open_geotiff so the returned DataArray's backend matches self. Caller-supplied gpu= / chunks= always win.
- On CRS mismatch between caller and file, raise a clear
ValueError by default (replacing today's silently-wrong window). With auto_reproject=True, project the caller's bbox to the file's CRS for the windowed read, then reproject the read result back to the caller's CRS via xrspatial.reproject.reproject so the returned DataArray lines up with self.
Design:
- A new private helper in
xrspatial/accessor.py (e.g. _open_geotiff_windowed(obj, source, *, auto_reproject, **kwargs)) holds the shared body. Both accessor methods delegate to it.
- Backend inference uses
_classify_backend(obj) from xrspatial/utils.py:345. Mapping:
numpy -> no override
cupy -> gpu=True
dask+numpy -> chunks=<caller's y-chunk size>
dask+cupy -> gpu=True, chunks=<caller's y-chunk size>
- CRS comes from
attrs['crs'] (int EPSG, the convention established by the geotiff reader). File CRS comes from _read_geo_info(source).crs_epsg.
- The reproject path uses
pyproj.Transformer to convert the caller's bbox to the file CRS for windowing, then xrspatial.reproject.reproject(read_da, target_crs=caller_crs, source_crs=file_crs) for the final result. reproject already supports all four backends, so backend match is preserved.
- The new flag is
auto_reproject: bool = False. Matches the allow_* / auto_* style already used in open_geotiff.
Usage:
import xrspatial # registers .xrs accessor
import xarray as xr
# DataArray-side, backend inferred from caller
da = some_dask_cupy_dataarray # EPSG:4326
out = da.xrs.open_geotiff("aoi.tif") # raises if file CRS != 4326
out = da.xrs.open_geotiff("aoi.tif", auto_reproject=True) # returns dask+cupy in EPSG:4326
# Dataset-side keeps existing behavior; gains the same flag
out = ds.xrs.open_geotiff("aoi.tif", auto_reproject=True)
Value:
- Closes the symmetry gap between
da.xrs.to_geotiff and da.xrs.open_geotiff.
- Makes the windowed-read shortcut backend-correct by default, removing manual rechunk plus
.map_blocks(cupy.asarray) boilerplate at every call site.
- Turns today's silent CRS mismatch (wrong window, no error) into either a loud error or a correct, reprojected result, controlled by a single flag.
Stakeholders and Impacts
Affects xrspatial/accessor.py only (new helper plus two accessor methods). Uses existing public utilities (_classify_backend, _read_geo_info, _extent_to_window, xrspatial.reproject.reproject). No changes to xrspatial/geotiff/ or xrspatial/reproject/. I will implement.
Drawbacks
- Default behavior on CRS mismatch changes from "silently wrong window" to "raises ValueError". This is a behavior change for any caller relying on the wrong-but-silent behavior; the upside is that the new error message points directly at
auto_reproject=True.
- Adds an import-time dependency on
pyproj.Transformer inside the accessor body. The rest of the repo already depends on pyproj via the reproject module, so this is a no-op for dependency scope.
Alternatives
- Leave the DataArray accessor empty and document the Dataset wrap workaround. Rejected: keeps the asymmetry and the silent-CRS-bug.
- Always reproject on mismatch with no flag. Rejected: surprising for callers who explicitly want native-CRS data.
Unresolved Questions
None.
Reason or Problem
ds.xrs.open_geotiff(source)exists on the Dataset accessor (xrspatial/accessor.py:1068) and reads only the pixel window covered by the Dataset'sy/xcoords. Three gaps:da.xrs.to_geotiff(...)is defined butda.xrs.open_geotiff(...)is not, which is inconsistent and forces a Dataset wrap to use the windowed-read shortcut.Proposal
Add
da.xrs.open_geotiff(source, **kwargs)to the DataArray accessor and enhance both accessors to:_classify_backendand pass matchinggpu=/chunks=toxrspatial.geotiff.open_geotiffso the returned DataArray's backend matchesself. Caller-suppliedgpu=/chunks=always win.ValueErrorby default (replacing today's silently-wrong window). Withauto_reproject=True, project the caller's bbox to the file's CRS for the windowed read, then reproject the read result back to the caller's CRS viaxrspatial.reproject.reprojectso the returned DataArray lines up withself.Design:
xrspatial/accessor.py(e.g._open_geotiff_windowed(obj, source, *, auto_reproject, **kwargs)) holds the shared body. Both accessor methods delegate to it._classify_backend(obj)fromxrspatial/utils.py:345. Mapping:numpy-> no overridecupy->gpu=Truedask+numpy->chunks=<caller's y-chunk size>dask+cupy->gpu=True, chunks=<caller's y-chunk size>attrs['crs'](int EPSG, the convention established by the geotiff reader). File CRS comes from_read_geo_info(source).crs_epsg.pyproj.Transformerto convert the caller's bbox to the file CRS for windowing, thenxrspatial.reproject.reproject(read_da, target_crs=caller_crs, source_crs=file_crs)for the final result.reprojectalready supports all four backends, so backend match is preserved.auto_reproject: bool = False. Matches theallow_*/auto_*style already used inopen_geotiff.Usage:
Value:
da.xrs.to_geotiffandda.xrs.open_geotiff..map_blocks(cupy.asarray)boilerplate at every call site.Stakeholders and Impacts
Affects
xrspatial/accessor.pyonly (new helper plus two accessor methods). Uses existing public utilities (_classify_backend,_read_geo_info,_extent_to_window,xrspatial.reproject.reproject). No changes toxrspatial/geotiff/orxrspatial/reproject/. I will implement.Drawbacks
auto_reproject=True.pyproj.Transformerinside the accessor body. The rest of the repo already depends on pyproj via the reproject module, so this is a no-op for dependency scope.Alternatives
Unresolved Questions
None.