Describe the bug
reproject() silently corrupts integer rasters when the caller doesn't pass an explicit nodata. _detect_nodata() defaults to np.nan, but the integer output path rounds, clips, and casts back to the input integer dtype. NaN doesn't survive that cast, so out-of-bounds pixels become 0 in the array while attrs['nodata'] still says nan. Anything downstream reading that array treats those zero pixels as real data.
Repro
import numpy as np
import xarray as xr
from xrspatial.reproject import reproject
data = np.full((64, 64), 100, dtype=np.int16)
y = np.linspace(60, 50, 64)
x = np.linspace(-10, 10, 64)
raster = xr.DataArray(
    data, dims=['y', 'x'], coords={'y': y, 'x': x},
    attrs={'crs': 'EPSG:4326'},
)
result = reproject(raster, 'EPSG:32633')
print(result.dtype) # int16
print(result.attrs['nodata']) # nan
print(np.unique(result.values)) # [0 100]
print((result.values == 0).sum()) # 435, all of these are nodata
You also get a RuntimeWarning: invalid value encountered in cast, which is the only signal anything is off.
Where this lives
xrspatial/reproject/_crs_utils.py:157: _detect_nodata() returns float('nan') as the default.
xrspatial/reproject/__init__.py:326-329: integer output is rounded, clipped, and cast back, which collapses NaN to 0.
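The collapse can be seen without xrspatial at all. The snippet below is a minimal NumPy-only sketch of a round/clip/cast sequence like the one described above; it is not the actual code at those lines, and the array values are made up for illustration. What NaN turns into after the cast is platform-dependent (it was 0 in the repro above), so the only thing the sketch relies on is that the NaN is gone afterwards.

```python
import warnings

import numpy as np

# Float working buffer with one out-of-bounds pixel marked as NaN
# (illustrative values, not taken from the real reproject() internals).
out = np.array([100.4, np.nan, 99.6], dtype=np.float64)

info = np.iinfo(np.int16)
with warnings.catch_warnings():
    # The cast emits "RuntimeWarning: invalid value encountered in cast".
    warnings.simplefilter("ignore", RuntimeWarning)
    as_int = np.clip(np.round(out), info.min, info.max).astype(np.int16)

# The NaN slot now holds an ordinary integer, so it can no longer be
# told apart from real data by comparing against a NaN nodata attribute.
print(as_int.dtype, as_int)
```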
Expected behavior
For integer output dtypes with no user-supplied nodata, pick a sensible integer sentinel (dtype min for signed, dtype max for unsigned, following rasterio/GDAL convention) and make sure attrs['nodata'] matches what's actually in the array. The sentinel needs to flow through the worker fill, attrs['nodata'], and the merge paths.
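A rough sketch of what that could look like, assuming a hypothetical helper named default_integer_nodata() (the name and where it would live are placeholders, not existing xrspatial API). It picks the sentinel from the dtype and fills invalid pixels before the integer cast instead of relying on NaN surviving it.

```python
import numpy as np


def default_integer_nodata(dtype):
    """Hypothetical helper: default nodata for a given output dtype.

    Signed integers get the dtype minimum, unsigned integers get the
    dtype maximum, everything else keeps NaN.
    """
    dtype = np.dtype(dtype)
    if not np.issubdtype(dtype, np.integer):
        return float('nan')
    info = np.iinfo(dtype)
    if np.issubdtype(dtype, np.signedinteger):
        return int(info.min)
    return int(info.max)


# Illustrative float buffer with one out-of-bounds pixel (NaN).
out = np.array([100.4, np.nan, 99.6], dtype=np.float64)
target = np.dtype(np.int16)
info = np.iinfo(target)
sentinel = default_integer_nodata(target)

# Start from an all-sentinel array and only write the valid pixels,
# so NaN never reaches the integer cast.
valid = ~np.isnan(out)
result = np.full(out.shape, sentinel, dtype=target)
result[valid] = np.clip(np.round(out[valid]), info.min, info.max).astype(target)

# This is the value attrs['nodata'] would need to carry.
print(result, sentinel)
```

One caveat with this approach: real data that clips to the dtype minimum (or maximum for unsigned) becomes indistinguishable from the sentinel, so the fix may want to document that.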
Environment
- xarray-spatial main
- numpy, xarray, pyproj installed