open_geotiff(masked=True) falsely masks valid pixels near 64-bit integer sentinels on eager backends #3098

**Describe the bug**

`_apply_eager_nodata_mask` in `xrspatial/geotiff/_attrs.py` promotes integer buffers to float64 and then compares against the sentinel:

```python
arr = arr.astype(np.float64)
mask = arr == np.float64(nodata_int)
```

For int64/uint64 rasters with a sentinel above 2**53, float64 rounding makes nearby valid values compare equal to the sentinel. With `nodata=INT64_MAX`, every value in `[INT64_MAX - 511, INT64_MAX]` (512 values) rounds to 2**63 and gets masked to NaN; `INT64_MAX - 512` is the first value that rounds down and survives. With `UINT64_MAX` the window is 1024 values wide.
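The window widths can be checked with plain NumPy, independent of xrspatial. This is a minimal sketch; `collision_window` is a hypothetical helper written for this issue, not part of the library.

```python
import numpy as np

def collision_window(sentinel, dtype):
    """Count values at or below `sentinel` that equal it after float64 promotion."""
    # Build the 4096 integers ending at the sentinel, at native integer width.
    offsets = np.arange(4096, dtype=dtype)
    values = dtype(sentinel) - offsets
    # Same comparison as the eager path: promote first, then compare.
    collides = values.astype(np.float64) == np.float64(sentinel)
    return int(collides.sum())

i64max = np.iinfo(np.int64).max
u64max = np.iinfo(np.uint64).max
print(collision_window(i64max, np.int64))   # 512
print(collision_window(u64max, np.uint64))  # 1024
```

The counts follow from float64 spacing: 1024 just below 2**63 and 2048 just below 2**64, so half of each gap rounds up to the promoted sentinel.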

The dask chunk path (`_delayed_read_window` in `_backends/dask.py`), the GPU GDS chunk path (`_apply_nodata_mask_gpu` in `_backends/_gpu_helpers.py`), and the VRT path all compute the mask at native integer width before promoting, so they only mask exact sentinel hits. The same file read with `masked=True` therefore gives different results on the eager backends than on dask:

```python
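import numpy as np
import xarray as xr
from xrspatial.geotiff import open_geotiff, to_geotiff  # import path assumed from the module layout

path = 'int64_sentinel.tif'  # any writable location
i64max = np.iinfo(np.int64).max
# Row 0 and the first pixel of row 1 fall inside the float64 rounding window.
data = np.array([[i64max, i64max - 1, i64max - 100],
                 [i64max - 511, i64max - 512, i64max - 513],
                 [1000, 2000, 3000]], dtype=np.int64)
da = xr.DataArray(data, dims=('y', 'x'),
                  coords={'y': [2.5, 1.5, 0.5], 'x': [0.5, 1.5, 2.5]})
to_geotiff(da, path, nodata=i64max, compression='deflate')

open_geotiff(path, masked=True)            # 4 NaNs (3 are valid pixels)
open_geotiff(path, masked=True, chunks=2)  # 1 NaN (correct)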
```

The function is also inconsistent with itself: the `mask_nodata=False` scan a few lines down compares at native width (`arr == arr.dtype.type(nodata_int)`), so `nodata_pixels_present` and the actual masking can disagree about which pixels are nodata.
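The disagreement is easiest to see on a buffer that contains no sentinel pixel at all. The sketch below only mirrors the two comparisons quoted in this issue; it does not call `_apply_eager_nodata_mask`, and the variable names are illustrative.

```python
import numpy as np

i64max = np.iinfo(np.int64).max
# No pixel equals the sentinel, but two sit inside the rounding window.
arr = np.array([i64max - 1, i64max - 100, 1000], dtype=np.int64)

# Native-width scan, as in the mask_nodata=False branch.
present_native = bool((arr == arr.dtype.type(i64max)).any())

# Promote-then-compare, as in the masking branch.
masked_float = arr.astype(np.float64) == np.float64(i64max)

print(present_native)           # False
print(int(masked_float.sum()))  # 2
```

The scan reports no nodata pixels, while the masking comparison would turn two of the three values into NaN.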

The write side already defends against exactly this. `_overview_kernels.py` keeps integer reductions on the numpy path because the sentinel mask has to be computed at native integer width before any float64 promotion (64-bit sentinels like INT64_MAX round when cast), and `tests/write/test_overview.py` pins that behavior for UINT64_MAX. The eager read path contradicts the module's own convention.

**Expected behavior**

Eager masked reads mask only pixels that exactly equal the sentinel at the source dtype's width, matching the dask, GPU-chunked, and VRT read paths. Affected backends: numpy eager and cupy eager (both route through `_apply_eager_nodata_mask`; the GPU eager sites call it via duck-typing from `_finalize_eager_read`).
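One possible shape for the fix, sketched for NumPy integer buffers only: compute the mask at native width, then promote. `mask_nodata_native` is a hypothetical name, and the range guard is an assumption about how an out-of-range sentinel should be handled; the real function also has to cover cupy arrays and non-integer dtypes.

```python
import numpy as np

def mask_nodata_native(arr, nodata_int):
    """Hypothetical helper: mask exact sentinel hits, then promote to float64."""
    info = np.iinfo(arr.dtype)
    if info.min <= nodata_int <= info.max:
        # Compare before any float64 promotion, so only exact hits match.
        mask = arr == arr.dtype.type(nodata_int)
    else:
        # A sentinel the dtype cannot represent cannot match any pixel.
        mask = np.zeros(arr.shape, dtype=bool)
    out = arr.astype(np.float64)
    out[mask] = np.nan
    return out

i64max = np.iinfo(np.int64).max
data = np.array([[i64max, i64max - 1, i64max - 100],
                 [i64max - 511, i64max - 512, i64max - 513],
                 [1000, 2000, 3000]], dtype=np.int64)
out = mask_nodata_native(data, i64max)
print(int(np.isnan(out).sum()))  # 1
```

The promoted values near the sentinel are still rounded by float64, but only the exact sentinel pixel becomes NaN, which matches the chunked result in the reproducer.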

**Additional context**

Found by the accuracy sweep against the geotiff module. Promoting int64 data above 2**53 to float64 is lossy regardless, but the mask should not convert valid pixels to NaN, and the four backends should agree.
