GeoTIFF writer silently flattens distinct per-band nodatavals #2514

@brendancol

Describe the bug

The GeoTIFF writer silently flattens distinct per-band nodatavals into a single scalar, corrupting missing-data semantics.

_resolve_nodata_attr in xrspatial/geotiff/_attrs.py:1246 picks the first usable value from attrs["nodatavals"]. The conflict check in xrspatial/geotiff/_validation.py:1000 allows differing per-band values when one band happens to match the selected scalar, and skips the check entirely when only nodatavals is present (no scalar nodata key).
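The behaviour described above can be modelled in a few lines of plain Python. This is a simplified sketch inferred from the description in this issue, not the actual xrspatial code: the names `first_usable` and `lenient_conflict_check` are hypothetical, and the real functions may differ in detail.

```python
import math


def _usable(value):
    """Treat a value as a concrete sentinel if it is neither None nor NaN."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))


def first_usable(nodatavals):
    """Hypothetical model of the flattening: return the first usable entry."""
    for value in nodatavals:
        if _usable(value):
            return value
    return None


def lenient_conflict_check(attrs):
    """Hypothetical model of the current check, as described in this issue.

    It is skipped when no scalar "nodata" key is present, and it raises only
    when every usable nodatavals entry disagrees with the scalar.
    """
    if "nodata" not in attrs:
        return  # check skipped entirely
    usable = [v for v in attrs.get("nodatavals", ()) if _usable(v)]
    if usable and all(v != attrs["nodata"] for v in usable):
        raise ValueError("nodata conflicts with every nodatavals entry")


# Both problem inputs pass the check, and -8888.0 is dropped by the flattening.
lenient_conflict_check({"nodatavals": (-9999.0, -8888.0)})
lenient_conflict_check({"nodata": -9999, "nodatavals": (-9999, -8888)})
flattened = first_usable((-9999.0, -8888.0))
```

Under this model, the second band's sentinel is lost without any error being raised, which matches the reproduction below.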

Reproduction (public API)

import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff

band1 = np.array([[1.0, -9999.0], [3.0, 4.0]], dtype=np.float32)
band2 = np.array([[2.0, -8888.0], [5.0, 6.0]], dtype=np.float32)
data = np.stack([band1, band2])
da = xr.DataArray(
    data,
    dims=("band", "y", "x"),
    coords={"band": [1, 2], "y": [1.0, 0.0], "x": [0.0, 1.0]},
    attrs={"nodatavals": (-9999.0, -8888.0)},
)

to_geotiff(da, "out.tif")
out = open_geotiff("out.tif")
# written nodata: -9999.0
# band 0: NaN at the -9999 cell  (correctly masked)
# band 1: -8888.0 preserved as real data  (sentinel became real data)

The same corruption occurs with attrs={"nodata": -9999, "nodatavals": (-9999, -8888)}: because one band's sentinel happens to match the canonical scalar, the conflict check returns without raising.

Expected behavior

The writer should reject distinct per-band sentinels with a clear error rather than silently dropping one. The repo already documents the correct limitation in xrspatial/geotiff/tests/parity/test_reference.py:627: a TIFF GDAL_NODATA value is file-wide, so distinct per-band sentinels cannot safely round-trip in this representation.
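To illustrate why a single file-wide sentinel cannot represent both bands, here is a pure-Python model of read-side masking. It does not call xrspatial or GDAL; `mask_with_scalar` is a hypothetical helper, and the band values are copied from the reproduction above.

```python
import math


def mask_with_scalar(bands, nodata):
    """Replace every cell equal to the single file-wide sentinel with NaN."""
    return [
        [[math.nan if cell == nodata else cell for cell in row] for row in band]
        for band in bands
    ]


band1 = [[1.0, -9999.0], [3.0, 4.0]]
band2 = [[2.0, -8888.0], [5.0, 6.0]]

# Only one sentinel can be stored, so only one band's missing cell is masked.
masked = mask_with_scalar([band1, band2], nodata=-9999.0)

print(masked[0][0][1])  # masked cell from the first band
print(masked[1][0][1])  # second band's sentinel, still present as data
```

Whichever sentinel is chosen as the file-wide value, the other band's missing cell is left in place as ordinary data, which is why rejecting the write is safer than picking a winner.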

Affected paths

The faulty validation feeds CPU writes in xrspatial/geotiff/_writers/eager.py:472, GPU writes in xrspatial/geotiff/_writers/gpu.py:404, and tiled/VRT-driven writing through the same eager validation path.

Coverage gap

  • xrspatial/geotiff/tests/unit/test_metadata.py:1766 rejects only the case where every tuple member disagrees with the canonical value.
  • xrspatial/geotiff/tests/read/test_nodata.py:248 covers single-valued nodatavals, not distinct per-band sentinels.

Proposed fix

  • Tighten _validation.py so any time nodatavals contains 2+ distinct concrete (non-NaN, non-None) sentinels, validation raises a clear error on write, regardless of whether nodata is also provided or whether one of them happens to equal the flattened scalar.
  • Update _attrs.py:1246 flattening so it does not silently pick a winner on disagreement; the error must surface from validation before the writer sees a flattened scalar.
  • Cover both CPU (_writers/eager.py) and GPU (_writers/gpu.py) paths since they share validation.
  • Add tests for the dangerous case (distinct per-band sentinels, with and without an accompanying scalar nodata key) and confirm the error fires before any file write.
  • Allow the safe cases: a single distinct concrete value (possibly repeated), all-NaN, or Nones mixed with a single concrete value.
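The rule in the bullets above could look roughly like the following. This is a minimal sketch of the proposed check only, not a patch: `distinct_concrete_sentinels` and `validate_nodatavals_for_write` are hypothetical names, `attrs["nodatavals"]` is assumed to be a tuple or list of Python numbers, and the existing scalar-versus-tuple disagreement check is assumed to stay in place alongside it.

```python
import math


def distinct_concrete_sentinels(nodatavals):
    """Return the distinct non-None, non-NaN values, in first-seen order."""
    seen = []
    for value in nodatavals:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        if value not in seen:  # -9999 and -9999.0 compare equal
            seen.append(value)
    return seen


def validate_nodatavals_for_write(attrs):
    """Raise if nodatavals holds 2+ distinct concrete sentinels.

    The check ignores whether a scalar "nodata" key is present, so one band
    matching the scalar no longer hides the disagreement.
    """
    sentinels = distinct_concrete_sentinels(attrs.get("nodatavals", ()))
    if len(sentinels) > 1:
        raise ValueError(
            "GeoTIFF stores a single file-wide nodata value; "
            f"got distinct per-band sentinels {tuple(sentinels)}"
        )
    return sentinels[0] if sentinels else None
```

Running this check before flattening would let the error surface before any file is written, while a repeated single value, all-NaN, and Nones mixed with one concrete value all pass through.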

Labels: bug, geotiff
