Skip to content

reproject: dask backend computes float64 for integer rasters when output chunks miss the source footprint #3096

@brendancol

Description

@brendancol

Describe the bug

_reproject_dask builds its map_blocks template with the source's integer dtype (#2505), but the empty-chunk returns are hardcoded to float64:

  • _reproject_chunk_numpy: the non-finite, no-overlap, and oversized-window early returns all do np.full(_empty_shape, nodata, dtype=np.float64)
  • _reproject_chunk_cupy: same, with cp.full
  • _reproject_block_adapter: the footprint-skip path returns np.full(chunk_shape, nodata, dtype=np.float64)

When any output chunk has no overlap with the source footprint, that chunk comes back float64 while the data chunks come back int16/uint8/etc. dask stitches the blocks with type promotion, so the computed array is float64 even though the lazy array advertises the integer dtype.

Repro:

import numpy as np, xarray as xr, dask.array as da
from xrspatial.reproject import reproject

data = (np.random.rand(200, 200) * 1000).astype(np.int16)
r = xr.DataArray(da.from_array(data, chunks=(64, 64)), dims=['y', 'x'],
                 coords={'y': np.linspace(52, 51, 200), 'x': np.linspace(-2, -1, 200)},
                 attrs={'crs': 'EPSG:4326', 'nodata': -32768})
out = reproject(r, 'EPSG:3857', bounds=(-2e6, 5e6, 2e6, 9e6),
                resolution=20000, chunk_size=64)
print(out.dtype)            # int16
print(out.compute().dtype)  # float64

The eager numpy backend returns int16 for the same input, so the backends disagree on output dtype. Values are still correct (the integer sentinel is representable in float64); the dtype contract is what breaks.

Expected behavior

Empty chunks should use the dtype the template advertises: integer source means np.full(shape, nodata, dtype=src_dtype) (the resolved nodata is guaranteed representable for integer rasters per #2185/#2572), float source stays float64.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions