polygonize numpy backend emits spurious polygons for NaN pixels #1190

@brendancol

Describe the bug

The numpy backend for polygonize() doesn't mask NaN pixels in float rasters. Each NaN pixel gets its own region and produces a single-pixel polygon with a NaN value.

The CuPy backend handles this correctly: it builds a valid-pixel mask via ~cp.isnan(data), so NaN pixels are assigned region 0 (masked out, no polygon). The numpy and dask backends skip this step. NaN pixels then pass through _is_close, which returns False for every comparison involving NaN, so each NaN pixel becomes its own isolated region.
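The comparison behavior described above can be checked with plain NumPy. This is only an illustration: np.isclose stands in for the internal _is_close helper (an assumption, since the helper's code is not shown here), and the array is the same one used in the reproducer.

```python
import numpy as np

data = np.array([[1.0, np.nan], [np.nan, 1.0]], dtype=np.float64)

# NaN never compares equal to anything, including itself.
nan_equals_itself = bool(np.nan == np.nan)

# np.isclose (a stand-in for _is_close, which is an assumption) also
# reports False for NaN vs. NaN unless equal_nan=True is passed.
nan_close_to_itself = bool(np.isclose(np.nan, np.nan))

# A valid-pixel mask in the style of the CuPy backend: True where the
# pixel holds a real number, False where it is NaN.
valid = ~np.isnan(data)

print(nan_equals_itself, nan_close_to_itself)
print(valid)
```

Because no NaN pixel is ever "close" to a neighbor, region growing cannot merge it with anything, which is consistent with the single-pixel polygons reported here.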

Reproducer

import numpy as np
import xarray as xr
from xrspatial.polygonize import polygonize

data = np.array([[1.0, np.nan], [np.nan, 1.0]], dtype=np.float64)
raster = xr.DataArray(data)
values, polygons = polygonize(raster)
print(values)   # [1.0, nan, nan, 1.0]  -- two bogus NaN polygons
print(len(polygons))  # 4 instead of expected 2

Expected behavior

NaN pixels should be treated as masked (no polygon emitted), matching the CuPy backend. The example above should return 2 polygons (both with value 1.0) with a total area of 2.0, not 4 polygons with a total area of 4.0.

Root cause

_polygonize_numpy passes float arrays straight to _calculate_regions without folding in a NaN mask. Fix: detect float dtype in _polygonize_numpy and merge ~np.isnan(values) into the mask array before calling _calculate_regions, same approach the CuPy backend already uses.
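A minimal sketch of the proposed mask merge, assuming the mask convention is "True means valid pixel" (as the ~cp.isnan(data) description suggests). The helper name fold_nan_into_mask is hypothetical and is not part of xrspatial; the real change would live inside _polygonize_numpy.

```python
import numpy as np


def fold_nan_into_mask(values, mask=None):
    """Hypothetical helper: combine a NaN-validity mask with an optional user mask.

    Assumes True marks a pixel that should be polygonized.
    """
    # Integer and boolean rasters cannot hold NaN, so leave the mask untouched.
    if not np.issubdtype(values.dtype, np.floating):
        return mask

    valid = ~np.isnan(values)
    if mask is None:
        return valid

    # A pixel stays valid only if the user mask and the NaN check both allow it.
    return np.asarray(mask, dtype=bool) & valid


values = np.array([[1.0, np.nan], [np.nan, 1.0]], dtype=np.float64)
merged = fold_nan_into_mask(values)
print(merged)
```

With the reproducer array, only the two 1.0 pixels remain valid, so a region pass that honors the mask would have nothing to emit for the NaN pixels.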
