Dask 8-connectivity float polygonize diverges from numpy at large rtol #2677

@brendancol

Describe the bug

With connectivity=8 and a large rtol, polygonize on a dask-backed float raster can produce a different number of polygons (and different DN values) than the same raster as a single numpy chunk. This is separate from the rtol scan-order orientation fixed in #2666 / #2675.

Reproduce

import dask.array as da
import numpy as np
import xarray as xr
from xrspatial import polygonize

arr = np.array([
    [2.098, 2.43,  2.206, 2.09 ],
    [1.847, 2.292, 1.875, 2.784],
    [2.927, 1.767, 2.583, 2.058],
    [2.136, 2.851, 1.142, 1.174],
])
v_np, _ = polygonize(xr.DataArray(arr), atol=0.0, rtol=0.1, connectivity=8)
v_dk, _ = polygonize(
    xr.DataArray(da.from_array(arr, chunks=(2, 2))),
    atol=0.0, rtol=0.1, connectivity=8)
print(len(v_np), len(v_dk))  # 6 vs 4

What I found

This is a ring-merge topology problem, not a boundary close-value problem. The failure is identical across chunk shapes that do not even split the relevant pixels: chunks of (1, 4), (4, 1), and (2, 2) all reproduce it, so the cross-chunk boundary decision is not the cause. The likely culprit is the degree-4 vertex pairing in _pick_next_edge or the figure-8 handling in _merge_polygon_rings, where 8-connected diagonal regions meet.
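
For readers unfamiliar with the figure-8 case, here is a minimal standalone sketch of why diagonal neighbours create degree-4 vertices. It does not use or reproduce xrspatial internals; count_components and pinch_vertices are hypothetical helpers written only for this illustration, and it works on a boolean mask rather than on rtol-grouped floats.

```python
from collections import deque

import numpy as np


def count_components(mask, connectivity):
    """Count connected groups of True pixels using a BFS flood fill."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            # New component: flood-fill everything reachable from (r, c).
            count += 1
            seen[r, c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr, nc] and not seen[nr, nc]):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
    return count


def pinch_vertices(mask):
    """Return interior grid vertices whose surrounding 2x2 block is a checkerboard.

    At such a vertex all four incident pixel edges are boundary edges, so a
    ring tracer sees a degree-4 vertex and must decide how to pair the edges.
    """
    found = []
    rows, cols = mask.shape
    for r in range(1, rows):
        for c in range(1, cols):
            tl, tr = mask[r - 1, c - 1], mask[r - 1, c]
            bl, br = mask[r, c - 1], mask[r, c]
            if tl == br and tr == bl and tl != tr:
                found.append((r, c))
    return found


mask = np.array([[1, 0],
                 [0, 1]], dtype=bool)
print(count_components(mask, 4), count_components(mask, 8), pinch_vertices(mask))
# prints: 2 1 [(1, 1)]
```

Under 4-connectivity the two diagonal pixels are separate regions, while under 8-connectivity they form one region whose outline touches itself at vertex (1, 1). How the edges at that vertex are paired decides whether the outline is emitted as one ring or two, which is the kind of decision this issue suspects.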

#2675 fixes the 4-connectivity rtol case completely and the large majority of 8-connectivity cases; this issue tracks the remaining 8-connectivity divergence at large rtol.

Expected behavior

A chunked 8-connectivity float raster produces the same polygon partition as the unchunked input for the same atol / rtol.
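
One way to express this expectation as a check is sketched below. It assumes the first value returned by polygonize is a flat sequence of DN values, one per polygon, as the reproduction script's len(...) calls suggest. dn_signature and assert_same_dn_partition are hypothetical helper names, and the check compares only polygon count and DN values, not geometry.

```python
import numpy as np


def dn_signature(values, decimals=6):
    """Sorted, rounded DN values; equal signatures mean equal count and DN multiset."""
    return sorted(np.round(np.asarray(values, dtype=float), decimals).tolist())


def assert_same_dn_partition(values_a, values_b, decimals=6):
    """Raise AssertionError if the two polygonize results differ in count or DN values."""
    sig_a = dn_signature(values_a, decimals)
    sig_b = dn_signature(values_b, decimals)
    if sig_a != sig_b:
        raise AssertionError(
            f"polygon sets differ: {len(sig_a)} vs {len(sig_b)} polygons; "
            f"DN values {sig_a} vs {sig_b}"
        )


# Assumed usage with the reproduction script's results:
#     assert_same_dn_partition(v_np, v_dk)
```

A fuller regression test would also compare the polygon geometries, since two partitions can share the same DN multiset while covering different pixels.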

Metadata

Labels: bug (Something isn't working), dask (Dask backend / chunked arrays), severity:high (Sweep finding: HIGH)