Skip to content

Make dask float polygonize chunk-invariant for rtol#2675

Merged
brendancol merged 4 commits into
mainfrom
issue-2666
May 29, 2026
Merged

Make dask float polygonize chunk-invariant for rtol#2675
brendancol merged 4 commits into
mainfrom
issue-2666

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #2666.

What

  • numpy CCL groups adjacent float pixels with abs(value - reference) <= atol + rtol*abs(reference), where reference is the higher-ij pixel in scan order, so the test is asymmetric in rtol. The dask cross-chunk merge compared unordered (min, max) ranges, so its merge decision flipped with chunk order.
  • Each chunk-boundary polygon now carries its internal-edge pixel values keyed by global (col, row). The cross-chunk merge applies numpy's higher-ij-is-reference orientation at the exact pixel pair straddling a boundary, so a chunked float raster produces the same polygon partition as the unchunked input.
  • Integer rasters keep strict equality (empty cell map, range fallback).

Backends

Float path: numpy / cupy (float routes through numpy), dask+numpy, dask+cupy. The merge change lives in the shared dask path used by both dask+numpy and dask+cupy. Integer behaviour is unchanged.

Known limitation

8-connectivity at large rtol still diverges between numpy and dask. That is a pre-existing ring-merge topology limitation (present on main before this change), not the rtol orientation bug, so it is left for a separate fix.

Test plan

  • pytest xrspatial/tests/test_polygonize_issue_2666.py (29 passed)
  • Full polygonize suite still green (test_polygonize.py, _issue_2583, _issue_2172, coverage files): 291 passed, 16 skipped
  • Fuzz across shapes / chunkings / rtol: 4-connectivity now matches numpy in every case (was 26 failures on main)

Dedupe duplicate module rows (last-write-wins by last_inspected) and
collapse multi-line notes to single physical lines. The notes had
embedded newlines, which the merge=union .gitattributes strategy splits
record-by-record, corrupting the file into a 156-column phantom row on
parallel-agent appends. One line per record keeps union merges safe.
numpy CCL tests adjacent pixels with abs(value - reference) <= atol +
rtol*abs(reference), where reference is the higher-ij pixel in scan
order, so the predicate is asymmetric in rtol. The dask cross-chunk
merge compared unordered (min, max) value ranges, so its merge decision
flipped with chunk order: [[10.0, 11.05]] at atol=0, rtol=0.1 merged
under numpy but split under dask, and reversing the row flipped the
failure.

Each chunk-boundary polygon now carries the values of its pixels on
internal chunk edges, keyed by global (col, row). The cross-chunk merge
applies numpy's higher-ij-is-reference orientation at the exact pixel
pair straddling a boundary, so a chunked float raster produces the same
polygon partition as the unchunked input. Integer rasters keep strict
equality (empty cell map, range fallback).

The remaining 8-connectivity divergence at large rtol is a pre-existing
ring-merge topology limitation, not the rtol orientation bug, and is
left untouched.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Make dask float polygonize chunk-invariant for rtol

Blockers

None.

Suggestions

  • 8-connectivity at large rtol still diverges between numpy and dask (a dense 4x4 float array with rtol=0.1 gives different polygon counts). This is pre-existing on main, not introduced here. Tracing shows it is a ring-merge topology issue, not a close-value orientation issue: it fails identically across chunk shapes that do not even split the relevant pixels, so the cross-chunk boundary logic is not the cause. The PR scopes around it correctly. Worth a follow-up issue so the limitation is tracked outside the PR body.

Nits

  • _ranges_close (polygonize.py:295) and its interval-overlap fallback are now reached only for integer rasters and the defensive empty-cell-map case. For integers the check reduces to equality, so the overlap branch no longer fires in practice. Harmless and pre-existing, but a one-line comment marking it as a fallback path would help the next reader.

What looks good

  • The fix targets the root cause: numpy CCL uses the higher-ij pixel as the rtol reference, and the merge now replicates that orientation at the exact boundary pixel pair instead of comparing unordered ranges.
  • Boundary-cell maps are O(chunk perimeter), keyed by global coords, so the dask path gains no full-array materialization.
  • Integer rasters keep strict equality through the empty cell map and range fallback.
  • NaN boundary cells are skipped during extraction, so they never merge, matching numpy.
  • Tests cover the repro, its reverse, column/row orientation, disconnected close-valued pixels, several chunkings, the default tolerance path, integer invariance, and dask+cupy.

Checklist

  • Algorithm matches numpy CCL scan-order orientation
  • Implemented backends consistent (4-conn float verified; integer unchanged)
  • NaN handling correct
  • Edge cases covered by tests
  • Dask chunk boundaries handled correctly
  • No premature materialization or unnecessary copies
  • Benchmark not needed (bug fix)
  • README/docs unchanged (no public API change)
  • Docstrings updated for the new tuple field and helper

…2666)

- Document that _ranges_close is now the fallback close-value test
  (integers and the empty-cell defensive case), since float merging goes
  through the direction-aware _cells_close_directed.
- Add dask+cupy parity tests for the repro and its reverse, guarded by
  the cuda/cupy availability marker.

Follow-up issue #2677 tracks the pre-existing 8-connectivity divergence
at large rtol, which is a ring-merge topology problem out of scope here.

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after review commits)

Re-reviewed the two follow-up commits.

Disposition of prior findings

  • Suggestion (8-conn divergence at large rtol): deferred to follow-up issue #2677. The fix is a separate ring-merge topology problem, pre-existing on main, and would expand this PR well beyond the rtol orientation bug.
  • Nit (_ranges_close now a fallback path): fixed. Its docstring now states it is the fallback for integers and the empty-cell defensive case, with float merging routed through _cells_close_directed.

New changes

  • Added dask+cupy parity tests for the repro and its reverse, guarded by the cuda/cupy marker. They skip locally without a GPU.

No new blockers, suggestions, or nits. The 4-connectivity rtol parity is complete and the existing suite stays green (47 passed, 3 skipped on the polygonize parity files).

@brendancol brendancol merged commit 9eefbeb into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dask float polygonize is not chunk-invariant for rtol

1 participant