Defer the pack=True NaN guard into the dask graph so chunks compute once#3237
Conversation
…3235) The no-sentinel integer restore in _pack scanned for NaN with an eager isnull().any(), executing the whole upstream graph at call time; the streaming writer then executed it again for the pixels. Map the guard per chunk into the graph for dask-backed data so it raises from the write's single compute. numpy keeps the eager call-time check. Also update the geotiff performance-sweep state for pass 15.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Defer the pack=True NaN guard into the dask graph so chunks compute once
Blockers (must fix before merge)
- None found.
Suggestions (should fix, not blocking)
-
tests/write/test_pack_lazy_nan_guard_3235.py:50-- the execution-counter test pins the tiled.tifpath only. A.vrtdestination consumes the packed graph through_write_vrt_tiled's delayed tile writes instead of_write_streaming, and nothing covers that the deferred ValueError still surfaces (same message) through_safe_write_tile's collect-and-reraise machinery. One small test writing the NaN no-sentinel dask array to a.vrtpath would close that.
Nits (optional improvements)
-
xrspatial/geotiff/_attrs.py:1823-- the guard is mapped even whenout.dtypeis non-float, where it can only be a no-op (the(data - offset) / scalearithmetic promotes to float in every reachable case). The chunk-leveldtype.kind == 'f'check already makes this safe; this is graph noise in an unreachable case at most. Fine to leave. -
xrspatial/geotiff/_attrs.py:1707--bool(np.isnan(chunk).any())allocates a chunk-sized bool mask per chunk.np.isnan(np.min(chunk))would avoid it, but the mask is chunk/8 bytes and theisnan(...).any()form matches the rest of the module. Consistency wins; noting for completeness only.
What looks good
- The fix lands exactly where the issue points: the eager
isnull().any()is gone for dask input, and the measurement in the PR body (decode executions 32 to 16) matches an independent rerun during this review. meta=out.data._metaon themap_blockscall keeps cupy backing intact and avoids the meta-inference probe that would otherwise execute the guard once extra at graph build.- The counter test threads a counting identity block through the graph rather than parsing fused task names, so it survives scheduler fusion changes.
- The error message is byte-identical between the eager and deferred paths, and
test_pack_dask_nan_no_sentinel_packed_graph_is_lazypins that_packitself stays lazy even when NaN is present. - The docstring claim about atomicity was verified against
_writer.py(mkstemp +os.replace+ unlink on failure) rather than asserted. - The sweep CSV change is a single row with LF endings preserved, avoiding the known CRLF whole-file conflict hazard.
Checklist
- Algorithm matches reference (inverse-scale semantics unchanged; guard placement before round/astype is correct)
- All implemented backends produce consistent results (numpy eager, dask deferred; cupy guard unit-tested, e2e blocked by #3112 upstream)
- NaN handling is correct (
np.isnan, not== np.nan; float-kind gate) - Edge cases covered (clean chunk, NaN chunk, integer chunk, cupy chunk, parity round trip)
- Dask chunk boundaries handled correctly (elementwise map_blocks, no overlap needed)
- No premature materialization or unnecessary copies (that is the point of the PR)
- Benchmark not needed (regression pinned by execution-count test instead)
- README feature matrix not applicable (no new function, no backend change)
- Docstrings present and accurate (error-timing change documented on both
to_geotiffand_pack)
Review follow-up: the guard's ValueError must come back verbatim through _write_vrt_tiled's collect-and-reraise path and leave no tiles dir behind.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after 5798562)
The suggestion from the first pass is addressed: test_pack_dask_nan_no_sentinel_raises_through_vrt_write now covers the .vrt destination, asserting the deferred ValueError comes back verbatim through _write_vrt_tiled's collect-and-reraise path and that neither the .vrt file nor the <stem>_tiles dir is left behind (the assertion names match the writer's actual stem + '_tiles' construction at _writers/eager.py:1264). All 10 tests in the file pass.
The two nits stand as dismissed for the reasons given in the first review: the non-float mapping case is unreachable and already handled at chunk level, and the isnan(...).any() form matches the module's prevailing style.
No new findings. Nothing blocking.
Closes #3235.
_pack's no-sentinel integer restore scanned for NaN with an eagerbool(out.isnull().any()), which executed the entire upstream dask graph atto_geotiff(pack=True)call time. The streaming writer then executed the graph again for the pixels, so every source chunk computed twice. The guard now maps into the graph per chunk (_pack_guard_no_nan) for dask-backed data and raises from the write's single compute. numpy-backed data keeps the eager call-time check.Measured on a 512x512 int16 source with SCALE/OFFSET and no GDAL_NODATA, read with
unpack=True, chunks=128: decode-task executions drop from 32 to 16 (total task starts 71 to 33), round-trip parity unchanged.Error timing for the dask NaN-with-no-sentinel case moves from the
to_geotiffcall to the write's compute; noted into_geotiff'spackdocstring and in_pack. The write stays atomic (temp file plus rename), so a failed pack leaves no partial output at the destination.Backend coverage: dask (CPU) is the affected path and gets the fix. numpy keeps its eager behavior. The guard itself is backend-agnostic (
np.isnandispatches on the array type,meta=preserves cupy backing) and has a cupy-chunk unit test; full dask+cupy pack round trips remain blocked upstream by #3112.Also carries the geotiff performance-sweep state update for pass 15 (one CSV row).
Test plan:
tests/write/test_pack_lazy_nan_guard_3235.py: fusion-proof per-chunk execution counter, eager/lazy round-trip parity, dask raise-at-compute, numpy raise-at-call, guard unit tests on numpy and cupy chunks