Refresh nodata/nodatavals on resample identity fast path (#2662)#2670
Merged
Conversation
Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
The identity (scale_factor=1.0) and 3D dispatch paths masked sentinel values to NaN but only updated _FillValue, leaving attrs['nodata'] and attrs['nodatavals'] advertising the stale finite sentinel. Factor the non-identity path's refresh into _refresh_nodata_attrs and call it from all three paths so masked-to-NaN output never advertises a stale integer sentinel.
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Refresh nodata/nodatavals on resample identity fast path (#2662)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
xrspatial/resample.pyidentity path (~L1392) and non-identity path (~L1503): the explicit_FillValue = float('nan')line is redundant whenever_FillValueis already in the input attrs, since_refresh_nodata_attrssets it too. It is still needed for the explicit-nodata=-only case where no_FillValueattr exists, so leaving it is correct. A one-line comment noting that would help the next reader, but it is not required.
What looks good
- The fix is symmetric across the three code paths: identity, 3D, and non-identity now all run
if has_nodata: _FillValue=nanfollowed by_refresh_nodata_attrs, so they agree on what the output advertises. - The shared helper reads from the original
src_attrsand writes todst_attrs. That avoids the order-dependence bug you would hit if it read the partially-updated output attrs instead. - Tests hit the regression directly: masked pixel is NaN and no attr advertises the stale sentinel, identity matches the non-identity path, absent attrs stay absent, and the 3D path is covered.
- The cupy/dask+cupy parametrize was dropped with a documented reason. The attr fix is backend-independent, so numpy + dask+numpy is enough.
Checklist
- Algorithm matches intended behavior
- Backends produce consistent results (attr fix is backend-independent)
- NaN handling is correct
- Edge cases covered (absent attrs, 3D path)
- Dask chunk boundaries: not applicable, no map_overlap change
- No premature materialization or unnecessary copies
- Benchmark: not needed, no perf-sensitive change
- README feature matrix: not applicable, no new function
- Docstrings present and accurate
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review (#2662)
The one nit from the first pass is addressed: the identity path now carries a comment explaining that the explicit _FillValue = nan line covers the explicit-nodata=-only case (where the input had no nodata attrs), while _refresh_nodata_attrs handles inputs that declared nodata / nodatavals. Logic is unchanged.
Also merged origin/main into the branch. The merge was clean (no conflicts); git auto-merged resample.py and test_resample.py. The full resample suite passes (222 tests) after the merge.
No new blockers, suggestions, or nits.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2662
What changed
resampleidentity fast path (scale_factor=1.0) and the 3D dispatch path masked nodata sentinels to NaN but refreshed only_FillValue, leavingattrs['nodata']andattrs['nodatavals']pointing at the original finite sentinel. Output advertisednodata=-9999while the data was already NaN._refresh_nodata_attrshelper and called it from all three paths, so masked-to-NaN output never advertises a stale integer sentinel.Backend coverage
The fix is backend-independent attr handling; the mask itself routes through
xarray.where. Regression tests cover numpy and dask+numpy. cupy and dask+cupy masking is exercised by the existing cross-backend nodata coverage.Test plan
TestIdentityNodataMetadataclass covers: identity path refreshes all three attrs, identity matches the non-identity path, absent attrs stay absent, and the 3D path is fixed too.