Forward caller chunking to the coregister/auto_reproject output #3236
Merged
brendancol
commented
Jun 11, 2026
Contributor
Author
PR Review: Forward caller chunking to the coregister/auto_reproject output
Blockers (must fix before merge)
- `xrspatial/accessor.py:323`: `explicit_chunks = kwargs.get('chunks')` captures the raw kwarg and forwards it to `reproject(chunk_size=...)` unconverted. `_validate_chunks_arg` deliberately accepts `np.integer` scalars for the read path and coerces them, but `_compute_chunk_layout` only special-cases builtin `int`, so `chunks=np.int64(2048)` with `coregister=True` now raises `TypeError: cannot unpack non-iterable numpy.int64 object`. Before this PR that call worked (the read coerced it; reproject never saw it). Coerce integer-likes with `int(...)` before forwarding, or run the captured value through `_validate_chunks_arg(..., allow_none=True)`.
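A minimal sketch of the failure mode and the suggested coercion. Both functions are hypothetical stand-ins written for illustration; they are not the real `_compute_chunk_layout` or `_validate_chunks_arg`, and only mimic the behavior described in this comment.

```python
import numpy as np


def layout_builtin_int_only(chunk_size):
    # Hypothetical stand-in: special-cases builtin int only, then
    # assumes anything else is a (rows, cols) pair and unpacks it.
    if isinstance(chunk_size, int):
        return (chunk_size, chunk_size)
    rows, cols = chunk_size  # np.int64 is not iterable, so this raises
    return (int(rows), int(cols))


def coerce_chunks(chunks):
    # Hypothetical coercion step: turn integer-likes into builtin int
    # before forwarding; None passes through untouched.
    if chunks is None:
        return None
    if isinstance(chunks, (int, np.integer)):
        return int(chunks)
    rows, cols = chunks
    return (int(rows), int(cols))


try:
    layout_builtin_int_only(np.int64(2048))
except TypeError as exc:
    print("uncoerced:", exc)

print("coerced:", layout_builtin_int_only(coerce_chunks(np.int64(2048))))
```

The point of the sketch is that `np.int64` is not a subclass of builtin `int` on Python 3, so an `isinstance(..., int)` check misses it and the value falls through to the unpacking branch.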
Suggestions (should fix, not blocking)
- `xrspatial/tests/test_open_geotiff_coregister.py`: no test covers a tuple `chunks=(3, 2)` kwarg flowing through to the reproject output. It is the same forwarding path, but the tuple form is the one `_compute_chunk_layout` unpacks directly, so a one-line assertion would pin it down.
Nits (optional improvements)
- `xrspatial/accessor.py:980-983`: the new Returns sentence says a dask result chunks "like `self`", but the windowed read only infers the y-axis chunk (line 328), so a plain read against a template chunked (3, 2) comes back (3, 3). The exact (y, x) match only holds for the reproject output. Scoping the sentence to the reproject step would keep the docstring honest.
What looks good
- Capturing `explicit_chunks` before the `setdefault` is the right move; reading `kwargs['chunks']` afterward would have conflated user intent with the inferred value.
- The asymmetric (3, 2) chunk test pins the (y, x) ordering, which is exactly the kind of thing that silently transposes.
- Forwarding into both the `coregister` and `auto_reproject` branches fixes the same gap in one pass instead of leaving the second branch for a follow-up. `None` falls through to reproject's existing defaults, so numpy callers without a `chunks=` kwarg see no behavior change.
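The ordering point in the first bullet can be sketched as follows. `resolve_chunks` and its `inferred` parameter are hypothetical names used only for illustration; the real accessor code is not reproduced here.

```python
def resolve_chunks(kwargs, inferred):
    # Capture what the caller actually passed BEFORE filling in a default.
    explicit_chunks = kwargs.get("chunks")
    # After this line kwargs["chunks"] is always set, so it can no longer
    # tell "caller asked for this" apart from "we inferred this".
    kwargs.setdefault("chunks", inferred)
    return explicit_chunks, kwargs["chunks"]


# No chunks= kwarg: explicit stays None, the read uses the inferred value.
print(resolve_chunks({}, inferred=3))
# Explicit chunks=2: both the captured and the effective value are 2.
print(resolve_chunks({"chunks": 2}, inferred=3))
```

Reading `kwargs["chunks"]` after the `setdefault` would return 3 in the first call, which is the inferred value and not something the caller requested.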
Checklist
- Algorithm matches reference/paper (n/a — chunk plumbing only)
- All implemented backends produce consistent results (CPU-only coregister guard unchanged; dask+cupy auto_reproject forwards the same value)
- NaN handling is correct (untouched; nodata_arg path unchanged)
- Edge cases are covered by tests (np.integer and tuple chunks kwarg are not)
- Dask chunk boundaries handled correctly (reproject computes per-chunk source windows; chunk size only changes tiling)
- No premature materialization or unnecessary copies
- Benchmark exists or is not needed (not needed)
- README feature matrix updated (n/a)
- Docstrings present and accurate (modulo the nit above)
brendancol
commented
Jun 11, 2026
Follow-up review (after b4c66f4)
All three findings from the first pass are addressed:
- Blocker (np.integer chunks crash): fixed. The captured kwarg now goes through `_validate_chunks_arg(..., allow_none=True)` at `xrspatial/accessor.py:325`, which coerces `np.integer` scalars to builtin int before the value reaches `_compute_chunk_layout`. Verified `chunks=np.int64(2)` with `coregister=True` returns chunksize (2, 2) in the new test.
- Suggestion (tuple chunks coverage): fixed. `test_coregister_chunks_kwarg_tuple_and_np_integer` covers both the `(row, col)` tuple form and the np scalar form.
- Nit (docstring overstatement): fixed. The Returns sentence now scopes the chunk-matching claim to the reproject step.
One side effect worth noting, not a problem: invalid chunks values (e.g. a string) now raise from the accessor before the read starts, instead of from inside the read. The error message is identical because both call the same validator, so callers just see it earlier.
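A rough sketch of that ordering change, under stated assumptions: `validate_chunks` is a hypothetical validator (the real `_validate_chunks_arg` rules and exception type are not shown in this thread, so `TypeError` is assumed), and `open_sketch` only records which steps ran.

```python
import operator


def validate_chunks(chunks, allow_none=False):
    # Hypothetical validator: accepts an integer-like or a (rows, cols) pair.
    if chunks is None and allow_none:
        return None
    try:
        return operator.index(chunks)
    except TypeError:
        pass
    try:
        rows, cols = chunks
        return (operator.index(rows), operator.index(cols))
    except (TypeError, ValueError):
        raise TypeError(f"invalid chunks value: {chunks!r}") from None


def open_sketch(chunks, events):
    # Validation runs first, so a bad value fails before any I/O step.
    explicit = validate_chunks(chunks, allow_none=True)
    events.append("read")
    events.append("reproject")
    return explicit


events = []
try:
    open_sketch("big", events)
except TypeError as exc:
    print(exc, "| steps run:", events)
```

With an invalid value the `events` list stays empty, which mirrors the "callers just see it earlier" behavior described above.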
No new issues. 91 tests pass across the coregister, resampling, and accessor suites; flake8 is clean.
Closes #3234
- `.xrs.open_geotiff` now passes `chunk_size` to the `reproject()` call used by `coregister=True` and `auto_reproject=True`, so a dask result keeps the caller's chunk layout (or the explicit `chunks=` kwarg when one is given) instead of reverting to reproject's 512x512 default.
- Changes `_infer_caller_y_chunk` to `_infer_caller_chunk(obj, axis)` so the x-axis chunk can be inferred too. Single call site, no public API change.

Backend coverage: the fix sits in the shared windowed-read path. `coregister` is CPU-only by design (numpy / dask+numpy); the `auto_reproject` chunk forwarding also covers dask+cupy callers.

Test plan:
- `test_coregister_dask_template_keeps_caller_chunks`: dask template chunked (3, 2) gives output chunksize (3, 2), asymmetric to prove (y, x) ordering
- `test_coregister_explicit_chunks_kwarg`: numpy template + `chunks=2` gives output chunksize (2, 2)
- `test_auto_reproject_dask_template_keeps_caller_chunks`: auto_reproject output follows caller chunks
- `test_open_geotiff_coregister.py`, `test_open_geotiff_resampling.py`, `test_accessor.py` all pass (90 tests)
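For readers unfamiliar with per-axis chunk inference, here is a minimal sketch of what an axis-aware helper could look like. It is an assumption about the general shape only: `infer_caller_chunk` is a hypothetical function, not the PR's `_infer_caller_chunk`, and it relies on the dask-style convention that `.chunks` is a tuple of per-axis block-size tuples (or `None` for non-dask data).

```python
from types import SimpleNamespace


def infer_caller_chunk(obj, axis):
    # Hypothetical helper: return the first block size along `axis`,
    # or None when the object carries no chunk information.
    chunks = getattr(obj, "chunks", None)
    if not chunks:
        return None
    return int(chunks[axis][0])


# Stand-in for a dask-backed template chunked (3, 2) over a 9x9 grid.
dask_like = SimpleNamespace(chunks=((3, 3, 3), (2, 2, 2, 2, 1)))
# Stand-in for a numpy-backed template with no chunking.
numpy_like = SimpleNamespace(chunks=None)

# Axis 0 is y and axis 1 is x, so the asymmetric layout keeps its order.
print(infer_caller_chunk(dask_like, 0), infer_caller_chunk(dask_like, 1))
print(infer_caller_chunk(numpy_like, 0))
```

An asymmetric layout such as (3, 2) is useful here for the same reason it is in the tests: a swapped axis index would show up as (2, 3) instead of passing silently.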