Place non-georeferenced VRT tiles by explicit pixel offsets (#3116)#3135
Merged
Conversation
to_geotiff(da, 'out.vrt', tile_size=N) on a non-georeferenced array spanning more than one tile wrote a corrupt index: placement came from each tile's GeoTransform, and non-georef tiles all carry the default identity transform, so every DstRect landed at the origin and the mosaic shrank to one tile. Reading the VRT back silently returned a single tile's data with wrong dims and coords. _write_vrt_tiled now records each tile's pixel offsets and passes them through _build_vrt to write_vrt via an internal dst_offsets kwarg when the input has no georeferencing. write_vrt refuses more than one all-non-georef source without explicit placement, and rejects dst_offsets alongside georeferenced sources (those place via their GeoTransform). Georeferenced placement math is unchanged. Also commits the metadata-sweep state row for the geotiff module.
brendancol
commented
Jun 9, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Place non-georeferenced VRT tiles by explicit pixel offsets (#3116)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
xrspatial/geotiff/_vrt.py(dst_offsets docstring): the contract doesn't say what happens when explicit offsets overlap or leave gaps. The only production caller (_write_vrt_tiled) always produces a full non-overlapping cover, but a direct internal caller gets no validation and no documented behaviour. A one-line note ("offsets are not checked for overlap or full coverage") would pin that down.
Nits (optional improvements)
-
tests/vrt/test_non_georef_placement_3116.py:92—open(vrt_path).read()leaks the file handle;pathlib.Path(vrt_path).read_text()avoids the ResourceWarning under stricter warning filters. -
tests/vrt/test_non_georef_placement_3116.py:222— thehash(str(bad)) & 0xfffffilename suffix isn't needed: pytest gives each parametrized case its owntmp_path, so a fixed name can't collide. The hash also varies between runs (string hashing is salted), which makes failure artifacts harder to compare.
What looks good
- The placement refactor is value-preserving for georeferenced sources: the per-source offset math moved out of the XML loop verbatim, and
test_georef_multi_tile_placement_unchangedplus the existing 520-test VRT suite pin it. - The fail-closed gate for multiple non-georef sources without offsets matches the
_check_no_mixed_georefprecedent, and the error message names the recovery path. dst_offsetsis rejected alongside georeferenced sources instead of being silently ignored, so the two placement mechanisms can't disagree.- The bool/float/negative/arity rejections on offset pairs are each covered by a parametrized test, and the
True-as-int case is handled explicitly. - Round-trip coverage spans all four reader backends, plus the dask-backed and plain-ndarray write branches, which share the fixed loop.
- The XML-level test asserts the exact
DstRectoffsets and mosaic size, so a placement regression can't hide behind a reader that compensates.
Checklist
- Algorithm matches reference (VRT pixel-space DstRect placement; verified against the emitted XML)
- All implemented backends produce consistent results (4-backend round-trip test)
- NaN handling is correct (not applicable; writer-side index placement, no pixel math)
- Edge cases are covered by tests (single tile, ragged last tile row/column, plain ndarray, dask chunk grid)
- Dask chunk boundaries handled correctly (offsets follow the chunk grid, not tile_size)
- No premature materialization or unnecessary copies (metadata-only pass unchanged)
- Benchmark exists or is not needed (writer bug fix; not needed)
- README feature matrix updated (not applicable; no new function or backend change)
- Docstrings present and accurate (modulo the overlap/gap note above)
brendancol
commented
Jun 9, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
Follow-up review after e0da0b1.
All three findings from the first pass are addressed:
- dst_offsets docstring now states that offsets are not checked for overlap or coverage and that direct callers own their layout (fixed).
- XML assertion test reads via pathlib.Path.read_text, no leaked handle (fixed).
- The salted-hash filename suffix in the bad-pair test is gone; tmp_path already isolates parametrized cases (fixed).
Test suite re-run after the changes: 18/18 in test_non_georef_placement_3116.py, flake8 and isort clean. Nothing further from me.
test_read_geotiff_dask_handles_vrt_directly builds a 2-source VRT from non-georeferenced tiles, which the new placement guard now refuses without dst_offsets. The test only verifies that _read_geotiff_dask routes .vrt sources to the VRT reader, so spell out the side-by-side layout its comment already described. Before the guard, this setup produced the collapsed index #3116 fixed; the test passed only because it asserts dims alone.
Contributor
Author
|
@copilot resolve the merge conflicts in this pull request |
…eotiff-2026-06-09 # Conflicts: # .claude/sweep-metadata-state.csv
Contributor
Merged |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
to_geotiff(da, 'out.vrt', tile_size=N)on a non-georeferenced array spanning more than one tile wrote a corrupt index. Placement was derived from each tile's GeoTransform, and non-georef tiles all carry the default identity transform, so everyDstRectlanded at the origin andrasterXSize/rasterYSizeshrank to a single tile. Reading the VRT back silently returned one tile's data (a 24x32 input came back as 16x16). Gap left by #2971, whose tests only covered a single non-georef source.Proposed Changes
_write_vrt_tiledrecords each tile's pixel offsets and passes them through_build_vrttowrite_vrtvia an internaldst_offsetskwarg when the input has no georeferencing. The dask-backed and plain-ndarray.vrtwrites go through the same loop, so both are covered.write_vrtrefuses more than one all-non-georef source without explicit placement (identity transforms cannot place the tiles), and rejectsdst_offsetsalongside georeferenced sources, which place via their GeoTransform. Thedst_offsetsdocstring states that offsets are not checked for overlap or full coverage; direct callers own their layout.origin/mainto bring the branch up to date; the only conflict was in.claude/sweep-metadata-state.csv(HEAD geotiff row kept, new mcda row from main included).Backend coverage: the writer fix is backend-independent; the round-trip test reads the repaired VRT on numpy, cupy, dask+numpy, and dask+cupy.
Test plan:
tests/vrt/test_non_georef_placement_3116.py: 4-backend round trip, dask-backed write, plain-ndarray write, single-tile no-op, XMLDstRectassertions, georef placement regression guard, and thewrite_vrterror contract (ambiguous placement, georef + offsets, length mismatch, malformed pairs).Also carries the metadata-sweep state CSV row for the geotiff module (sweep bookkeeping).