Skip to content

Forward allow_rotated/allow_invalid_nodata to VRT per-source reads (#2672)#2676

Merged
brendancol merged 4 commits into
mainfrom
issue-2672
May 29, 2026
Merged

Forward allow_rotated/allow_invalid_nodata to VRT per-source reads (#2672)#2676
brendancol merged 4 commits into
mainfrom
issue-2672

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #2672

The public read_vrt accepts allow_rotated, allow_unparseable_crs, and allow_invalid_nodata, and documents the first and third as opt-ins forwarded to the per-source GeoTIFF reads. They weren't actually forwarded. The eager internal call and the chunked per-chunk reader only passed the codec flags down, so a caller who set allow_rotated=True or allow_invalid_nodata=True still hit a typed rejection on a rotated source TIFF or a source with non-representable integer nodata. Under missing_sources='warn' that rejection got reclassified as a hole, silently dropping the tile.

What changed:

  • Added allow_rotated and allow_invalid_nodata to the internal read_vrt in _vrt.py and forwarded them to the per-source read_to_array call.
  • Forwarded both opt-ins from the eager backend call and from _vrt_chunk_read (the chunked per-chunk reader) in _backends/vrt.py.
  • allow_unparseable_crs is a validation-layer opt-in only. read_to_array doesn't accept it, and the VRT takes its CRS from the already-validated VRT XML, so it doesn't need per-source threading.

Backend coverage: numpy (eager) and dask+numpy (chunked) VRT read paths. The change lives in the shared per-source read plumbing, so the cupy / dask+cupy VRT paths inherit it through the same internal read_vrt.

Test plan:

  • New tests in xrspatial/geotiff/tests/vrt/test_source_opt_ins_2672.py build a single-source VRT over a rotated source TIFF and over a uint16 source with NaN GDAL_NODATA. Each case is rejected without the opt-in and reads cleanly with it, on both eager and chunked paths.
  • A missing_sources='warn' test confirms the opt-in keeps a readable source out of the hole bucket instead of emitting a fallback warning and recording a false hole.
  • Reverting the source-side forwarding makes the five "with opt-in" cases fail, which confirms the tests exercise the fix.
  • Full VRT suite (491 tests) plus the read nodata/georef suites pass; flake8 clean on the changed files.

No CHANGELOG edit (updated separately at release time). No new public API: the opt-ins already existed and were documented; this makes them take effect.

Dedupe duplicate module rows (last-write-wins by last_inspected) and
collapse multi-line notes to single physical lines. The notes had
embedded newlines, which the merge=union .gitattributes strategy splits
record-by-record, corrupting the file into a 156-column phantom row on
parallel-agent appends. One line per record keeps union merges safe.
…2672)

The public read_vrt documented allow_rotated and allow_invalid_nodata as
opt-ins forwarded to per-source GeoTIFF reads, but the eager and chunked
paths only passed the codec flags down. Thread both opt-ins through the
internal read_vrt and the chunked _vrt_chunk_read into the per-source
read_to_array call so a rotated source TIFF or a source with invalid
integer nodata can be read when the caller opts in.

allow_unparseable_crs is a validation-layer opt-in only; read_to_array
does not accept it and the VRT takes its CRS from the already-validated
VRT XML, so it does not need per-source threading.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Forward allow_rotated/allow_invalid_nodata to VRT per-source reads (#2672)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None. The two opt-ins now reach read_to_array on both the eager (_vrt.py per-source read) and chunked (_vrt_chunk_read) paths, and the GPU / dask+GPU VRT paths inherit it because the per-source decode is always the same CPU read_to_array.

Nits (optional improvements)

  • _vrt_chunk_read in _backends/vrt.py gained allow_rotated / allow_invalid_nodata params with no docstring lines, but the existing codec kwargs there are also undocumented, so it matches the surrounding style. A one-line note would mirror the public docstring; optional since the function is internal-only.

What looks good

  • allow_unparseable_crs is correctly left out of the per-source read: read_to_array doesn't accept it, and the VRT CRS comes from the already-validated VRT XML. This matches the dask backend, which also keeps allow_unparseable_crs at the validation layer and forwards only allow_rotated / allow_invalid_nodata to the reader.
  • Both internal read_vrt call sites (eager at _backends/vrt.py:540, chunked at :788) are updated; nothing was missed.
  • The tests prove the fix, not just exercise it. Reverting the source-side forwarding flips the five "with opt-in" cases to failures, and they pass with it. Both rejection types (RotatedTransformError, InvalidIntegerNodataError) are real ValueError subclasses, so the missing_sources='warn' hole-reclassification test hits the actual catch path described in the issue.
  • Temp filenames carry the issue number (rotated_src_2672.tif, etc.), and the tests reuse the existing _write_rotated_tiff / _build_uint16_tiff_1774 builders so the byte layout stays shared with the non-VRT reader tests.

Backend coverage note

The per-source read is CPU-only (read_to_array) for all four VRT output backends, so numpy / dask+numpy are directly tested and cupy / dask+cupy inherit the same fix. A dedicated cupy test would need a GPU runner; the eager/chunked coverage is the right level here.

Checklist

  • Algorithm matches reference/paper: n/a (plumbing fix)
  • All implemented backends consistent: yes (shared per-source read)
  • NaN handling correct: unchanged; nodata masking path untouched
  • Edge cases covered: yes (rotated source, NaN integer nodata, warn-mode hole)
  • Dask chunk boundaries: yes (opt-ins forwarded per chunk task)
  • No premature materialization or copies: yes
  • Benchmark: not needed (no perf-sensitive change)
  • README feature matrix: not needed (no new function, no backend change)
  • Docstrings: public docstring already documented the forwarding; the fix makes it true

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after nit fix)

The one nit from the previous pass is addressed: _vrt_chunk_read now documents that allow_rotated / allow_invalid_nodata are forwarded to the internal reader so each task's per-source GeoTIFF read honors the opt-ins, matching the eager path.

No new findings. No Blockers, no Suggestions. The 9 tests in test_source_opt_ins_2672.py still pass and flake8 is clean on the changed file. Nothing else to fix.

@brendancol brendancol merged commit 5200e73 into main May 29, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

VRT source-read opt-ins (allow_rotated, allow_invalid_nodata) are accepted but not forwarded to per-source reads

1 participant