Forward allow_rotated/allow_invalid_nodata to VRT per-source reads (#2672)#2676
Conversation
Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
…2672) The public read_vrt documented allow_rotated and allow_invalid_nodata as opt-ins forwarded to per-source GeoTIFF reads, but the eager and chunked paths only passed the codec flags down. Thread both opt-ins through the internal read_vrt and the chunked _vrt_chunk_read into the per-source read_to_array call so a rotated source TIFF or a source with invalid integer nodata can be read when the caller opts in. allow_unparseable_crs is a validation-layer opt-in only; read_to_array does not accept it and the VRT takes its CRS from the already-validated VRT XML, so it does not need per-source threading.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Forward allow_rotated/allow_invalid_nodata to VRT per-source reads (#2672)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None. The two opt-ins now reach read_to_array on both the eager (_vrt.py per-source read) and chunked (_vrt_chunk_read) paths, and the GPU / dask+GPU VRT paths inherit it because the per-source decode is always the same CPU read_to_array.
Nits (optional improvements)
_vrt_chunk_readin_backends/vrt.pygainedallow_rotated/allow_invalid_nodataparams with no docstring lines, but the existing codec kwargs there are also undocumented, so it matches the surrounding style. A one-line note would mirror the public docstring; optional since the function is internal-only.
What looks good
allow_unparseable_crsis correctly left out of the per-source read:read_to_arraydoesn't accept it, and the VRT CRS comes from the already-validated VRT XML. This matches the dask backend, which also keepsallow_unparseable_crsat the validation layer and forwards onlyallow_rotated/allow_invalid_nodatato the reader.- Both internal
read_vrtcall sites (eager at_backends/vrt.py:540, chunked at:788) are updated; nothing was missed. - The tests prove the fix, not just exercise it. Reverting the source-side forwarding flips the five "with opt-in" cases to failures, and they pass with it. Both rejection types (
RotatedTransformError,InvalidIntegerNodataError) are realValueErrorsubclasses, so themissing_sources='warn'hole-reclassification test hits the actual catch path described in the issue. - Temp filenames carry the issue number (
rotated_src_2672.tif, etc.), and the tests reuse the existing_write_rotated_tiff/_build_uint16_tiff_1774builders so the byte layout stays shared with the non-VRT reader tests.
Backend coverage note
The per-source read is CPU-only (read_to_array) for all four VRT output backends, so numpy / dask+numpy are directly tested and cupy / dask+cupy inherit the same fix. A dedicated cupy test would need a GPU runner; the eager/chunked coverage is the right level here.
Checklist
- Algorithm matches reference/paper: n/a (plumbing fix)
- All implemented backends consistent: yes (shared per-source read)
- NaN handling correct: unchanged; nodata masking path untouched
- Edge cases covered: yes (rotated source, NaN integer nodata, warn-mode hole)
- Dask chunk boundaries: yes (opt-ins forwarded per chunk task)
- No premature materialization or copies: yes
- Benchmark: not needed (no perf-sensitive change)
- README feature matrix: not needed (no new function, no backend change)
- Docstrings: public docstring already documented the forwarding; the fix makes it true
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review (after nit fix)
The one nit from the previous pass is addressed: _vrt_chunk_read now documents that allow_rotated / allow_invalid_nodata are forwarded to the internal reader so each task's per-source GeoTIFF read honors the opt-ins, matching the eager path.
No new findings. No Blockers, no Suggestions. The 9 tests in test_source_opt_ins_2672.py still pass and flake8 is clean on the changed file. Nothing else to fix.
Closes #2672
The public
read_vrtacceptsallow_rotated,allow_unparseable_crs, andallow_invalid_nodata, and documents the first and third as opt-ins forwarded to the per-source GeoTIFF reads. They weren't actually forwarded. The eager internal call and the chunked per-chunk reader only passed the codec flags down, so a caller who setallow_rotated=Trueorallow_invalid_nodata=Truestill hit a typed rejection on a rotated source TIFF or a source with non-representable integer nodata. Undermissing_sources='warn'that rejection got reclassified as a hole, silently dropping the tile.What changed:
allow_rotatedandallow_invalid_nodatato the internalread_vrtin_vrt.pyand forwarded them to the per-sourceread_to_arraycall._vrt_chunk_read(the chunked per-chunk reader) in_backends/vrt.py.allow_unparseable_crsis a validation-layer opt-in only.read_to_arraydoesn't accept it, and the VRT takes its CRS from the already-validated VRT XML, so it doesn't need per-source threading.Backend coverage: numpy (eager) and dask+numpy (chunked) VRT read paths. The change lives in the shared per-source read plumbing, so the cupy / dask+cupy VRT paths inherit it through the same internal
read_vrt.Test plan:
xrspatial/geotiff/tests/vrt/test_source_opt_ins_2672.pybuild a single-source VRT over a rotated source TIFF and over a uint16 source with NaN GDAL_NODATA. Each case is rejected without the opt-in and reads cleanly with it, on both eager and chunked paths.missing_sources='warn'test confirms the opt-in keeps a readable source out of the hole bucket instead of emitting a fallback warning and recording a false hole.No CHANGELOG edit (updated separately at release time). No new public API: the opt-ins already existed and were documented; this makes them take effect.