Describe the bug
open_geotiff(path, chunks=N, max_pixels=M) rejects the read up front when the full image extent exceeds max_pixels, even though dask chunks only materialise one block at a time. The CPU dask reader (_backends/dask.py) and the GPU+dask reader (_backends/gpu.py::_read_geotiff_gpu_chunked) both run _check_dimensions against the full windowed region before any task is scheduled.
That forces callers who want to read a large raster via dask to set max_pixels above the full image size, which in turn makes the per-chunk safety guard ineffective: no chunk can be larger than the full image, so every chunk passes. The whole point of chunks= is bounded peak memory, so max_pixels should track the chunk size, not the file size.
Expected behavior
When chunks is supplied, max_pixels should apply to each chunk's materialised buffer, not the full image. The eager (no-chunks) path keeps applying max_pixels to the full windowed region.
On a 100x100 single-band TIFF:
open_geotiff(path, chunks=10, max_pixels=200) should succeed. Each 10x10 chunk is 100 pixels, under the cap.
open_geotiff(path, chunks=20, max_pixels=200) should raise PixelSafetyLimitError when a chunk task runs. 20x20 is 400, over the cap.
Today the first case raises up front because 100*100 = 10000 > 200. Callers have to raise max_pixels to at least 10000 to read it at all, which then admits any chunk size.
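The two contracts can be compared with a toy guard. This is only a sketch, not the library's code: PixelLimitError, check_dimensions, chunk_shapes, per_chunk_ok and full_extent_ok are hypothetical stand-ins, and the assumption that the guard compares width * height * samples against max_pixels is mine.

```python
class PixelLimitError(ValueError):
    """Hypothetical stand-in for PixelSafetyLimitError."""


def check_dimensions(width, height, samples, max_pixels):
    # Stand-in guard (assumed behaviour): reject buffers over the pixel cap.
    total = width * height * samples
    if total > max_pixels:
        raise PixelLimitError(
            f"{width}x{height}x{samples} = {total} pixels > {max_pixels}"
        )


def chunk_shapes(width, height, chunk):
    # Yield (w, h) for each block of a regular chunk grid, edge blocks included.
    for y in range(0, height, chunk):
        for x in range(0, width, chunk):
            yield min(chunk, width - x), min(chunk, height - y)


def per_chunk_ok(width, height, chunk, max_pixels, samples=1):
    # Proposed contract: the cap applies to each chunk's buffer.
    try:
        for w, h in chunk_shapes(width, height, chunk):
            check_dimensions(w, h, samples, max_pixels)
    except PixelLimitError:
        return False
    return True


def full_extent_ok(width, height, max_pixels, samples=1):
    # Current contract: the cap applies to the whole windowed region.
    try:
        check_dimensions(width, height, samples, max_pixels)
    except PixelLimitError:
        return False
    return True


if __name__ == "__main__":
    print(per_chunk_ok(100, 100, chunk=10, max_pixels=200))
    print(per_chunk_ok(100, 100, chunk=20, max_pixels=200))
    print(full_extent_ok(100, 100, max_pixels=200))
```

Under the per-chunk contract the 10x10 case passes and the 20x20 case fails, while the full-extent check rejects the 100x100 image regardless of chunk size.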
Additional context
The per-chunk check is already wired: _delayed_read_window forwards max_pixels to _read_to_array, which runs _check_dimensions(out_w, out_h, samples, max_pixels) against the chunk window. The fix is to drop the redundant full-extent guard in the chunked paths and update the docstrings and tests that lock in the old contract.
Scope: CPU dask path and GPU+dask path. The VRT chunked path is out of scope; VRT mosaicing has its own composition rules.