The to_geotiff docstring says dask-backed DataArrays are "written in streaming mode: one tile-row at a time, without materialising the full array into RAM" (_writers/eager.py:102-109). That is not what happens for dask+cupy input.
_is_gpu_data detects dask-of-cupy via _meta (_backends/_gpu_helpers.py:23-37), and the use_gpu dispatch at eager.py:633/757 runs before the dask streaming branch at eager.py:918. So dask+cupy input auto-routes to _write_geotiff_gpu, which calls data.compute() (_writers/gpu.py:565) and materializes the whole array on device. Verified by execution: the write completes eagerly with no warning, and streaming_buffer_bytes has no effect on this path.
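The ordering problem can be shown with a toy model. This is only a sketch of the dispatch order described above, not the library's code: choose_write_path and the path labels are hypothetical names invented for illustration.

```python
# Toy model of the dispatch order described in this report.
# All names are hypothetical; this is not the library's implementation.

def choose_write_path(is_dask: bool, is_gpu: bool) -> dict:
    """Return the writer a GPU-first dispatch would pick for an input."""
    # The GPU check runs first, so it wins for dask-of-cupy input.
    if is_gpu:
        return {"path": "gpu_eager", "uses_streaming_buffer": False}
    # The streaming branch is only reached for non-GPU dask input.
    if is_dask:
        return {"path": "dask_streaming", "uses_streaming_buffer": True}
    return {"path": "eager", "uses_streaming_buffer": False}


for label, is_dask, is_gpu in [
    ("numpy", False, False),
    ("cupy", False, True),
    ("dask+numpy", True, False),
    ("dask+cupy", True, True),
]:
    print(label, choose_write_path(is_dask, is_gpu))
```

In this model dask+cupy lands on the same path as plain cupy, which matches the observed behavior: the input is chunked, but the streaming buffer setting is never consulted.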
The no-op is documented only in the private _write_geotiff_gpu docstring (gpu.py:207-210). The public docs say streaming_buffer_bytes is "Only relevant for dask-backed inputs", which is exactly the input type where it gets ignored.
For out-of-core GPU pipelines this defeats the point of chunking. Short term, the public docstring should state that dask+cupy writes materialize the full array on device and that streaming_buffer_bytes does not apply. Longer term, the GPU writer could stream per block. This is related to the broken gpu=False escape hatch, which is filed separately.
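As a rough sketch of what per-block streaming could look like, the helper below splits a raster into row bands that fit a byte budget. Everything here is an assumption for illustration: row_bands is a hypothetical helper, and treating streaming_buffer_bytes as a per-band budget is a guess at the intended semantics, not something taken from the writer's code.

```python
from typing import Iterator, Tuple


def row_bands(
    height: int,
    width: int,
    itemsize: int,
    tile_height: int,
    buffer_bytes: int,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges, each a whole number of tile-rows.

    Hypothetical helper: each band is sized to fit within buffer_bytes,
    except that a band always holds at least one tile-row.
    """
    bytes_per_tile_row = tile_height * width * itemsize
    # Never drop below one tile-row, even if that exceeds the budget.
    tile_rows_per_band = max(1, buffer_bytes // bytes_per_tile_row)
    band_height = tile_rows_per_band * tile_height
    for start in range(0, height, band_height):
        yield start, min(start + band_height, height)


# 1000 x 512 float32 raster, 256-row tiles, 1 MiB budget.
bands = list(row_bands(1000, 512, 4, 256, 1_048_576))
print(bands)
```

A streaming GPU writer could then compute only data[start:stop, :] for each band instead of calling data.compute() on the whole array, so peak device memory would be bounded by the band size rather than the raster size.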