diff --git a/.claude/sweep-test-coverage-state.csv b/.claude/sweep-test-coverage-state.csv
index 4f0739c6d..01c3bfe66 100644
--- a/.claude/sweep-test-coverage-state.csv
+++ b/.claude/sweep-test-coverage-state.csv
@@ -9,7 +9,7 @@ interpolate_spline,2026-06-04,,HIGH,1;3;5,scope=spline-only; cupy+dask_cupy spli
 polygonize,2026-05-29,2623,MEDIUM,4,"Pass 3 (2026-05-29): added test_polygonize_mask_dtype_coverage_2026_05_29.py (41 passed, 8 xfailed on a CUDA host). Closes Cat 4 MEDIUM parameter-coverage gap: mask= is documented to accept bool/integer/float values but every prior test passed only a bool mask. Integer masks (int32/int64) now pinned against the same-backend bool-mask output on all four backends x both raster dtypes x connectivity 4/8; float-mask-on-integer-raster also pinned. Each backend is compared to its OWN bool reference to isolate mask-dtype from the unrelated numpy-vs-dask hole-vs-single-ring representation difference. Mutation (drop the not-mask[ij] exclusion in _calculate_regions) flips 11 tests red incl. the pixel-exclusion sanity anchor; clean md5 restore. Surfaced source bug #2623: a float-dtype mask on a float-dtype raster raises TypeError at polygonize.py:918 (mask & nan_mask; bitwise_and undefined for float&bool; cupy/dask route floats through _polygonize_numpy so they crash too; int masks coerce fine). 8 float-mask cases marked xfail(strict, raises=TypeError) referencing #2623. Test-only; source untouched. | Pass 2 (2026-05-27): added test_polygonize_atol_rtol_backend_coverage_2026_05_27.py with 15 tests, all passing on a CUDA host. Closes Cat 4 MEDIUM parameter-coverage gap on atol/rtol forwarding through the cupy and dask+cupy backends. atol/rtol were exposed by #2173 / #2194 and thread through _polygonize_cupy (polygonize.py:808) and _polygonize_dask (polygonize.py:1719); the dask path further plumbs them into dask.delayed(_polygonize_chunk)(...)
at lines 1748-1754 and into _bucket_key_for_value for cross-chunk merge bucketing at lines 1757-1758. Pre-existing tests covered non-default atol/rtol only on numpy and dask+numpy. The cupy and dask+cupy dispatchers were untested -- a regression dropping the kwargs there would silently change the float polygon count and would not be caught. Same dispatcher-silently-drops-kwarg pattern fixed by #1561 / #1605 / #1685 / #1810 / #1974 on adjacent GeoTIFF surfaces. 15 tests: cupy strict-equality + default-tolerance pin on _REPRO_2173, dask+cupy strict-equality single-chunk + multi-chunk (engages cross-chunk merge bucket) + default-tolerance multi-chunk pin, cupy intermediate-atol small/large pair, dask+cupy intermediate-atol single/multi-chunk small + single-chunk large, cupy integer atol-ignored matrix, dask+cupy integer atol-ignored single-chunk + multi-chunk, cupy rtol-only large/small matrix. Mutation against _polygonize_cupy float branch (drop atol/rtol kwargs in the _polygonize_numpy forward call at polygonize.py:823-825) flips 3 of 5 cupy tests red; mutation against dask.delayed(_polygonize_chunk)(...) at polygonize.py:1748-1754 (drop atol, rtol args) flips 2 of 6 dask+cupy tests red. Confirmed clean restore via md5sum. Source untouched. Filed issue #2537 (test-only). Cat 4 MEDIUM (parameter coverage on cupy + dask+cupy atol/rtol forwarding). Pass 1 (2026-05-19): added test_polygonize_coverage_2026_05_19.py with 58 tests, all passing on a CUDA host. Closes Cat 3 HIGH 1x1 / Nx1 single-column geometric gaps (Nx1 exercises the nx==1 padding path at polygonize.py:565 and the cupy nx==1 numpy-fallback at polygonize.py:671), Cat 3 MEDIUM 1xN single-row and all-equal-value rasters on all four backends. Closes Cat 2 HIGH NaN parity for cupy + dask+cupy (numpy/dask were already covered by test_polygonize_nan_pixels_excluded*), Cat 2 MEDIUM all-NaN raster on all four backends, Cat 2 HIGH +/-Inf pins on all four backends. 
Filed source-bug issue #2155: numpy/dask/dask+cupy backends silently absorb Inf cells into adjacent finite polygons because _is_close reduces abs(inf-inf) to nan; cupy backend handles Inf correctly. Pins lock the asymmetric behaviour so the fix is visible. Closes Cat 1 MEDIUM simplify_tolerance + mask= parity gaps on dask+cupy backend (numpy/cupy/dask were already covered). Closes Cat 4 MEDIUM column_name non-default value across geopandas/spatialpandas/geojson return types and Cat 4 MEDIUM validation error paths (bad connectivity, bad transform length, mask shape mismatch, mask underlying-type mismatch). Cat 5 N/A: polygonize returns lists/dataframes, not a DataArray with attrs to propagate."
 proximity,2026-06-02,2692,HIGH,1;2;3;4;5,"Pass 2 (2026-06-02): added 18 tests to test_proximity.py closing the two MEDIUM gaps Pass 1 left open, all RUN and passing on a CUDA host across numpy/cupy/dask+numpy/dask+cupy (15 cross-backend + 3 error-path). Source untouched. Cat 4 MEDIUM (error path): _process raises ValueError when raster.dims != (y, x) (proximity.py:1043) but no test exercised the swapped x/y guard; test_wrong_dim_order_raises pins it for proximity/allocation/direction. Cat 2 MEDIUM (all-NaN input): Pass 1 noted all-NaN/all-zero on eager numpy+cupy was unpinned; test_all_nan_raster_all_nan_output pins an all-NaN 6x6 raster -> all-NaN float32 output on all four backends x three functions. Remaining LOW (documented): invalid distance_metric string silently falls back to EUCLIDEAN (proximity.py:1049-1051). || PREVIOUS: Pass 1 (2026-05-29): added 65 tests to test_proximity.py closing three coverage gaps, all RUN and passing on a CUDA host (numpy/cupy/dask+numpy/dask+cupy). Issue #2692, PR opened. Source untouched.
Cat 3 HIGH: degenerate raster shapes (1x1 single pixel, Nx1 column strip, 1xN row strip) had zero coverage for proximity/allocation/direction on any backend; they stress the line-sweep kernel boundaries (_process_proximity_line) and the GPU brute-force kernel grid sizing (_proximity_cuda_kernel via cuda_args). Pinned all three shapes x three functions x four backends against hand-checked expected values; mutation of a pinned direction expectation confirms teeth. Cat 1/4 HIGH: allocation and direction only ran EUCLIDEAN across backends; MANHATTAN and GREAT_CIRCLE were cross-backend-tested for proximity only. Pinned both metrics x two functions x four backends against the numpy baseline (all match). Cat 5 MEDIUM: no test set non-empty res/crs attrs so the attrs-preservation assertion in general_output_checks compared two empty dicts. proximity reads attrs['res'] via get_dataarray_resolution for bounded-dask chunk padding, so added attrs round-trip tests on four backends plus a bounded-dask test where a res attr matching the coordinate spacing must equal the numpy baseline. A res attr that lies about the spacing mis-sizes the map_overlap depth; source fragility, not a test gap, left for a separate accuracy issue. Cat 2 (NaN/Inf input) already covered by the shared test_raster fixture (embeds np.inf and np.nan, runs on four backends). Remaining LOW: all-NaN / all-zero input on eager numpy+cupy not directly pinned."
 rasterize,2026-05-29,2614,MEDIUM,4,"Pass 4 (2026-05-29): added test_rasterize_coverage_2026_05_29.py with 11 tests, all passing (pure-Python validation paths, no CUDA needed); filed issue #2614 and opened a test-only PR. Closes Cat 4 MEDIUM error-path gaps that all three prior passes left untouched.
(1) Partial width/height: the (width is None) != (height is None) guard in rasterize() raises ValueError naming the given and missing dimension, documented in the docstring, but neither the width-only nor height-only branch had a test; pin both directions plus the width-only+resolution case proving the guard fires before the resolution branch. (2) resolution= input type/shape validation: the type/shape branches (non-number/non-sequence string|dict; wrong-ndim numpy array; wrong-length sequence len 1|3|4; non-numeric elements) had no coverage -- test_rasterize.py's test_invalid_resolution_scalar/tuple only exercise non-finite/non-positive VALUES, not these type/shape guards, so a regression loosening or reordering them would ship silently; pin each branch to its message plus a positive control that a 1-D length-2 numpy array is still accepted. Source untouched."
-reproject,2026-06-08,2618;3050,HIGH,3;4,"Pass 2026-06-08 (deep-sweep test-coverage): #3050 closes the one live gap found this pass. reproject()'s dask+cupy backend was parity-tested only with resampling='cubic' (TestCupyPyprojFallbackParity::test_projected_to_projected_dask_cupy_match); nearest/bilinear were covered on numpy (end-to-end) and eager cupy (parametrized test_projected_to_projected_numpy_cupy_match) but never on the dask+cupy chunk-assembly path. Parametrized that test over ['nearest','bilinear','cubic']; all 3 RUN+PASS on a CUDA host. Cat 4 MEDIUM (resampling-mode parameter coverage on the dask+cupy backend). Test-only, source untouched. Re-confirmed _merge.merge() has NO genuine cupy/dask+cupy backend (_merge_inmemory/_merge_dask use _merge_arrays_numpy + raster.values; _merge_arrays_cupy is imported but never dispatched = dead code, not a test gap) matching the prior pass's observation. reproject() otherwise saturated across all 4 backends, NaN/Inf/all-NaN, degenerate shapes, metadata, vertical, bounds_policy, integer nodata.
LOW (documented, not filed): dask+cupy resampling-mode parity is the only per-mode-per-backend cell that had been missing. || PREVIOUS: Pass 2026-05-29: reproject already has a deep suite (369 tests in test_reproject.py + coverage/gate files) covering all 4 backends, NaN/Inf/all-NaN/all-Inf, 1x1/2x2, metadata, vertical shift, bounds_policy x backends, integer nodata x backends. Gaps found: Cat 3 HIGH single-row (1xN) and single-col (Nx1) strip rasters never tested (hit size<2 branch of _validate_regular_axis + degenerate resampling axis); Cat 3 MEDIUM constant-value/zero-gradient raster never reprojected. Added TestDegenerateShapeReproject (12 tests): 1xN+Nx1 strips x numpy/dask/cupy/dask+cupy, constant raster numpy value-preservation + cross-backend parity. All 12 executed and passed on a CUDA host. Test-only, no source change (#2618). LOW (documented only): _merge._merge_arrays_cupy imported but never called by merge() (host-bounces via _merge_arrays_numpy) - dead-code source observation not a test gap; non-square cellsize reproject only covered via resolution-tuple validation errors not a successful anisotropic run."
+reproject,2026-06-09,2618;3050;3100;3101;3141,MEDIUM,1,"CI follow-up same day: first CI run of the threaded streaming branch hard-crashed macos-arm64 py3.14 (SIGABRT in numba call_cfunc, two ThreadPoolExecutor threads concurrently inside try_numba_transform/tmerc_inverse) -- the projection kernels are @njit(parallel=True) and numba's workqueue threading layer aborts on concurrent entry; filed source bug #3141. Test fix: threaded parity test now uses transform_precision=0 (per-thread pyproj Transformer, no numba), the NaN multi-tile test and 3-D xfail forced serial (max_memory=1) so the numba fast path stays covered without concurrent entry. windows-3.14 failure was fail-fast collateral (its suite fully passed).
Pass 2026-06-09 (deep-sweep test-coverage): delta re-sweep one day after the 2026-06-08 pass; module modified today by #3077 (datum-probe warning silencing) and #3081 (merge output-size guard backend-aware) -- both landed WITH their own tests (TestDatumProbeNoProjWarning; TestSecurityGuards merge-guard trio incl. the monkeypatched in-memory raise), so the delta added no gap; the guard branching is is_dask-only, so cupy eager shares the tested numpy branch (no per-backend guard test needed). Found one MEDIUM Cat 1 gap every prior pass missed: the 5th dispatch branch of reproject() -- the streaming fallback (_reproject_streaming / _process_tile_batch / _parse_max_memory, taken when source >512MB and dask is not importable) -- had zero coverage anywhere; _parse_max_memory only runs on that branch so the existing max_memory kwarg tests never reached it. Filed #3101, added test_reproject_streaming_3101.py (15 tests: parity vs in-memory numpy for threaded / serial(max_memory=1) / single-tile / nearest+NaN, plus 10 _parse_max_memory unit cases). Probe surfaced source bug #3100: streaming assembly allocates a 2-D output buffer but 3-D sources yield (h,w,b) tiles -> ValueError broadcast in both assembly loops; pinned with strict xfail, source fix left to #3100 (test-only PR, source untouched). CPU-only path so no GPU tests needed (CUDA host; file ran 14 passed + 1 xfailed). LOW carried (documented, not fixed): reproject(name=) / merge(name=) override values untested (only merge name fallback covered); non-square-cellsize successful anisotropic run; dask.bag distributed branch of _reproject_streaming still unexercised (needs a live distributed client). || PREVIOUS: Pass 2026-06-08 (deep-sweep test-coverage): #3050 closes the one live gap found this pass. 
reproject()'s dask+cupy backend was parity-tested only with resampling='cubic' (TestCupyPyprojFallbackParity::test_projected_to_projected_dask_cupy_match); nearest/bilinear were covered on numpy (end-to-end) and eager cupy (parametrized test_projected_to_projected_numpy_cupy_match) but never on the dask+cupy chunk-assembly path. Parametrized that test over ['nearest','bilinear','cubic']; all 3 RUN+PASS on a CUDA host. Cat 4 MEDIUM (resampling-mode parameter coverage on the dask+cupy backend). Test-only, source untouched. Re-confirmed _merge.merge() has NO genuine cupy/dask+cupy backend (_merge_inmemory/_merge_dask use _merge_arrays_numpy + raster.values; _merge_arrays_cupy is imported but never dispatched = dead code, not a test gap) matching the prior pass's observation. reproject() otherwise saturated across all 4 backends, NaN/Inf/all-NaN, degenerate shapes, metadata, vertical, bounds_policy, integer nodata. LOW (documented, not filed): dask+cupy resampling-mode parity is the only per-mode-per-backend cell that had been missing. || PREVIOUS: Pass 2026-05-29: reproject already has a deep suite (369 tests in test_reproject.py + coverage/gate files) covering all 4 backends, NaN/Inf/all-NaN/all-Inf, 1x1/2x2, metadata, vertical shift, bounds_policy x backends, integer nodata x backends. Gaps found: Cat 3 HIGH single-row (1xN) and single-col (Nx1) strip rasters never tested (hit size<2 branch of _validate_regular_axis + degenerate resampling axis); Cat 3 MEDIUM constant-value/zero-gradient raster never reprojected. Added TestDegenerateShapeReproject (12 tests): 1xN+Nx1 strips x numpy/dask/cupy/dask+cupy, constant raster numpy value-preservation + cross-backend parity. All 12 executed and passed on a CUDA host. Test-only, no source change (#2618). 
LOW (documented only): _merge._merge_arrays_cupy imported but never called by merge() (host-bounces via _merge_arrays_numpy) - dead-code source observation not a test gap; non-square cellsize reproject only covered via resolution-tuple validation errors not a successful anisotropic run."
 resample,2026-05-29,2547;2615,HIGH,1;2;3;5,"Pass 2 (2026-05-29): added test_resample_cupy_agg_fallback_2615.py (6 tests, all passing on CUDA host). Closes Cat 1 MEDIUM backend-coverage gap: the cupy eager aggregate CPU fallback for average/min/max at a NON-integer downsample factor (_run_cupy fy==int(fy) branch in resample.py ~L957-973) was never exercised; existing TestCuPyParity used 12x12 scale 0.5 (integer factor 2 -> GPU reshape path) and only median/mode hit the host fallback. New tests use 10x10 scale 0.3 (factor 3.33) for average/min/max parity vs numpy plus a NaN-masked variant. Issue #2615. Module is otherwise very thoroughly covered (test_resample.py + 3 supplementary files); no remaining HIGH gaps found. Pass 1 (2026-05-27): added test_resample_coverage_2026_05_27.py with 70 tests (68 passing, 2 skipped). Closes Cat 3 HIGH Nx1 single-column gap across numpy/cupy/dask+numpy/dask+cupy x 8 methods (nearest/bilinear/cubic/average/min/max/median/mode) plus Nx1 upsample-nearest parity and Nx1 cross-backend aggregate parity. Closes Cat 2 MEDIUM NaN-parity gap on cupy and dask+cupy (existing TestCuPyParity/TestDaskCuPyParity used random data without NaN; the weight-mask gate and spline-prepad had no GPU NaN coverage). Closes Cat 3 MEDIUM all-equal-value raster across 8 methods (downsample) and 3 interp methods (upsample) plus a constant-with-NaN aggregate variant. Closes Cat 5 MEDIUM non-default dim-name propagation: lat/lon, latitude/longitude, and (channel, lat, lon) 3D round-trip without being renamed to y/x; per-dim attrs (units) preserved. Closes Cat 3 MEDIUM empty-raster behaviour pin: 0-row and 0-col rasters raise (currently IndexError) -- contract covered.
Filed source-bug issue #2547: cubic on dask backends fails for Nx1 / arrays smaller than depth=16; the 2 skipped tests in this file gate on that fix landing. Source untouched."
 slope,2026-05-29,2697,MEDIUM,3,"PR #2703: added degenerate-shape tests (1x1/1xN/Nx1) for all 4 planar backends + geodesic; no live bug, pins all-NaN+shape contract. CUDA host: cupy/dask+cupy ran. Backend/NaN/param/metadata coverage already complete."
 viewshed,2026-05-29,2693,HIGH,1;2;5,"Pass 1 (2026-05-29): added 4 new test groups to test_viewshed.py (13 new tests + 1 xfail, all passing/xfailing on a CUDA+RTX host). Closes Cat 1 HIGH backend-coverage gap: the dask+cupy dispatch path in _viewshed_dask (Tier B) and _viewshed_windowed (max_distance) was registered but never invoked by any test -- added test_viewshed_dask_cupy_flat (analytical-angle parity, atol 0.03) and test_viewshed_dask_cupy_max_distance (windowed GPU run; observer cell 180, corners INVISIBLE). Both use non-zero flat terrain (1.3) because the RTX mesh builder rejects an all-zero raster (#1378). Closes Cat 5 HIGH metadata-preservation gap: only the numpy test_viewshed called general_output_checks; the cupy/dask/dask+cupy and max_distance paths never asserted attrs/coords/dims/array-type preservation. Added parametrised test_viewshed_metadata_preserved over {numpy,cupy,dask+numpy,dask+cupy} x {full, max_distance=2.0}: asserts attrs==, dims==, shape==, x/y coords allclose; runs general_output_checks (full type parity) for all backends except dask+cupy. Closes Cat 2 HIGH NaN-input gap and surfaced source bug #2693: viewshed on a numpy raster crashes with ValueError 'node not found' from _delete_from_tree when a NaN cell sits at certain positions (e.g. (2,4) in a 5x5 with observer at (2,2)), while NaN at (1,1)/(0,0)/(4,4) runs fine.
Added test_viewshed_nan_input_supported_positions (parametrised working positions, asserts observer=180 and NaN cell is INVISIBLE/NaN) plus test_viewshed_nan_input_crashing_position (xfail strict, raises, links #2693). Noted but NOT fixed (source change out of scope for test sweep): the dask+cupy backend does not preserve the cupy backing -- _viewshed_dask computes then rewraps via da.from_array(result_np), so the output computes to numpy not cupy; general_output_checks is skipped for dask+cupy for that reason (candidate for the metadata/backend-parity sweep). LOW (documented only): non-square cell sizes; 1x1 and 1xN geometry covered behaviourally by probing (run without error). Test-only PR; viewshed.py untouched."
diff --git a/xrspatial/tests/test_reproject_streaming_3101.py b/xrspatial/tests/test_reproject_streaming_3101.py
new file mode 100644
index 000000000..d837b1492
--- /dev/null
+++ b/xrspatial/tests/test_reproject_streaming_3101.py
@@ -0,0 +1,185 @@
+"""Tests for the streaming fallback path of reproject() (issue #3101).
+
+reproject() falls back to ``_reproject_streaming`` when the source is
+larger than 512 MB and dask is not importable. That branch never runs in
+CI organically (dask is installed and test rasters are small), so these
+tests call the helpers directly with small inputs and compare against the
+in-memory numpy path on the same output grid.
+"""
+from __future__ import annotations
+
+import importlib
+
+import numpy as np
+import pytest
+import xarray as xr
+
+try:
+    import pyproj  # noqa: F401
+    HAS_PYPROJ = True
+except ImportError:
+    HAS_PYPROJ = False
+
+pytestmark = pytest.mark.skipif(
+    not HAS_PYPROJ, reason="pyproj required for reproject tests"
+)
+
+# ``from xrspatial import reproject`` resolves to the function, not the
+# package, so load the package module explicitly.
+_reproject_mod = importlib.import_module('xrspatial.reproject')
+
+
+def _make_raster(data, crs='EPSG:4326', x_range=(-5, 5), y_range=(45, 55)):
+    h, w = data.shape[:2]
+    y = np.linspace(y_range[1], y_range[0], h)  # north-up (descending)
+    x = np.linspace(x_range[0], x_range[1], w)
+    dims = ['y', 'x'] if data.ndim == 2 else ['y', 'x', 'band']
+    coords = {'y': y, 'x': x}
+    if data.ndim == 3:
+        coords['band'] = np.arange(data.shape[2]) + 1
+    return xr.DataArray(
+        data, dims=dims, coords=coords,
+        attrs={'crs': crs, 'nodata': np.nan},
+    )
+
+
+def _streaming_setup(raster, target_crs='EPSG:32633'):
+    """Resolve the source/target geometry the way reproject() does."""
+    from xrspatial.reproject._crs_utils import _resolve_crs
+    from xrspatial.reproject._grid import _compute_output_grid
+
+    src_crs = _resolve_crs(raster.attrs['crs'])
+    tgt_crs = _resolve_crs(target_crs)
+    src_bounds = _reproject_mod._source_bounds(raster)
+    ydim, xdim = _reproject_mod._find_spatial_dims(raster)
+    src_shape = (raster.sizes[ydim], raster.sizes[xdim])
+    grid = _compute_output_grid(src_bounds, src_shape, src_crs, tgt_crs)
+    return src_crs, tgt_crs, src_bounds, src_shape, grid
+
+
+def _run_streaming(raster, tile_size, max_memory, resampling='bilinear',
+                   precision=16):
+    src_crs, tgt_crs, src_bounds, src_shape, grid = _streaming_setup(raster)
+    return _reproject_mod._reproject_streaming(
+        raster, src_bounds, src_shape, True,
+        src_crs.to_wkt(), tgt_crs.to_wkt(),
+        grid['bounds'], grid['shape'],
+        resampling, float('nan'), precision,
+        tile_size, _reproject_mod._parse_max_memory(max_memory),
+        x_desc=False, band_nodata=None,
+    )
+
+
+class TestParseMaxMemory:
+    """_parse_max_memory accepts ints and human-readable strings."""
+
+    @pytest.mark.parametrize('value,expected', [
+        (None, 1024 ** 3),  # default: 1 GB
+        (12345, 12345),  # int passthrough
+        (12345.6, 12345),  # float truncates to int
+        ('256KB', 256 * 1024),
+        ('512MB', 512 * 1024 ** 2),
+        ('4GB', 4 * 1024 ** 3),
+        ('2TB', 2 * 1024 ** 4),
+        ('1.5GB', int(1.5 * 1024 ** 3)),
+        (' 1gb ', 1024 ** 3),  # whitespace + lowercase
+        ('123', 123),  # bare numeric string
+    ])
+    def test_parse(self, value, expected):
+        assert _reproject_mod._parse_max_memory(value) == expected
+
+
+class TestStreamingMatchesInMemory:
+    """The streaming tile assembly must reproduce the in-memory result."""
+
+    def _reference(self, raster):
+        from xrspatial.reproject import reproject
+        return reproject(raster, 'EPSG:32633')
+
+    @staticmethod
+    def _max_concurrent(max_memory, tile_size):
+        # Mirror of the budget arithmetic in _reproject_streaming /
+        # _process_tile_batch so the tests can assert which batching
+        # branch they engage. If the tile_mem formula changes upstream,
+        # these assertions flag the re-routed branch instead of both
+        # tests silently exercising the same one.
+        tile_mem = tile_size * tile_size * 8 * 4
+        budget = _reproject_mod._parse_max_memory(max_memory)
+        return max(1, budget // max(tile_mem, 1))
+
+    def test_multi_tile_threaded(self):
+        # 64x64 output split into 32x32 tiles; 1 GB budget keeps the
+        # ThreadPoolExecutor branch active (max_concurrent >= 2).
+        #
+        # transform_precision=0 forces the exact pyproj path (one
+        # Transformer per worker thread). The numba fast-path kernels are
+        # parallel=True, and running them concurrently from this thread
+        # pool aborts the interpreter on numba's workqueue threading
+        # layer (macOS arm64 CI) -- that source bug is #3141. The numba
+        # path stays covered by the serial tests below.
+        assert self._max_concurrent('1GB', 32) >= 2
+        data = np.random.RandomState(42).rand(64, 64)
+        raster = _make_raster(data)
+        from xrspatial.reproject import reproject
+        ref = reproject(raster, 'EPSG:32633', transform_precision=0)
+        out = _run_streaming(raster, tile_size=32, max_memory='1GB',
+                             precision=0)
+        assert out.shape == ref.shape
+        # More than one tile, or the thread pool never gets a batch.
+        assert ref.shape[0] > 32 or ref.shape[1] > 32
+        np.testing.assert_allclose(out, ref.values, equal_nan=True)
+
+    def test_multi_tile_serial(self):
+        # max_memory=1 byte forces max_concurrent == 1, taking the serial
+        # per-job loop instead of the thread pool.
+        assert self._max_concurrent(1, 32) == 1
+        data = np.random.RandomState(7).rand(64, 64)
+        raster = _make_raster(data)
+        ref = self._reference(raster)
+        out = _run_streaming(raster, tile_size=32, max_memory=1)
+        assert out.shape == ref.shape
+        np.testing.assert_allclose(out, ref.values, equal_nan=True)
+
+    def test_single_tile(self):
+        # tile_size larger than the output grid: one job covers everything.
+        data = np.random.RandomState(3).rand(32, 32)
+        raster = _make_raster(data)
+        ref = self._reference(raster)
+        out = _run_streaming(raster, tile_size=4096, max_memory='1GB')
+        assert out.shape == ref.shape
+        np.testing.assert_allclose(out, ref.values, equal_nan=True)
+
+    def test_nearest_resampling_with_nan(self):
+        # NaN cells survive the streaming assembly the same way they do
+        # in-memory. max_memory=1 keeps this multi-tile run on the serial
+        # loop so the parallel=True numba kernels are never entered
+        # concurrently (#3141) while still covering the numba fast path.
+        from xrspatial.reproject import reproject
+        data = np.random.RandomState(11).rand(48, 48)
+        data[10:14, 20:24] = np.nan
+        raster = _make_raster(data)
+        ref = reproject(raster, 'EPSG:32633', resampling='nearest')
+        out = _run_streaming(raster, tile_size=20, max_memory=1,
+                             resampling='nearest')
+        assert out.shape == ref.shape
+        np.testing.assert_allclose(out, ref.values, equal_nan=True)
+
+
+class TestStreaming3D:
+    """3-D sources crash in the streaming assembly (issue #3100)."""
+
+    @pytest.mark.xfail(strict=True, raises=ValueError,
+                       reason="2-D output buffer vs 3-D tiles, see #3100")
+    def test_3d_source_streams(self):
+        from xrspatial.reproject import reproject
+        data = np.random.RandomState(5).rand(32, 32, 3)
+        raster = _make_raster(data)
+        # max_memory=1 -> serial loop, so the expected failure is the
+        # deterministic assembly ValueError, never the concurrent
+        # parallel-kernel abort from #3141.
+        out = _run_streaming(raster, tile_size=16, max_memory=1)
+        # Once #3100 is fixed the xfail comes off and the streaming
+        # result must match the in-memory 3-D path:
+        ref = reproject(raster, 'EPSG:32633')
+        assert out.shape == ref.shape
+        np.testing.assert_allclose(out, ref.values, equal_nan=True)