Raise clear ValueError for empty Dataset in stats() by brendancol · Pull Request #2642 · xarray-contrib/xarray-spatial

brendancol · 2026-05-29T19:34:39Z

stats() accepts an xarray Dataset, computes statistics for each data variable, and merges the per-variable results. If the Dataset has no data variables, the merge step evaluated result = dfs[0] on an empty list and raised an opaque IndexError that told the caller nothing about the cause.

This adds an early check in the Dataset branch: when values.data_vars is empty, raise a ValueError saying there is nothing to compute statistics over, before reaching the dfs[0] access.
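To illustrate the pattern, here is a minimal sketch that uses only the standard library. It is not the code in xrspatial/zonal.py: the function name stats_per_variable, the plain-mapping input, and the exact error wording are assumptions made for the example. Only the shape (length check first, then dfs[0]) follows the description above.

```python
from collections.abc import Mapping, Sequence
from statistics import mean


def stats_per_variable(data_vars: Mapping[str, Sequence[float]]) -> dict:
    """Hypothetical stand-in for the Dataset branch of stats()."""
    # Early guard: fail with a message that names the actual problem.
    # The wording here is illustrative, not the library's exact message.
    if len(data_vars) == 0:
        raise ValueError(
            "Dataset has no data variables; pass a Dataset with at least "
            "one data variable to compute statistics over."
        )

    # One result per variable, analogous to the per-variable results
    # that the real function merges.
    dfs = [
        {f"{name}_mean": mean(vals), f"{name}_max": max(vals)}
        for name, vals in data_vars.items()
    ]

    # Without the guard above, this access raises IndexError on empty input.
    result = dict(dfs[0])
    for extra in dfs[1:]:
        result.update(extra)
    return result


try:
    stats_per_variable({})
except ValueError as exc:
    message = str(exc)
```

Placing the check ahead of the per-variable loop means the caller sees the same error regardless of what the loop would have dispatched to.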

Guard the empty-Dataset case in the stats() Dataset branch with a clear ValueError.
Add a regression test that passes an empty Dataset and asserts ValueError.

Backend coverage: the Dataset branch is shared by all backends and the guard runs before any backend dispatch, so numpy / cupy / dask+numpy / dask+cupy are all covered.

Test plan:

New test test_stats_empty_dataset_raises_value_error_2637 passes
Existing Dataset return_type test still passes

Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
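A rough sketch of that normalization, using only the csv module. The column names module and notes, the sample rows, and the helper name normalize_rows are assumptions for illustration; only last_inspected is named in the commit message, and the actual file layout is not shown here.

```python
import csv
import io


def normalize_rows(rows, key="module", order_by="last_inspected",
                   note_field="notes"):
    """Keep one row per key (latest order_by wins) and flatten notes."""
    latest = {}
    for row in rows:
        row = dict(row)
        # Collapse embedded newlines so each record is one physical line.
        row[note_field] = " ".join(row.get(note_field, "").split())
        current = latest.get(row[key])
        # ISO-8601 dates compare correctly as strings; ties go to the
        # row that appears later in the file.
        if current is None or row[order_by] >= current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical input with a duplicated module and multi-line notes.
raw = (
    "module,last_inspected,notes\n"
    'zonal,2026-05-01,"first pass\nneeds tests"\n'
    "slope,2026-05-10,ok\n"
    'zonal,2026-05-20,"second pass\nguard added"\n'
)

rows = normalize_rows(csv.DictReader(io.StringIO(raw)))

out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["module", "last_inspected", "notes"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
cleaned = out.getvalue()
```

With every record on a single physical line, a line-based union merge can no longer split a record in the middle of a quoted field.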

brendancol

PR Review: Raise clear ValueError for empty Dataset in stats()

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

xrspatial/zonal.py:861 -- len(values.data_vars) == 0 works and is clear. if not values.data_vars: would be slightly more idiomatic since the data_vars mapping is falsy when empty, but the explicit length check reads fine and matches the surrounding style. Take it or leave it.
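For reference, the two spellings agree for any object that defines __len__ and no __bool__, which is how Python's truth-value testing treats mappings. The sketch below uses a hypothetical VarsView class as a stand-in for data_vars so it runs without xarray installed; it does not exercise xarray itself.

```python
from collections.abc import Mapping


class VarsView(Mapping):
    """Hypothetical read-only mapping standing in for Dataset.data_vars."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


def is_empty_explicit(data_vars):
    # The spelling used in the PR.
    return len(data_vars) == 0


def is_empty_idiomatic(data_vars):
    # Objects with __len__ and no __bool__ are falsy when len() == 0.
    return not data_vars


empty = VarsView({})
filled = VarsView({"elevation": [1, 2, 3]})
```

Both helpers return the same answer for empty and filled, so the choice is purely stylistic.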

What looks good

The guard is in the right place: after the return_type check and before the dfs[0] access, so the empty case is caught early with a message that names the actual problem (no data variables).
The error message tells the caller what to do (pass a Dataset with at least one data variable), not just what went wrong.
The regression test reuses the existing small_zones_values_2558 fixture and asserts on the "no data variables" text, so the old IndexError would not satisfy it.
Scope is tight: two files, no unrelated edits.
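The point about the test asserting on the message text can be sketched with the standard library alone. This is not the actual test in xrspatial/tests/test_zonal.py (which uses the small_zones_values_2558 fixture, not reproduced here); old_behavior, new_behavior, and passes_regression_check are hypothetical names for the example.

```python
import unittest


def old_behavior(data_vars):
    # Mirrors the pre-fix failure mode: index into an empty list.
    dfs = list(data_vars.values())
    return dfs[0]


def new_behavior(data_vars):
    # Mirrors the fix: validate first, with illustrative wording.
    if len(data_vars) == 0:
        raise ValueError("Dataset has no data variables")
    return old_behavior(data_vars)


def passes_regression_check(fn):
    """True only if fn({}) raises ValueError mentioning 'no data variables'."""
    case = unittest.TestCase()
    try:
        with case.assertRaisesRegex(ValueError, "no data variables"):
            fn({})
    except (AssertionError, IndexError):
        # Wrong exception type, wrong message, or no exception at all.
        return False
    return True
```

Here passes_regression_check(old_behavior) is False because the IndexError is not a ValueError, while passes_regression_check(new_behavior) is True, so a test of this shape cannot be satisfied by the old failure.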

Checklist

Algorithm matches reference/paper: n/a (input validation fix)
All implemented backends produce consistent results: yes, guard runs before backend dispatch
NaN handling is correct: n/a
Edge cases are covered by tests: yes, empty Dataset is the case in question
Dask chunk boundaries handled correctly: n/a
No premature materialization or unnecessary copies: yes, the data_vars length check is cheap
Benchmark exists or is not needed: not needed
README feature matrix updated: not needed, no new public API
Docstrings present and accurate: stats() docstring unchanged, still accurate

brendancol added 2 commits May 29, 2026 08:59

Raise clear ValueError for empty Dataset in stats() (#2637)

32efc57

github-actions bot added the "performance" label (PR touches performance-sensitive code) on May 29, 2026

brendancol commented May 29, 2026

brendancol added 2 commits May 29, 2026 12:36

Merge remote-tracking branch 'origin/main' into issue-2637

09f7c89

# Conflicts: # xrspatial/tests/test_zonal.py

Drop unrelated sweep-test-coverage-state.csv from this PR (#2637)

89d88b7

brendancol merged commit cbae648 into main May 29, 2026
7 checks passed

brendancol merged 4 commits into main from issue-2637
