Raise clear ValueError for empty Dataset in stats()#2642
Merged
Conversation
Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
brendancol
commented
May 29, 2026
brendancol
left a comment
Contributor
Author
There was a problem hiding this comment.
PR Review: Raise clear ValueError for empty Dataset in stats()
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- xrspatial/zonal.py:861 --
len(values.data_vars) == 0works and is clear.if not values.data_vars:would be slightly more idiomatic since the data_vars mapping is falsy when empty, but the explicit length check reads fine and matches the surrounding style. Take it or leave it.
What looks good
- The guard is in the right place: after the return_type check and before the
dfs[0]access, so the empty case is caught early with a message that names the actual problem (no data variables). - The error message tells the caller what to do (pass a Dataset with at least one data variable), not just what went wrong.
- The regression test reuses the existing
small_zones_values_2558fixture and asserts on the "no data variables" text, so the old IndexError would not satisfy it. - Scope is tight: two files, no unrelated edits.
Checklist
- Algorithm matches reference/paper: n/a (input validation fix)
- All implemented backends produce consistent results: yes, guard runs before backend dispatch
- NaN handling is correct: n/a
- Edge cases are covered by tests: yes, empty Dataset is the case in question
- Dask chunk boundaries handled correctly: n/a
- No premature materialization or unnecessary copies: yes, the data_vars length check is cheap
- Benchmark exists or is not needed: not needed
- README feature matrix updated: not needed, no new public API
- Docstrings present and accurate: stats() docstring unchanged, still accurate
# Conflicts: # xrspatial/tests/test_zonal.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2637
stats()accepts an xarray Dataset and runs per-variable, then merges the results. If the Dataset has no data variables, the merge hitresult = dfs[0]on an empty list and raised an opaqueIndexErrorthat told the caller nothing.This adds an early check in the Dataset branch: when
values.data_varsis empty, raise aValueErrorsaying there is nothing to compute statistics over, before reaching thedfs[0]access.stats()Dataset branch with a clearValueError.ValueError.Backend coverage: the Dataset branch is shared by all backends and the guard runs before any backend dispatch, so numpy / cupy / dask+numpy / dask+cupy are all covered.
Test plan:
test_stats_empty_dataset_raises_value_error_2637passes