Skip to content

Make shapely an optional [vector] extra (#2496)#2497

Merged
brendancol merged 2 commits into
mainfrom
issue-2496
May 27, 2026
Merged

Make shapely an optional [vector] extra (#2496)#2497
brendancol merged 2 commits into
mainfrom
issue-2496

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #2496

shapely (and GEOS) loaded on every import xrspatial because rasterize.py
imported it at module top level, even though only rasterize and polygonize
use it. This moves it to an optional vector extra, matching the matplotlib
change in #2494.

  • Replace the top-level import shapely in rasterize.py with a cached
    _require_shapely() helper. Each function that uses the shapely array API
    binds the module locally, so dask tile workers (which call the tile helpers
    directly, not through rasterize) also get the clear error. rasterize()
    calls it up front so a missing install fails early with a message pointing at
    pip install xarray-spatial[vector].
  • Remove shapely>=2.0 from install_requires; add a vector extra and keep
    shapely in the tests extra so CI still exercises the vector paths.
  • polygonize already imports shapely lazily, and only on its geopandas
    return path (which needs geopandas anyway), so it needs no change.
  • Update README, install docs, and CHANGELOG.

This is a packaging change, not backend-specific. The rasterize backends
(numpy / cupy / dask+numpy / dask+cupy) are unchanged; shapely is only used for
geometry classification and coordinate extraction, which feed all backends.

Test plan

  • New test_optional_shapely.py (3 passed): import xrspatial works in a
    subprocess with shapely blocked; _require_shapely() and rasterize() raise
    the friendly error pointing at the vector extra.
  • pytest xrspatial/tests/test_rasterize.py xrspatial/tests/test_polygonize.py
    (356 passed, 15 skipped) with shapely installed.
  • Confirmed import xrspatial no longer loads shapely (shapely absent
    from sys.modules) and a real rasterize still produces correct output.

shapely (and GEOS) loaded on every import xrspatial because rasterize.py
imported it at module top level. Import it lazily via a cached
_require_shapely() helper, bound locally in each function that uses the
shapely array API (so dask tile workers, which call the helpers directly,
also get the friendly error). rasterize() calls it up front for an early,
clear failure.

Remove shapely>=2.0 from install_requires; add a vector extra and keep
shapely in the tests extra. polygonize already imports shapely lazily and
only on its geopandas return path, so it needs no change. Update README,
install docs, and CHANGELOG.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 27, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Make shapely an optional [vector] extra (#2496)

Packaging change plus lazy-import plumbing. The rasterize algorithm and its
four backends are untouched, so the numeric/backend checklist mostly doesn't
apply here. I focused on whether every shapely use path is actually guarded.

Blockers

None.

Suggestions

None blocking.

Nits

  • polygonize.py still does a bare import shapely inside _to_geopandas
    (around line 609). Without shapely that raises a plain ImportError rather than
    the [vector]-pointing message. In practice geopandas is imported first on
    that same path and geopandas requires shapely, so the user hits geopandas's
    own error first -- but the messaging is slightly inconsistent with the new
    rasterize behavior. Optional to route it through _require_shapely for a
    uniform message.
  • test_optional_shapely.py::test_import_xrspatial_without_shapely imports
    xrspatial.rasterize in the subprocess but not xrspatial.polygonize. Adding
    polygonize to that import list would lock in that it also imports clean
    without shapely.

What looks good

  • Every bare shapely. reference (13 of them) sits in a function that binds
    shapely = _require_shapely() first; rasterize() calls the guard up front
    for an early, clear failure. Verified each call site against its enclosing
    function.
  • The dask path is handled correctly: tile workers call _polys_from_wkb /
    _classify_geometries directly, not through rasterize(), and those helpers
    bind via _require_shapely() so the worker either imports shapely or raises
    the friendly error. The cache lives in a module global, so each worker
    process resolves it once.
  • Confirmed import xrspatial no longer loads shapely (absent from
    sys.modules), and a real rasterize still produces correct output.
  • shapely kept in the tests extra, so CI still exercises the vector paths;
    356 rasterize/polygonize tests pass.
  • New tests cover the subprocess import, the helper message, and the up-front
    rasterize() error, and they reset the _shapely cache so the in-process
    checks are real.

Checklist

  • Every shapely use path guarded (rasterize + dask helpers)
  • import xrspatial no longer loads shapely
  • Friendly ImportError points at the vector extra
  • tests extra keeps shapely so CI exercises vector paths
  • Docs (README, installation.rst) and CHANGELOG updated
  • polygonize geopandas path uses the same [vector] message (nit)
  • No backend/algorithm change; existing rasterize tests pass

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review (after 7047f03)

Disposition of the two nits from the first pass:

  • Nit 2 (subprocess import test): fixed. test_import_xrspatial_without_shapely
    now also imports xrspatial.polygonize with shapely blocked, locking in that
    it imports clean too. 3 passed.
  • Nit 1 (route polygonize's geopandas path through _require_shapely):
    dismissed. On that path import geopandas runs first, and geopandas
    hard-requires shapely, so a "geopandas installed but shapely missing" state is
    not reachable. Adding a cross-module import for a message that can't surface
    isn't worth it.

No remaining items beyond CI.

@brendancol brendancol merged commit 0a51f50 into main May 27, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move shapely to an optional [vector] extra so it isn't loaded on import

1 participant