reproject: numpy chunk worker runs try_numba_transform twice per chunk on pyproj-fallback CRS pairs #3106

@brendancol

Problem

_reproject_chunk_numpy (xrspatial/reproject/__init__.py) calls try_numba_transform to check for a numba fast path. When the CRS pair has no fast path, the call returns None and the worker falls back to _transform_coords, which calls try_numba_transform a second time before building the pyproj control grid. Both calls do the same work, and both return None.
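The call pattern described above can be reproduced with a toy model. This is only a sketch: the function names match the ones in this issue, but the bodies, the simplified signatures, and the string placeholder for the pyproj result are assumptions made for illustration, not the xrspatial code.

```python
# Toy model of the pre-fix call pattern. Bodies and signatures are
# illustrative stand-ins, not the real xrspatial implementation.
calls = {"try_numba_transform": 0}


def try_numba_transform(src_crs, tgt_crs):
    """Stand-in: pretend this CRS pair has no numba fast path."""
    calls["try_numba_transform"] += 1
    return None


def _transform_coords(src_crs, tgt_crs):
    """Stand-in: retry the fast path, then fall back to pyproj."""
    if src_crs is not None and tgt_crs is not None:
        result = try_numba_transform(src_crs, tgt_crs)
        if result is not None:
            return result
    return "pyproj-control-grid"  # placeholder for the fallback result


def _reproject_chunk_numpy(src_crs, tgt_crs):
    """Stand-in for the worker as described in this issue (before the fix)."""
    result = try_numba_transform(src_crs, tgt_crs)
    if result is None:
        # The CRS pair is forwarded, so the fast-path check runs again.
        result = _transform_coords(src_crs, tgt_crs)
    return result


out = _reproject_chunk_numpy("EPSG:4326", "ESRI:54009")
print(out, calls["try_numba_transform"])  # pyproj-control-grid 2
```

The counter reaching 2 for a single chunk is the duplicated work this issue is about.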

Each wasted call re-parses CRS parameters (around ten pyproj to_dict() / to_authority() round-trips across the per-projection param extractors) and allocates four chunk-sized float64 arrays (np.tile, np.repeat, two np.empty) before concluding there is no fast path.
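To give a sense of the allocation cost, the sketch below builds four chunk-sized float64 arrays with the same NumPy calls named above and totals their size. It assumes the arrays are flat with one element per output pixel; the variable names and the coordinate ranges are made up for the example.

```python
import numpy as np

# Assumed chunk shape, matching the 512x512 measurement in this issue.
h, w = 512, 512

# Hypothetical 1-D coordinate axes; the values are irrelevant to the size.
xs = np.linspace(-180.0, 180.0, w)
ys = np.linspace(90.0, -90.0, h)

# Four chunk-sized float64 arrays: tile, repeat, and two empty outputs.
flat_x = np.tile(xs, h)      # x coordinates repeated once per row
flat_y = np.repeat(ys, w)    # each y coordinate repeated across its row
out_x = np.empty(h * w, dtype=np.float64)
out_y = np.empty(h * w, dtype=np.float64)

total_bytes = sum(a.nbytes for a in (flat_x, flat_y, out_x, out_y))
print(total_bytes / 2**20)  # 8.0 (MiB)
```

Under these assumptions each wasted call allocates about 8 MiB for a 512x512 chunk, in addition to the CRS parameter parsing.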

Measured on a 512x512 chunk, EPSG:4326 -> ESRI:54009 (Mollweide, no fast path):

  • wasted try_numba_transform call: ~0.3-0.5 ms
  • _transform_coords with the retry: 1.57 ms; without: 1.06 ms
  • full chunk worker: 5.3 ms, of which ~11% is the duplicated call

The waste repeats for every output chunk on the dask+numpy path, and merge's per-block adapter routes through the same worker.

Fix

In _reproject_chunk_numpy, pass src_crs=None, tgt_crs=None to _transform_coords. The inner retry is gated on both being non-None, so this skips it. The cupy workers (_reproject_chunk_cupy and the _reproject_dask_cupy CPU fallback) must keep passing the CRS objects: by that point they have only tried the CUDA path, so the retry inside _transform_coords is their first and only numba attempt.

Found by the performance sweep audit of xrspatial/reproject. OOM verdict for the module: SAFE; bottleneck: compute-bound.

Labels: dask, enhancement, performance, severity:medium, sweep-performance
