Hoist proximity line-sweep kernel to module level to stop per-call recompiles (#3103) #3126

Merged: brendancol merged 2 commits into main from deep-sweep-performance-proximity-2026-06-09-01 on Jun 10, 2026
Conversation

@brendancol

Contributor

Closes #3103

  • Hoists _process_numpy_linesweep from a closure inside _process to module level. The former closure variables (target_values, max_distance, distance_metric, process_mode) are now arguments, so numba compiles once per signature and reuses it across calls instead of recompiling the kernel on every proximity() call.
  • The kernel body moved verbatim (dedented); the only code change is the signature and the call site in _process_numpy.
  • Adds a regression test that repeated proximity() calls don't grow the dispatcher's signature list.
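The mechanism behind the first bullet can be sketched without numba. The block below is a toy model, not the xrspatial code: `ToyDispatcher` and `toy_jit` are hypothetical stand-ins for a JIT dispatcher that "compiles" once per argument-type signature, and the kernel body is a placeholder filter. It only illustrates why decorating a function inside another function rebuilds the compile cache on every outer call, while a module-level kernel keeps one cache.

```python
class ToyDispatcher:
    """Toy stand-in for a JIT dispatcher (an assumption, not numba's class)."""

    compile_count = 0  # total pretend-compilations across all dispatchers

    def __init__(self, func):
        self.func = func
        self.signatures = []  # argument-type signatures compiled so far

    def __call__(self, *args):
        sig = tuple(type(a) for a in args)
        if sig not in self.signatures:
            # Pretend this is the expensive compile step.
            ToyDispatcher.compile_count += 1
            self.signatures.append(sig)
        return self.func(*args)


def toy_jit(func):
    return ToyDispatcher(func)


# Before: the kernel is a closure, so every call to process_closure
# creates a brand-new dispatcher with an empty signature cache.
def process_closure(values, max_distance):
    @toy_jit
    def kernel(vals):
        return [v for v in vals if v <= max_distance]

    return kernel(values)


# After: the kernel lives at module level and the former closure
# variable is passed as an argument, so one dispatcher is reused.
@toy_jit
def kernel_hoisted(vals, max_distance):
    return [v for v in vals if v <= max_distance]


def process_hoisted(values, max_distance):
    return kernel_hoisted(values, float(max_distance))


def count_compiles(fn, calls=3):
    """Call fn repeatedly and report how many pretend-compiles happened."""
    ToyDispatcher.compile_count = 0
    for _ in range(calls):
        fn([1.0, 5.0, 9.0], 6.0)
    return ToyDispatcher.compile_count
```

In this toy, the closure variant pays the compile step on each outer call, while the hoisted variant pays it once per signature for the life of the module.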

Warm-call timings on the numpy backend: 10x10 raster 0.44s -> ~1ms, 1000x1000 raster 0.49s -> ~35ms. The ~0.42s constant was recompilation, not compute.
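As a quick check on the timings above, the arithmetic can be written out. The inputs are the figures quoted in this description (the "after" values are approximate); the dictionary layout and the `summarize` helper are just for illustration.

```python
# Warm-call timings quoted above, in seconds ("after" values are approximate).
timings = {
    "10x10": {"before": 0.44, "after": 0.001},
    "1000x1000": {"before": 0.49, "after": 0.035},
}


def summarize(table):
    """Per raster size: seconds saved per call and the speedup factor."""
    out = {}
    for size, t in table.items():
        out[size] = {
            "saved_s": round(t["before"] - t["after"], 3),
            "speedup": round(t["before"] / t["after"], 1),
        }
    return out


summary = summarize(timings)
```

The time saved per call comes out nearly the same for both sizes even though the larger raster has 10,000 times as many cells, which is what a fixed per-call overhead looks like, as opposed to a cost that scales with the data.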

Backend coverage: numpy and dask+numpy are the affected paths (the line-sweep only runs in PROXIMITY mode with EUCLIDEAN/MANHATTAN). cupy and dask+cupy are untouched but their parity tests were run.

Test plan:

  • pytest xrspatial/tests/test_proximity.py — 411 passed on a CUDA host (cupy and dask+cupy parity tests included)
  • New test test_proximity_linesweep_compiled_once guards the module-level dispatcher

@brendancol left a comment

Contributor Author

Review

Blockers

None.

Suggestions

None.

Nits

  • xrspatial/tests/test_proximity.py (new test at end of file): the regression test relies on numba's Dispatcher.signatures attribute. It has been stable for years, but if numba ever drops it the test will fail with an AttributeError rather than a clear message. Fine to leave as is; noting it for the record.
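If the nit above is ever worth acting on, one option is a small guard so a missing attribute produces a readable failure. This is a sketch only: `signature_count` is a hypothetical helper, not something in the PR, and it assumes nothing about the dispatcher beyond an optional `signatures` attribute.

```python
def signature_count(dispatcher):
    """Return how many signatures a dispatcher-like object has compiled.

    Raises a descriptive error instead of a bare AttributeError when the
    object does not expose a `signatures` attribute.
    """
    signatures = getattr(dispatcher, "signatures", None)
    if signatures is None:
        raise RuntimeError(
            f"{type(dispatcher).__name__} has no 'signatures' attribute; "
            "the compile-once check needs updating for this numba version"
        )
    return len(signatures)
```

Inside a pytest test, the `raise` could be swapped for `pytest.skip(...)` if skipping is preferred over failing.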

What was checked

  • Extracted the old closure body from HEAD~1, dedented it, and diffed against the new module-level kernel: exact match. The only code changes are the signature, the _process_numpy call site, and one comment.
  • The former closure variables became arguments with the same runtime values (np.float64(max_distance) matches the captured float; target_values is the same np.asarray result), so numerical behavior is unchanged.
  • All 411 proximity tests pass on a CUDA host, covering numpy/cupy/dask+numpy/dask+cupy parity, tie-breaking, and metric validation.
  • Measured effect matches the issue: warm 10x10 call 0.44s -> ~1ms, 1000x1000 0.49s -> ~35ms, and the dispatcher holds a single signature across repeated calls.
  • The regression test pins the fix structurally (signature-count stability) instead of a flaky wall-clock bound.
  • Benchmark already exists at benchmarks/benchmarks/proximity.py; no API or backend-support change, so README and docs need no update.
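The dedent-and-diff check from the first bullet can be approximated with the standard library. The two source strings below are hypothetical placeholders, not the real kernel; in the actual review the old body came from HEAD~1. `body_diff` is an illustrative helper name.

```python
import difflib
import textwrap

# Placeholder for a closure body as it appeared when nested (extra indent).
OLD_CLOSURE_BODY = """\
        total = 0.0
        for v in values:
            if v <= max_distance:
                total += v
        return total
"""

# Placeholder for the same body after hoisting to module level.
NEW_MODULE_BODY = """\
    total = 0.0
    for v in values:
        if v <= max_distance:
            total += v
    return total
"""


def body_diff(old, new):
    """Dedent both bodies and return unified-diff lines (empty if identical)."""
    old_lines = textwrap.dedent(old).splitlines()
    new_lines = textwrap.dedent(new).splitlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm="")
    )
```

An empty list means the bodies match once indentation is normalized; any logic change shows up as diff lines.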

@github-actions (bot) added the performance label (PR touches performance-sensitive code) on Jun 9, 2026
…e-proximity-2026-06-09-01

# Conflicts:
#	xrspatial/tests/test_proximity.py
@brendancol merged commit 9678e04 into main on Jun 10, 2026
7 checks passed
Labels

performance (PR touches performance-sensitive code)

Development

Successfully merging this pull request may close these issues.

proximity: line-sweep kernel recompiles on every call (~0.4s overhead)