Remove flaky wall-clock VW small-input regression test by brendancol · Pull Request #2914 · xarray-contrib/xarray-spatial

brendancol · 2026-06-04T14:32:07Z

Problem

The pytest job on main is failing (run 26927364653). test_visvalingam_whyatt_no_regression_small_inputs asserts the heap-based Visvalingam-Whyatt implementation runs less than 3x slower than the O(n^2) baseline, measured with time.perf_counter().

For the small inputs it tests (n up to 500), both implementations finish in microseconds. In the failing run heap took ~11us and O(n^2) took ~3us, so the ratio is mostly scheduler noise on shared runners. On macos-latest / 3.13 it measured 3.34x and tripped the < 3.0 assertion:

AssertionError: Heap is 3.34x slower than O(n^2) for n=100, expected < 3x (heap=0.000011s, o2=0.000003s)
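The timings in that message are rounded to whole microseconds, and the 3.34x figure was presumably computed from the unrounded values. A rough sketch of how sensitive the ratio is at this scale, assuming each measurement can be off by a fixed amount in either direction (the jitter values and the helper name ratio_bounds are illustrative assumptions, not numbers from the run):

```python
def ratio_bounds(heap_us, o2_us, jitter_us):
    """Return (lowest, highest) heap/o2 ratio if each timing is off by up to jitter_us."""
    lowest = (heap_us - jitter_us) / (o2_us + jitter_us)
    highest = (heap_us + jitter_us) / (o2_us - jitter_us)
    return lowest, highest


# Rounded timings from the failing run: heap ~11 us, O(n^2) ~3 us.
for jitter in (0.5, 1.0):
    lo, hi = ratio_bounds(11.0, 3.0, jitter)
    print(f"jitter +/-{jitter} us: ratio between {lo:.2f}x and {hi:.2f}x")
# jitter +/-0.5 us: ratio between 3.00x and 4.60x
# jitter +/-1.0 us: ratio between 2.50x and 6.00x
```

Under these assumptions, a shift of one microsecond or less on either measurement is enough to move the ratio across the 3.0 threshold in either direction.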

Fix

Remove the test. Its only unique assertion is the wall-clock ratio, which can't be measured reliably at microsecond scale on shared CI runners.

- The large-input companion test test_visvalingam_whyatt_completes_large_inputs already dropped its wall-clock assertion for the same reason (see #2888, "Flaky CI: test_visvalingam_whyatt_scales_subquadratic fails on loaded runners (wall-clock timing assertion)").
- Correctness for these exact small sizes (5, 10, 50, 100, 500) is already covered by test_visvalingam_whyatt_heap_correctness, which compares heap output against the O(n^2) baseline. No coverage is lost.
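For readers unfamiliar with this kind of check, here is a minimal, self-contained sketch of comparing a heap-based Visvalingam-Whyatt simplification against an O(n^2) baseline at the same sizes. It is not the xarray-spatial implementation or its test: the functions vw_quadratic, vw_heap, and tri_area, the keep parameter, and the random inputs are all hypothetical stand-ins chosen for illustration.

```python
import heapq
import random


def tri_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def vw_quadratic(points, keep):
    """O(n^2) baseline: rescan all interior points for the smallest area each round."""
    pts = list(points)
    while len(pts) > max(keep, 2):
        areas = [tri_area(pts[i - 1], pts[i], pts[i + 1]) for i in range(1, len(pts) - 1)]
        del pts[areas.index(min(areas)) + 1]
    return pts


def vw_heap(points, keep):
    """Heap version: linked-list neighbours plus lazy invalidation of stale entries."""
    n = len(points)
    prev, nxt = list(range(-1, n - 1)), list(range(1, n + 1))
    alive, version = [True] * n, [0] * n
    heap = [(tri_area(points[i - 1], points[i], points[i + 1]), i, 0) for i in range(1, n - 1)]
    heapq.heapify(heap)
    remaining = n
    while remaining > max(keep, 2) and heap:
        _, i, ver = heapq.heappop(heap)
        if not alive[i] or ver != version[i]:
            continue  # stale entry: point was removed or its area was recomputed
        alive[i] = False
        remaining -= 1
        p, q = prev[i], nxt[i]
        nxt[p], prev[q] = q, p
        for j in (p, q):
            if 0 < j < n - 1:  # endpoints are never candidates for removal
                version[j] += 1
                area = tri_area(points[prev[j]], points[j], points[nxt[j]])
                heapq.heappush(heap, (area, j, version[j]))
    return [points[i] for i in range(n) if alive[i]]


# Deterministic check: same output from both versions, no timing involved.
rng = random.Random(0)
for n in (5, 10, 50, 100, 500):
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    assert vw_heap(pts, n // 2) == vw_quadratic(pts, n // 2)
```

A comparison like this depends only on the inputs, so it gives the same result on a loaded runner as on an idle one, which is why it can replace a timing-based assertion without becoming flaky.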

Verification

TestSimplifyHelpers passes locally (17 tests).

test_visvalingam_whyatt_no_regression_small_inputs asserted the heap Visvalingam-Whyatt implementation runs less than 3x slower than the O(n^2) baseline, measured with time.perf_counter(). At n<=500 both finish in microseconds (heap=11us, o2=3us in the failing run), so the ratio is mostly scheduler noise on shared CI runners. It measured 3.34x on macos-latest 3.13 and failed the main pytest job. The large-input companion test (#2888) already dropped its wall-clock assertion for this reason. Correctness for these exact small sizes is covered by test_visvalingam_whyatt_heap_correctness, so removing the timing test loses no coverage.

github-actions bot added the "performance" label (PR touches performance-sensitive code) on Jun 4, 2026

brendancol merged commit 5f8c5ce into main Jun 4, 2026
9 checks passed

brendancol merged 1 commit into main from worktree-fix-vw-timing-flake
