
Remove flaky wall-clock VW small-input regression test #2914

Merged: brendancol merged 1 commit into main from worktree-fix-vw-timing-flake on Jun 4, 2026
Conversation

@brendancol

Contributor

Problem

The pytest job on main is failing (run 26927364653). test_visvalingam_whyatt_no_regression_small_inputs asserts the heap-based Visvalingam-Whyatt implementation runs less than 3x slower than the O(n^2) baseline, measured with time.perf_counter().
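The measurement pattern described above can be sketched as follows. This is an illustration only, not the removed test: `baseline_stand_in` and `candidate_stand_in` are hypothetical placeholders for the O(n^2) and heap implementations, and `time_once` is a helper invented here. The sketch deliberately computes the ratio without asserting on it.

```python
import time


def time_once(fn, arg):
    """Return wall-clock seconds for one call, measured with time.perf_counter()."""
    start = time.perf_counter()
    fn(arg)
    return time.perf_counter() - start


# Hypothetical stand-ins, NOT the project's simplification functions.
def baseline_stand_in(points):
    return sorted(points)


def candidate_stand_in(points):
    return sorted(sorted(points), reverse=True)


points = list(range(100))  # n=100, one of the small sizes mentioned above
t_baseline = time_once(baseline_stand_in, points)
t_candidate = time_once(candidate_stand_in, points)

# Guard against a zero reading on coarse clocks before dividing.
ratio = t_candidate / max(t_baseline, 1e-9)
print(f"candidate is {ratio:.2f}x the baseline "
      f"(candidate={t_candidate:.6f}s, baseline={t_baseline:.6f}s)")
```

Both readings come from a single call each, so any pause between the two `perf_counter()` reads lands directly in the ratio.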

For the small inputs it tests (n up to 500), both implementations finish in microseconds. In the failing run heap took ~11us and O(n^2) took ~3us, so the ratio is mostly scheduler noise on shared runners. On macos-latest / 3.13 it measured 3.34x and tripped the < 3.0 assertion:

AssertionError: Heap is 3.34x slower than O(n^2) for n=100, expected < 3x (heap=0.000011s, o2=0.000003s)
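A quick arithmetic check on those figures shows how little margin the threshold had. The sketch below assumes the timings in the message are rounded to the nearest microsecond (an assumption based on the six printed decimals); `ratio_bounds` is a helper written for this illustration.

```python
THRESHOLD = 3.0


def ratio_bounds(heap_us, o2_us, half_step=0.5):
    """Range of heap/o2 ratios consistent with timings rounded to 1 us."""
    lowest = (heap_us - half_step) / (o2_us + half_step)
    highest = (heap_us + half_step) / (o2_us - half_step)
    return lowest, highest


low, high = ratio_bounds(11, 3)
print(f"{low:.2f} to {high:.2f}")  # 3.00 to 4.60

# Holding heap at 11 us, one extra microsecond on the baseline flips the outcome.
print(11 / 3 > THRESHOLD)  # True  (assertion fails)
print(11 / 4 > THRESHOLD)  # False (assertion passes)
```

The reported 3.34x sits inside that range, and a shift of about one microsecond in either reading is enough to move the result across the 3x line.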

Fix

Remove the test. Its only unique assertion is the wall-clock ratio, which can't be measured reliably at microsecond scale on shared CI runners.

Verification

TestSimplifyHelpers passes locally (17 tests).

test_visvalingam_whyatt_no_regression_small_inputs asserted the heap
Visvalingam-Whyatt implementation runs less than 3x slower than the
O(n^2) baseline, measured with time.perf_counter(). At n<=500 both
finish in microseconds (heap=11us, o2=3us in the failing run), so the
ratio is mostly scheduler noise on shared CI runners. It measured 3.34x
on macos-latest 3.13 and failed the main pytest job.

The large-input companion test (#2888) already dropped its wall-clock
assertion for this reason. Correctness for these exact small sizes is
covered by test_visvalingam_whyatt_heap_correctness, so removing the
timing test loses no coverage.
github-actions bot added the performance label (PR touches performance-sensitive code) on Jun 4, 2026
brendancol merged commit 5f8c5ce into main on Jun 4, 2026
9 checks passed
