Remove flaky wall-clock VW small-input regression test #2914
Merged
test_visvalingam_whyatt_no_regression_small_inputs asserted the heap Visvalingam-Whyatt implementation runs less than 3x slower than the O(n^2) baseline, measured with time.perf_counter(). At n<=500 both finish in microseconds (heap=11us, o2=3us in the failing run), so the ratio is mostly scheduler noise on shared CI runners. It measured 3.34x on macos-latest 3.13 and failed the main pytest job. The large-input companion test (#2888) already dropped its wall-clock assertion for this reason. Correctness for these exact small sizes is covered by test_visvalingam_whyatt_heap_correctness, so removing the timing test loses no coverage.
Problem
The `pytest` job on `main` is failing (run 26927364653). `test_visvalingam_whyatt_no_regression_small_inputs` asserts that the heap-based Visvalingam-Whyatt implementation runs less than 3x slower than the O(n^2) baseline, measured with `time.perf_counter()`.
For the small inputs it tests (n up to 500), both implementations finish in microseconds. In the failing run heap took ~11us and O(n^2) took ~3us, so the ratio is mostly scheduler noise on shared runners. On macos-latest / 3.13 it measured 3.34x and tripped the `< 3.0` assertion.
Fix
Remove the test. Its only unique assertion is the wall-clock ratio, which can't be measured reliably at microsecond scale on shared CI runners.
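To illustrate why a ratio of two microsecond-scale timings is unstable, here is a minimal sketch that repeats the measurement and reports the spread. It is not code from this repository: `time_once`, `ratio_spread`, and the stand-in workloads (`sorted` and `sum`) are hypothetical names chosen for illustration, and the actual spread depends on the machine and its load.

```python
import time


def time_once(fn, arg):
    """Return the wall-clock duration of a single call, in seconds."""
    start = time.perf_counter()
    fn(arg)
    return time.perf_counter() - start


def ratio_spread(slow, fast, arg, trials=200):
    """Time `slow` and `fast` back to back and return (min, max) of slow/fast.

    Each call is measured once per trial, mirroring a single-shot
    wall-clock assertion. The denominator is floored at 1 ns so a
    zero reading from a coarse clock cannot divide by zero.
    """
    ratios = []
    for _ in range(trials):
        t_slow = time_once(slow, arg)
        t_fast = time_once(fast, arg)
        ratios.append(t_slow / max(t_fast, 1e-9))
    return min(ratios), max(ratios)


if __name__ == "__main__":
    data = list(range(500))  # same order of magnitude as the small inputs
    lo, hi = ratio_spread(sorted, sum, data)
    print(f"ratio range over 200 trials: {lo:.2f}x to {hi:.2f}x")
```

On a loaded machine the minimum and maximum can differ by a large factor, which is why a fixed threshold on a single measurement is likely to be tripped by noise.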
`test_visvalingam_whyatt_completes_large_inputs` already dropped its wall-clock assertion for the same reason (#2888, "Flaky CI: test_visvalingam_whyatt_scales_subquadratic fails on loaded runners (wall-clock timing assertion)").
Correctness for these small sizes is covered by `test_visvalingam_whyatt_heap_correctness`, which compares heap output against the O(n^2) baseline. No coverage is lost.
Verification
`TestSimplifyHelpers` passes locally (17 tests).
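For readers unfamiliar with the remaining coverage, the following is a rough sketch of what a deterministic heap-versus-baseline comparison can look like. It is an assumption-laden illustration, not the repository's implementation or its `test_visvalingam_whyatt_heap_correctness`: `triangle_area`, `vw_naive`, `vw_heap`, and `check_equivalence` are hypothetical names, the stopping rule (keep a fixed number of points) is assumed, and ties are broken by lowest index so both variants remove points in the same order.

```python
import heapq
import random


def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def vw_naive(points, keep):
    """O(n^2) baseline: rescan all interior points on every removal."""
    idx = list(range(len(points)))
    while len(idx) > max(keep, 2):
        best = min(
            range(1, len(idx) - 1),
            key=lambda k: (
                triangle_area(points[idx[k - 1]], points[idx[k]], points[idx[k + 1]]),
                idx[k],  # tie-break on original index
            ),
        )
        del idx[best]
    return [points[i] for i in idx]


def vw_heap(points, keep):
    """Heap variant with a linked list of neighbours and lazy deletion."""
    n = len(points)
    prev = list(range(-1, n - 1))
    nxt = list(range(1, n + 1))
    alive = [True] * n
    area = [None] * n
    heap = []
    for i in range(1, n - 1):
        area[i] = triangle_area(points[i - 1], points[i], points[i + 1])
        heap.append((area[i], i))
    heapq.heapify(heap)
    remaining = n
    while remaining > max(keep, 2) and heap:
        a, i = heapq.heappop(heap)
        if not alive[i] or a != area[i]:
            continue  # stale entry: point removed or area since recomputed
        alive[i] = False
        remaining -= 1
        p, q = prev[i], nxt[i]
        nxt[p], prev[q] = q, p
        for j in (p, q):
            if 0 < j < n - 1:  # endpoints are never removed
                area[j] = triangle_area(points[prev[j]], points[j], points[nxt[j]])
                heapq.heappush(heap, (area[j], j))
    return [points[i] for i in range(n) if alive[i]]


def check_equivalence(sizes=(3, 10, 100, 500), seed=0):
    """Compare both variants on seeded random input; no timing involved."""
    rng = random.Random(seed)
    for n in sizes:
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        assert vw_heap(pts, n // 2) == vw_naive(pts, n // 2)
```

Because the check compares outputs rather than durations, it gives the same result regardless of runner load.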