Replace random.randint port picks with OS-assigned ephemeral ports #703

Merged
gijzelaerr merged 1 commit into master from fix-flaky-port-collisions on Apr 21, 2026
@gijzelaerr
Owner

Summary

Fixes the Errno 98 ("Address already in use") flake that hit test_server_context_manager in the post-merge CI run for #701. Root cause: test fixtures picked TCP ports via random.randint over narrow ranges (5k-20k ports wide) without checking that the port was free. Under concurrent server starts or rapid re-runs, a draw can land on a port that is already in use.
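To make the failure mode concrete, here is a minimal sketch, not taken from the test suite, of what happens when two sockets end up on the same port. The function name `second_bind_errno` is hypothetical, and plain sockets stand in for the snap7 test server. The numeric value of EADDRINUSE is 98 on Linux and differs on other platforms, so the sketch compares against `errno.EADDRINUSE` rather than a literal.

```python
import errno
import socket
from typing import Optional


def second_bind_errno(host: str = "127.0.0.1") -> Optional[int]:
    """Bind two sockets to the same port; return the errno raised by the second bind.

    Hypothetical helper for illustration only. Returns None if the second
    bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The first "server" takes a port and starts listening on it.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # A second server that happens to draw the same port fails here.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print(second_bind_errno() == errno.EADDRINUSE)
```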

Fix: bind a throwaway socket to port 0, read the OS-assigned ephemeral port, close the socket, and pass the number to the test server. The ephemeral pool is tens of thousands of ports wide and the OS guarantees the port was free at pick time, so the collision window is much smaller than with the previous random draws. snap7.server already sets SO_REUSEADDR, so sockets lingering in TIME_WAIT from a prior run don't cause bind failures either.
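The steps above can be sketched roughly as follows. Only the name `get_free_tcp_port` comes from this PR; the `host` parameter, its default, and the body are assumptions about what such a helper might look like, not the code that was merged.

```python
import socket


def get_free_tcp_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently free TCP port and return its number.

    Sketch only: the signature and body are assumptions, not the merged helper.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the OS to pick any free ephemeral port.
        sock.bind((host, 0))
        # getsockname() returns (address, port) for an AF_INET socket.
        return sock.getsockname()[1]
    # The socket is closed on leaving the with-block, releasing the port.


if __name__ == "__main__":
    port = get_free_tcp_port()
    print(f"OS-assigned port: {port}")
```

The port is only guaranteed free at the moment of the bind. Another process could still claim it between the close and the server's own bind, which is why this narrows the collision window rather than eliminating it.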

One helper (get_free_tcp_port) lives in tests/conftest.py and is reused by test_s7_unified, test_optimizer, and test_stress. Added an empty tests/__init__.py so from .conftest import get_free_tcp_port works (the directory was already treated as a package by pytest; this just makes the relative import legal).

Test plan

  • Full suite: 1501 passed, 91 skipped
  • test_server_context_manager run 10 times in a row — 10/10 pass
  • mypy + ruff + pre-commit clean

🤖 Generated with Claude Code

The post-merge CI for #701 hit Errno 98 ("Address already in use") on
ubuntu-24.04 / py3.14 inside test_server_context_manager. Root cause:
tests picked ports via random.randint over narrow ranges (5k-20k
ports) with no collision check, so two concurrent server starts could
grab the same port, and re-runs on the same runner hit TIME_WAIT
lingers.

Fix: bind a throwaway socket to port 0, read the OS-assigned
ephemeral port, close, and pass it to the server. The ephemeral pool
is tens of thousands wide and the OS guarantees the port is free at
the moment of pick — much smaller collision window than a 1-in-5000
random draw. snap7.server already sets SO_REUSEADDR so TIME_WAIT
lingers don't bite either.

One helper (`get_free_tcp_port`) lives in tests/conftest.py and is
reused by test_s7_unified, test_optimizer, and test_stress. Added
tests/__init__.py so `from .conftest import ...` works.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gijzelaerr gijzelaerr merged commit 8125f93 into master Apr 21, 2026
39 checks passed