test: Fix flaky test_autoscales by replacing fixed sleeps with poll-based assertions #1835

Merged
vdusek merged 4 commits into apify:master from Lidang-Jiang:fix/flaky-test-autoscales
Apr 14, 2026

@Lidang-Jiang (Contributor) commented Apr 4, 2026

Summary

Fixes #1655: test_autoscales is flaky on Windows and macOS.

The root cause is that the test asserts autoscaling behavior after fixed asyncio.sleep() durations. Event loop scheduling jitter on Windows and macOS can keep the autoscaler from completing enough cycles within the expected time window (e.g., desired_concurrency reaches only 3 instead of 4).

Changes:

  • Add a _wait_for() helper that polls a condition until it becomes true (with configurable timeout)
  • Replace all asyncio.sleep() + assert patterns with _wait_for() calls
  • Use an explicit overload_active flag instead of wall-clock datetime comparison for the CPU overload simulation
  • Remove @pytest.mark.flaky decorator since the test is now deterministic
  • Remove unused datetime/timezone imports
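
The switch from a wall-clock comparison to an explicit flag, listed above, can be sketched roughly as follows. This is an illustration only, not the code from the PR: `overloaded_by_clock`, `OverloadSwitch`, and the 1 s to 2 s window are hypothetical names and values chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def overloaded_by_clock(started_at: datetime, now: datetime) -> bool:
    """Wall-clock style (hypothetical): overload depends on elapsed time.

    The outcome depends on when the caller happens to be scheduled, so a
    slow event loop can miss the window entirely.
    """
    elapsed = now - started_at
    return timedelta(seconds=1) <= elapsed < timedelta(seconds=2)


class OverloadSwitch:
    """Flag style (hypothetical): the test decides when overload starts and ends."""

    def __init__(self) -> None:
        self.overload_active = False

    def is_cpu_overloaded(self) -> bool:
        # No clock involved: the answer only changes when the test flips the flag.
        return self.overload_active


# Usage: the test flips the flag at the exact step where it wants overload.
switch = OverloadSwitch()
switch.overload_active = True   # start simulated CPU overload
switch.overload_active = False  # end it again
started = datetime.now(timezone.utc)  # only the clock-based variant needs this
```

With the flag, the test controls the order of events directly, so the simulated overload no longer depends on how quickly the event loop runs.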
Before (original test — flaky on Windows/macOS)
# On Linux the test usually passes, but on Windows/macOS it can fail:
FAILED tests/unit/_autoscaling/test_autoscaled_pool.py::test_autoscales - assert 3 == 4
 +  where 3 = <AutoscaledPool>.desired_concurrency

The original test uses fixed sleep durations which are too tight for slower event loop schedulers:

await asyncio.sleep(0.3)
assert pool.desired_concurrency == 4  # Fails when autoscaler hasn't completed enough cycles
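
A poll-based replacement for the pattern above might look like the sketch below. The PR's actual `_wait_for()` helper is not shown on this page, so `wait_for_condition`, its parameters, and the `FakePool` class are assumptions made purely for illustration.

```python
import asyncio
import time
from collections.abc import Callable


async def wait_for_condition(
    condition: Callable[[], bool],
    *,
    timeout: float = 5.0,
    poll_interval: float = 0.05,
) -> None:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise AssertionError(f'Condition not met within {timeout} s')
        # Yield to the event loop so the code under test can make progress.
        await asyncio.sleep(poll_interval)


class FakePool:
    """Hypothetical stand-in for a pool that scales up one step at a time."""

    def __init__(self) -> None:
        self.desired_concurrency = 1

    async def scale_up(self, target: int, step_delay: float) -> None:
        while self.desired_concurrency < target:
            await asyncio.sleep(step_delay)
            self.desired_concurrency += 1


async def main() -> int:
    pool = FakePool()
    scaler = asyncio.create_task(pool.scale_up(target=4, step_delay=0.02))
    # Instead of sleeping a fixed 0.3 s and asserting, wait until the state is reached.
    await wait_for_condition(lambda: pool.desired_concurrency == 4)
    await scaler
    return pool.desired_concurrency
```

The timeout only bounds the failure case. On a fast machine the wait ends as soon as the condition holds, and on a slow one it simply polls for longer.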
After (fixed test — 5/5 passes, all 8 tests pass)
$ python -m pytest tests/unit/_autoscaling/test_autoscaled_pool.py -v --timeout=60
============================= test session starts ==============================
platform linux -- Python 3.12.12, pytest-9.0.2, pluggy-1.6.0

tests/unit/_autoscaling/test_autoscaled_pool.py::test_runs_concurrently PASSED [ 12%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_abort_works PASSED [ 25%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_propagates_exceptions PASSED [ 37%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_propagates_exceptions_after_finished PASSED [ 50%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_autoscales PASSED  [ 62%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_autoscales_uses_desired_concurrency_ratio PASSED [ 75%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_max_tasks_per_minute_works PASSED [ 87%]
tests/unit/_autoscaling/test_autoscaled_pool.py::test_allows_multiple_run_calls PASSED [100%]

============================== 8 passed in 3.26s ===============================

Stability verification (5 consecutive runs):

=== Run 1 === PASSED
=== Run 2 === PASSED
=== Run 3 === PASSED
=== Run 4 === PASSED
=== Run 5 === PASSED
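
The PR does not say how the five runs were executed. One way to reproduce this kind of check is sketched below, where `run_repeatedly` is a hypothetical helper that runs a command several times and reports each exit status.

```python
import subprocess
import sys


def run_repeatedly(command: list[str], runs: int = 5) -> list[bool]:
    """Run `command` `runs` times and return whether each run exited with 0."""
    results = []
    for i in range(1, runs + 1):
        completed = subprocess.run(command, check=False)
        passed = completed.returncode == 0
        print(f'=== Run {i} === {"PASSED" if passed else "FAILED"}')
        results.append(passed)
    return results


# Example command for the test in question (assumes pytest is installed
# and the script is started from the repository root).
PYTEST_COMMAND = [
    sys.executable, '-m', 'pytest',
    'tests/unit/_autoscaling/test_autoscaled_pool.py::test_autoscales',
]
```

Calling `run_repeatedly(PYTEST_COMMAND)` prints one line per run in the format shown above.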

Test plan

  • test_autoscales passes consistently (5/5 runs)
  • All 8 tests in test_autoscaled_pool.py pass with no regressions
  • ruff check passes
  • CI passes on Windows/macOS runners

Lidang-Jiang and others added 4 commits April 4, 2026 11:49
…ased assertions

Replace time-dependent assertions with a polling helper that waits for
conditions to become true within a generous timeout. This eliminates
flakiness caused by event loop scheduling jitter on Windows and macOS,
where the autoscaler may not complete enough cycles within the original
hardcoded sleep intervals.

Changes:
- Add _wait_for() helper that polls a condition with configurable timeout
- Replace all asyncio.sleep() + assert patterns with _wait_for() calls
- Use an explicit flag for CPU overload instead of wall-clock comparison
- Remove @pytest.mark.flaky decorator since the test is now deterministic
- Remove unused datetime/timezone imports

Fixes apify#1655
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oop()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vdusek vdusek self-assigned this Apr 13, 2026
@vdusek vdusek added the t-tooling Issues with this label are in the ownership of the tooling team. label Apr 13, 2026
@vdusek vdusek requested review from Pijukatel and vdusek April 13, 2026 11:23

@vdusek (Collaborator) left a comment

Hi @Lidang-Jiang, thanks for your contribution. I just moved the wait helper to the test utils module. Otherwise LGTM.

@vdusek vdusek merged commit 58156ca into apify:master Apr 14, 2026
58 of 59 checks passed
Labels

t-tooling Issues with this label are in the ownership of the tooling team.

Development

Successfully merging this pull request may close these issues.

Test test_autoscales is flaky

4 participants