In lazy-load mode, workers on a retry run can end up with resolve_entry returning an UnresolvedEntry instead of a live test object, crashing the test runner:
NoMethodError: undefined method `run' for an instance of CI::Queue::Redis::UnresolvedEntry
minitest/queue.rb:~167
Root cause
PR #380 introduced UnresolvedEntry as a defense-in-depth fallback for when resolve_entry finds neither @index nor entry_resolver set. The same PR added configure_lazy_queue to the eager-mode branch of populate_queue so non-leader workers get an entry_resolver.
The gap is in the retry join path. When a worker enters a retry run, it can skip populate_queue entirely -- going straight to the retry queue pop loop. configure_lazy_queue is never called in that path, so entry_resolver stays nil. The next test popped off the queue falls through to the UnresolvedEntry fallback. The runner then calls .run on it and crashes.
With many parallel workers this cascades: the unresolved entry is never acknowledged, gets requeued, and each worker that picks it up crashes the same way.
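The failure mode can be reproduced in a toy model. The sketch below is not the ci-queue code: `ToyWorker`, `LiveTest`, and the stand-in `UnresolvedEntry` are hypothetical. The order of the two checks in `resolve_entry` is an assumption; only the fallback behaviour is taken from the description above.

```ruby
# Toy model only; every class here is a hypothetical stand-in.
UnresolvedEntry = Struct.new(:id) # fallback object, deliberately has no #run

LiveTest = Struct.new(:id) do
  def run
    "ran #{id}"
  end
end

class ToyWorker
  attr_accessor :index, :entry_resolver

  # Use the index if present, then the resolver, else fall back.
  # (The order of the first two checks is assumed.)
  def resolve_entry(id)
    return index.fetch(id) if index
    return entry_resolver.call(id) if entry_resolver

    UnresolvedEntry.new(id)
  end
end

# A worker that joined on the retry path: nothing configured it.
worker = ToyWorker.new
entry = worker.resolve_entry("FooTest#test_bar")

begin
  entry.run
rescue NoMethodError => e
  puts "crash: #{e.message}"
end
```

With neither `index` nor `entry_resolver` set, the only thing `resolve_entry` can return is the fallback, and the caller's `.run` raises.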
Conditions
- Lazy-load mode enabled
- Retry run (automatic retry is the most common trigger in CI environments)
- Non-leader worker (the leader goes through populate_queue and gets entry_resolver set)
Fix direction
Call configure_lazy_queue in the retry join path, not just inside populate_queue. Mirrors the eager-mode fix from #380 for the initial run.
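A minimal sketch of the shape of that fix, again as a toy model and not the actual patch. `ToyLazyWorker`, `join_retry`, and the `loader` callable are hypothetical names. The sketch also assumes `configure_lazy_queue` is safe to call more than once, which the real method may or may not be.

```ruby
# Toy model under stated assumptions; not the real ci-queue implementation.
class ToyLazyWorker
  attr_reader :entry_resolver

  def initialize(loader)
    @loader = loader # hypothetical: maps a test id to a callable test
    @entry_resolver = nil
  end

  # Assumed idempotent: a second call keeps the existing resolver.
  def configure_lazy_queue
    @entry_resolver ||= ->(id) { @loader.call(id) }
  end

  # Initial-run path: already configures the resolver.
  def populate_queue(ids)
    configure_lazy_queue
    @pending = ids.dup
  end

  # Retry join path: populate_queue may never have run on this worker,
  # so the resolver is configured here as well.
  def join_retry(ids)
    configure_lazy_queue
    ids.map { |id| resolve_entry(id).call }
  end

  def resolve_entry(id)
    return entry_resolver.call(id) if entry_resolver

    raise "unresolved entry: #{id}"
  end
end

loader = ->(id) { -> { "ran #{id}" } }
worker = ToyLazyWorker.new(loader)
p worker.join_retry(["FooTest#test_bar"])
```

Because the resolver is set up at the top of the retry path, a worker that never called `populate_queue` still resolves entries to runnable objects.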