fix(llm): fall back when an attempt opens but sends no useful chunks #5793
Open
ATOM00blue wants to merge 1 commit into
The LLM FallbackAdapter forwarded attempt_timeout to the underlying LLM as the connection timeout only. A provider could open the connection and then stay silent (no useful chunk and no HTTP error), leaving the adapter blocked forever instead of failing over.

Apply attempt_timeout to the time-to-first-useful-chunk as well, so a connected-but-silent attempt times out and falls back to the next LLM. Chunks carrying only metadata (e.g. usage) don't count as progress.

Fixes livekit#5660
Summary
The LLM `FallbackAdapter` accepts an `attempt_timeout`, but it was only forwarded to the underlying LLM as the connection timeout. If a provider opens the connection and then stays silent (no useful chunk and no HTTP error), the adapter blocks indefinitely on the first attempt instead of failing over. This has shown up in practice as provider brown-outs that leave the user hearing silence.

This change also applies `attempt_timeout` to the time-to-first-useful-chunk. If no chunk carrying spoken content or tool calls arrives within the timeout, the attempt times out and the adapter falls back to the next LLM (reusing the existing `asyncio.TimeoutError` handling). Chunks that only carry metadata (e.g. usage) don't count as progress. Once a useful chunk has been received the timeout is dropped, matching the existing `retry_on_chunk_sent=False` behavior.

Test plan
- `pytest tests/test_llm_fallback.py`: new tests fail before the change (adapter hangs on a connected-but-silent LLM) and pass after
- `APIConnectionError` when every LLM stays silent
- `ruff format` / `ruff check` clean; existing STT/TTS fallback tests still pass

Fixes #5660
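
For reviewers who want the idea without reading the diff, the time-to-first-useful-chunk behavior described in the summary can be sketched roughly as below. This is a standalone illustration, not the code in this PR: `Chunk`, `is_useful`, `stream_with_first_chunk_timeout`, and `run_with_fallback` are hypothetical names, the built-in `ConnectionError` stands in for `APIConnectionError`, and chunks are buffered rather than forwarded as they arrive to keep the example short.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Callable, List, Optional, Sequence, Tuple


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed LLM chunk (not the real type)."""
    content: Optional[str] = None
    tool_calls: Sequence[str] = ()
    usage: Optional[dict] = None


def is_useful(chunk: Chunk) -> bool:
    # Only spoken content or tool calls count as progress; usage-only chunks do not.
    return bool(chunk.content) or bool(chunk.tool_calls)


async def stream_with_first_chunk_timeout(
    stream: AsyncIterator[Chunk], attempt_timeout: float
) -> AsyncIterator[Chunk]:
    """Yield chunks, raising asyncio.TimeoutError if no useful chunk
    arrives within attempt_timeout seconds of starting the attempt."""
    it = stream.__aiter__()
    loop = asyncio.get_running_loop()
    deadline = loop.time() + attempt_timeout
    seen_useful = False
    while True:
        try:
            if seen_useful:
                # The timeout is dropped once a useful chunk has been received.
                chunk = await it.__anext__()
            else:
                remaining = deadline - loop.time()
                chunk = await asyncio.wait_for(it.__anext__(), timeout=remaining)
        except StopAsyncIteration:
            return
        seen_useful = seen_useful or is_useful(chunk)
        yield chunk


async def run_with_fallback(
    llms: Sequence[Tuple[str, Callable[[], AsyncIterator[Chunk]]]],
    attempt_timeout: float,
) -> Tuple[str, List[Chunk]]:
    """Try each LLM in order; a connected-but-silent attempt falls through."""
    for name, make_stream in llms:
        chunks: List[Chunk] = []
        try:
            async for chunk in stream_with_first_chunk_timeout(
                make_stream(), attempt_timeout
            ):
                chunks.append(chunk)
            return name, chunks
        except asyncio.TimeoutError:
            continue  # fall back to the next LLM
    # Stand-in for APIConnectionError when every LLM stays silent.
    raise ConnectionError("all LLMs failed")


async def silent_llm() -> AsyncIterator[Chunk]:
    yield Chunk(usage={"prompt_tokens": 3})  # metadata only, not progress
    await asyncio.sleep(3600)  # connected but silent


async def healthy_llm() -> AsyncIterator[Chunk]:
    yield Chunk(content="hello")
    await asyncio.sleep(0.2)  # longer than the timeout, allowed after first useful chunk
    yield Chunk(content=" world")


if __name__ == "__main__":
    name, chunks = asyncio.run(
        run_with_fallback([("silent", silent_llm), ("healthy", healthy_llm)], 0.1)
    )
    print(name, "".join(c.content or "" for c in chunks))
```

In this sketch the deadline is fixed when the attempt starts, so a stream of metadata-only chunks cannot keep extending it. Whether the adapter measures the deadline the same way is an assumption here; the summary only states that metadata-only chunks don't count as progress.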