feat: lazy + configurable LLM fallback chain #1762
Merged
Conversation
Let deployments boot without any chain-provider API key: when the chain resolves to zero providers, inject an UnavailableLLM null-object instead of raising. Instant search keeps working; classic and agentic search surface LLMUnavailableError on first use, mapped to HTTP 503 via the existing handler. Also expose LLM_FALLBACK_CHAIN as an env var (format: "provider:model,provider:model") so private-cloud deployers can override the chain without forking. Unset falls back to the current hardcoded [together:zai-glm-5, anthropic:claude-sonnet-4.6] default.
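For illustration, a minimal sketch of the null-object shape this describes. The class and exception names come from the PR; the protocol methods (`complete`, `close`) and the message text are assumptions, not the actual implementation:

```python
class LLMUnavailableError(Exception):
    """Raised on first LLM use when the chain resolved to zero providers."""


class UnavailableLLM:
    """Null-object stand-in injected when no chain provider has an API key.

    It satisfies the LLM protocol, so search services keep their signatures
    and need no Optional/None checks; any real completion call raises
    LLMUnavailableError, which the existing handler maps to HTTP 503.
    """

    async def complete(self, prompt: str) -> str:
        # complete() is an assumed protocol method for illustration.
        raise LLMUnavailableError(
            "No LLM provider is configured. Set a provider API key or "
            "LLM_FALLBACK_CHAIN; see the Search docs."
        )

    async def close(self) -> None:
        # Safe no-op: there is no underlying client to shut down.
        return None
```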
- Derive the env-var list in `LLMUnavailableError` from `PROVIDER_API_KEY_SETTINGS` so the default message stays in sync when providers are added.
- Extract an `_unavailable()` helper in `_build_llm_chain` to dedupe the two `UnavailableLLM` return paths and their log lines (sketched after this list).
- Narrow `UnavailableLLM.model_spec`'s return type from `Any` to `LLMModelSpec`.
- Hoist provider/model lookup dicts to module scope in the search config, surface accepted values in enum-declaration order in error messages, and note that `SearchConfig.LLM_FALLBACK_CHAIN` is evaluated once at class-definition time.
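A rough sketch of the `_unavailable()` dedupe, reusing the `UnavailableLLM` class from the earlier sketch. The signature, the logger, and the two guard conditions are hypothetical; only the helper-extraction pattern is from the PR:

```python
import logging

logger = logging.getLogger(__name__)


def _build_llm_chain(chain: list[tuple[str, str]], api_keys: dict[str, str]):
    """Hypothetical signature; the real function reads the app settings."""

    def _unavailable(reason: str) -> "UnavailableLLM":
        # Single home for both null-object return paths and their log lines.
        logger.info("LLM chain unavailable (%s); injecting UnavailableLLM", reason)
        return UnavailableLLM()

    if not chain:
        return _unavailable("chain resolved to zero providers")
    with_keys = [(p, m) for p, m in chain if api_keys.get(p)]
    if not with_keys:
        return _unavailable("no provider in the chain has an API key set")
    ...  # construct and return the real fallback chain from with_keys
```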
`close()` is typed `-> None`, so `assert await llm.close() is None` trips mypy's `func-returns-value` check. The test's intent is that `close()` is a safe no-op (it does not raise), which a bare `await` already covers.
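For reference, the flagged pattern versus the fix (illustrative test code, not the exact file):

```python
async def test_close_is_safe_noop(llm) -> None:
    # Trips mypy's func-returns-value check: close() is annotated -> None,
    # so using its result ("does not return a value") is an error.
    # assert await llm.close() is None

    # Equivalent intent, mypy-clean: the test still fails if close() raises.
    await llm.close()
```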
Force-pushed from d754cf9 to 8c4bdd9
pyproject.toml declared both weaviate (^0.1.2) and weaviate-client (^4.10.2). The former is a PyPI placeholder ("A placeholder package for the Weaviate name") that ships its own weaviate/__init__.py, colliding with the real client. Poetry 2.x now fails the Docker image build with 'Installing weaviate/__init__.py over existing file'; 1.x silently tolerated the overwrite.

No code imports weaviate; the only repo reference is a URL string in a test. Removing the placeholder unbreaks the container build.
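For context, the colliding declarations in pyproject.toml looked roughly like this (version constraints from the message above; the table layout is assumed):

```toml
[tool.poetry.dependencies]
# The placeholder ships its own weaviate/__init__.py, which Poetry 2.x
# refuses to overwrite with the real client's module of the same name.
weaviate = "^0.1.2"          # PyPI placeholder; removed by this commit
weaviate-client = "^4.10.2"  # the real client
```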
- `parse_llm_fallback_chain` now validates (provider, model) pairs against `MODEL_REGISTRY`, so combos like `together:mistral-large` fail at startup with an actionable error instead of crashing later in the factory (see the sketch after this list).
- Move the detailed env-var message out of `core.exceptions` (which had to import from `adapters.llm.registry`) into `UnavailableLLM` itself. The exception keeps a generic default pointing at the docs.
- Drop the unreachable empty-chain guard in `_create_search_services`; the parser always returns a non-empty list.
- Test the env-var-name assertion dynamically against `PROVIDER_API_KEY_SETTINGS` so it stays in sync with the registry.
- Add a "Configuring the LLM provider chain" section to the Search docs covering API keys, `LLM_FALLBACK_CHAIN` format, and fallback semantics, so the exception's doc reference has a real destination.
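A minimal sketch of the parse-plus-validation behavior described in the first bullet, assuming a flat `MODEL_REGISTRY` mapping; the real registry shape and error wording will differ:

```python
MODEL_REGISTRY: dict[str, tuple[str, ...]] = {
    "together": ("zai-glm-5",),
    "anthropic": ("claude-sonnet-4.6",),
    "mistral": ("mistral-large",),
}

DEFAULT_CHAIN = [("together", "zai-glm-5"), ("anthropic", "claude-sonnet-4.6")]


def parse_llm_fallback_chain(raw: str | None) -> list[tuple[str, str]]:
    """Parse "provider:model,provider:model"; unset keeps the default chain."""
    if not raw or not raw.strip():
        return DEFAULT_CHAIN  # hence callers never see an empty chain
    pairs: list[tuple[str, str]] = []
    for entry in raw.split(","):
        provider, sep, model = entry.strip().partition(":")
        if not sep:
            raise ValueError(f"expected provider:model, got {entry.strip()!r}")
        if provider not in MODEL_REGISTRY:
            raise ValueError(
                f"unknown provider {provider!r}; accepted: {', '.join(MODEL_REGISTRY)}"
            )
        if model not in MODEL_REGISTRY[provider]:
            raise ValueError(
                f"unknown model {model!r} for provider {provider!r}; "
                f"accepted: {', '.join(MODEL_REGISTRY[provider])}"
            )
        pairs.append((provider, model))
    return pairs
```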
Force-pushed from 0803793 to f6511ad
hiddeco reviewed Apr 23, 2026
Apply @hiddeco's suggestions:
- Lift the "self-hosted only" caveat into a Fern Callout.
- Tighten the "Out of the box, Airweave..." phrasing.
- Rework "first provider with an API key set that responds..." for clarity.
- Reword "Misconfiguration is caught at startup...".
- Capitalize "Classic/Agentic" in the fallback-semantics bullet to match the rest of the doc.
hiddeco approved these changes Apr 23, 2026
Summary
- `_build_llm_chain()` returns an `UnavailableLLM` null-object instead of raising. Instant search works immediately; classic/agentic surface `LLMUnavailableError` on first use and return HTTP 503 via the existing exception handler.
- Expose `LLM_FALLBACK_CHAIN` as an env var (format: `provider:model,provider:model`) so private-cloud deployers can override the chain without forking. Unset keeps the current `[together:zai-glm-5, anthropic:claude-sonnet-4.6]` default; no change for Airweave self-serve.
- No changes to the `llm: LLMProtocol` signatures. The null-object carries the "unavailable" state, so no `Optional` threading or None-checks inside search services.
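The 503 mapping the first bullet refers to would look something like this in a FastAPI app. The framework and handler shape are assumptions, and the PR reuses an existing handler rather than adding one:

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse


class LLMUnavailableError(Exception):
    """Stand-in for the exception from adapters.llm (see sketch above)."""


app = FastAPI()


@app.exception_handler(LLMUnavailableError)
async def llm_unavailable(request: Request, exc: LLMUnavailableError) -> JSONResponse:
    # Classic/Agentic let the error propagate on first LLM use; this turns it
    # into a 503 so an instant-search-only deployment still boots and serves.
    return JSONResponse(status_code=503, content={"detail": str(exc)})
```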
Context

A deployer was blocked on v0.9.65 because only `MISTRAL_API_KEY` was set, and the hardcoded chain required a Together or Anthropic key. Their actual plan is to use Instant search (no LLM needed); the primary ask was "don't break deployment." A follow-up ask from another deployer: private-cloud customers don't always have the same providers as self-serve, so the chain should be configurable per deployment rather than via a fork.

Test plan
- New tests: `adapters/llm/tests/test_unavailable.py`, `domains/search/tests/test_config.py`, `core/container/tests/test_llm_chain_wiring.py` (condensed sketch after this list)
- `ruff check` clean on changed files; `ruff format --check` clean; `lint-imports` green; no new mypy errors
- Chain wiring resolves to `UnavailableLLM` with no keys set and logs the expected INFO message; parser accepts `mistral:mistral-large` and rejects unknown providers/models with the accepted-value list
- `docker compose up` with only `MISTRAL_API_KEY`: confirm `/search/instant` works, `/search/classic` and `/search/agentic` return 503
- `LLM_FALLBACK_CHAIN=mistral:mistral-large` with `MISTRAL_API_KEY`: confirm all three search tiers work
- `LLM_FALLBACK_CHAIN=bogus:foo`: confirm startup fails with an actionable parser error
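A condensed sketch of the first unit checks, with hypothetical import paths; the real tests live in the files listed above:

```python
import asyncio

import pytest

# Hypothetical import paths for illustration only.
from adapters.llm.unavailable import LLMUnavailableError, UnavailableLLM
from domains.search.config import parse_llm_fallback_chain


def test_parser_accepts_known_pair():
    assert parse_llm_fallback_chain("mistral:mistral-large") == [
        ("mistral", "mistral-large")
    ]


def test_parser_rejects_unknown_provider():
    # The error should list accepted values, per the PR description.
    with pytest.raises(ValueError, match="accepted"):
        parse_llm_fallback_chain("bogus:foo")


def test_unavailable_llm_raises_on_use_and_closes_quietly():
    llm = UnavailableLLM()
    with pytest.raises(LLMUnavailableError):
        asyncio.run(llm.complete("hello"))
    asyncio.run(llm.close())  # bare call: the no-op contract is "does not raise"
```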
Summary by cubic

Make the LLM fallback chain lazy and configurable so the app boots without LLM keys; instant search works, and classic/agentic return HTTP 503 until a key is set or a chain is configured. Also removed the unused `weaviate` and `weaviate-client` packages to unblock Poetry 2.x Docker builds.

New Features
- Inject `UnavailableLLM` when no providers resolve, so boot doesn't fail; classic/agentic raise `LLMUnavailableError` mapped to 503.
- Add an `LLM_FALLBACK_CHAIN` env var (`provider:model,provider:model`); the parser validates provider names, model names, and valid provider–model pairs at startup.

Migration
- Set `LLM_FALLBACK_CHAIN` (e.g., `mistral:mistral-large`) to enable classic/agentic; invalid values fail fast with accepted providers/models and valid combinations listed.

Written for commit ada1033. Summary will update on new commits.