
feat: lazy + configurable LLM fallback chain #1762

Merged
felixschmetz merged 6 commits into main from feat/lazy-configurable-llm-chain on Apr 23, 2026
Conversation


@felixschmetz felixschmetz commented Apr 23, 2026

Summary

  • Let the backend boot when no chain-provider API key is configured: _build_llm_chain() returns an UnavailableLLM null-object instead of raising. Instant search works immediately; classic/agentic surface LLMUnavailableError on first use and return HTTP 503 via the existing exception handler.
  • Expose LLM_FALLBACK_CHAIN as an env var (format: provider:model,provider:model) so private-cloud deployers can override the chain without forking. Unset keeps the current [together:zai-glm-5, anthropic:claude-sonnet-4.6] default — no change for Airweave self-serve.
  • Services keep required, non-null llm: LLMProtocol signatures. Null-object carries the "unavailable" state so no Optional threading or None-checks inside search services.

Context

A deployer was blocked on v0.9.65 because only MISTRAL_API_KEY was set, and the hardcoded chain required a Together or Anthropic key. Their actual plan is to use Instant search (no LLM needed); the primary ask was "don't break deployment." A follow-up ask from another deployer: private-cloud customers don't always have the same providers as self-serve, so the chain should be configurable per deployment rather than via a fork.

Test plan

  • New unit tests (17 total): adapters/llm/tests/test_unavailable.py, domains/search/tests/test_config.py, core/container/tests/test_llm_chain_wiring.py
  • Existing unit suite still green (1111 tests), LLM adapter tests green (65), billing listener tests green (23)
  • ruff check clean on changed files; ruff format --check clean; lint-imports green; no new mypy errors
  • Smoke script: factory returns UnavailableLLM with no keys set and logs the expected INFO message; parser accepts mistral:mistral-large and rejects unknown providers/models with the accepted-value list
  • Manual docker compose up with only MISTRAL_API_KEY — confirm /search/instant works, /search/classic and /search/agentic return 503
  • Manual override: LLM_FALLBACK_CHAIN=mistral:mistral-large with MISTRAL_API_KEY — confirm all three search tiers work
  • Manual invalid chain: LLM_FALLBACK_CHAIN=bogus:foo — confirm startup fails with an actionable parser error

Summary by cubic

Make the LLM fallback chain lazy and configurable so the app boots without LLM keys; instant search works, and classic/agentic return HTTP 503 until a key is set or a chain is configured. Also removed the unused weaviate placeholder package (which collided with weaviate-client) to unblock Poetry 2.x Docker builds.

  • New Features

    • Injects a null-object UnavailableLLM when no providers resolve, so boot doesn’t fail; classic/agentic raise LLMUnavailableError mapped to 503.
    • Adds LLM_FALLBACK_CHAIN env var (provider:model,provider:model); parser validates provider names, model names, and valid provider–model pairs at startup.
    • Docs: added and refined “Configuring the LLM provider chain” with API key list, override format, and startup validation notes.
  • Migration

    • Self-serve: no change.
    • Private-cloud: set a provider API key or set LLM_FALLBACK_CHAIN (e.g., mistral:mistral-large) to enable classic/agentic; invalid values fail fast with accepted providers/models and valid combinations listed.

Written for commit ada1033. Summary will update on new commits.

Let deployments boot without any chain-provider API key: when the chain resolves
to zero providers, inject an UnavailableLLM null-object instead of raising.
Instant search keeps working; classic and agentic search surface
LLMUnavailableError on first use, mapped to HTTP 503 via the existing handler.

Also expose LLM_FALLBACK_CHAIN as an env var (format:
"provider:model,provider:model") so private-cloud deployers can override the
chain without forking. Unset falls back to the current hardcoded
[together:zai-glm-5, anthropic:claude-sonnet-4.6] default.
Comment thread backend/airweave/adapters/llm/tests/test_unavailable.py Fixed

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 9 files

- Derive env-var list in LLMUnavailableError from PROVIDER_API_KEY_SETTINGS
  so the default message stays in sync when providers are added.
- Extract _unavailable() helper in _build_llm_chain to dedupe the two
  UnavailableLLM return paths and their log lines.
- Narrow UnavailableLLM.model_spec return type from Any to LLMModelSpec.
- Hoist provider/model lookup dicts to module scope in search config,
  surface accepted values in enum-declaration order in error messages,
  and note that SearchConfig.LLM_FALLBACK_CHAIN is evaluated once at
  class-definition time.
close() is typed -> None, so 'assert await llm.close() is None' trips
mypy's func-returns-value check. The test's intent is that close() is a
safe no-op (does not raise), which a bare await already covers.
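The fix can be sketched as a bare await. Only the fact that `close()` is typed `-> None` on `UnavailableLLM` comes from the PR; the test shape here is an assumption.

```python
import asyncio


class UnavailableLLM:
    async def close(self) -> None:
        """Safe no-op: there are no resources to release."""


async def check_close_is_safe() -> None:
    llm = UnavailableLLM()
    # A bare await is sufficient: `assert await llm.close() is None`
    # would trip mypy's func-returns-value check, and the intent is
    # only that close() does not raise.
    await llm.close()


asyncio.run(check_close_is_safe())
```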
pyproject.toml declared both weaviate (^0.1.2) and weaviate-client
(^4.10.2). The former is a PyPI placeholder ("A placeholder package
for the Weaviate name") that ships its own weaviate/__init__.py,
colliding with the real client. Poetry 2.x now fails the Docker image
build with 'Installing weaviate/__init__.py over existing file'; 1.x
silently tolerated the overwrite.

No code imports weaviate — the only repo reference is a URL string in
a test. Removing the placeholder unbreaks the container build.
- parse_llm_fallback_chain now validates (provider, model) pairs against
  MODEL_REGISTRY, so combos like together:mistral-large fail at startup
  with an actionable error instead of crashing later in the factory.
- Move the detailed env-var message out of core.exceptions (which had
  to import from adapters.llm.registry) into UnavailableLLM itself. The
  exception keeps a generic default pointing at the docs.
- Drop the unreachable empty-chain guard in _create_search_services —
  the parser always returns a non-empty list.
- Test the env-var-name assertion dynamically against
  PROVIDER_API_KEY_SETTINGS so it stays in sync with the registry.
- Add a 'Configuring the LLM provider chain' section to the Search docs
  covering API keys, LLM_FALLBACK_CHAIN format, and fallback semantics,
  so the exception's doc reference has a real destination.
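The pair validation described in these commits can be sketched as follows. `parse_llm_fallback_chain` and `MODEL_REGISTRY` are names from this PR; the registry contents and error wording here are assumptions for the sketch.

```python
# Hypothetical registry of valid (provider, model) pairs; the real
# MODEL_REGISTRY lives in the adapters.llm package.
MODEL_REGISTRY = {
    "together": {"zai-glm-5"},
    "anthropic": {"claude-sonnet-4.6"},
    "mistral": {"mistral-large"},
}


def parse_llm_fallback_chain(raw: str) -> list[tuple[str, str]]:
    """Parse 'provider:model,provider:model', failing fast on bad pairs."""
    chain = []
    for entry in raw.split(","):
        provider, _, model = entry.strip().partition(":")
        if provider not in MODEL_REGISTRY:
            raise ValueError(
                f"Unknown provider {provider!r}; accepted: "
                f"{sorted(MODEL_REGISTRY)}"
            )
        if model not in MODEL_REGISTRY[provider]:
            raise ValueError(
                f"Unknown model {model!r} for provider {provider!r}; "
                f"accepted: {sorted(MODEL_REGISTRY[provider])}"
            )
        chain.append((provider, model))
    # str.split(",") always yields at least one entry, so a non-raising
    # parse returns a non-empty list; no empty-chain guard is needed.
    return chain
```

This is why combos like `together:mistral-large` fail at startup with the accepted-value list instead of crashing later in the factory.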
Comment thread fern/docs/pages/search.mdx Outdated
Apply @hiddeco's suggestions:
- Lift 'self-hosted only' caveat into a Fern Callout.
- Tighten 'Out of the box, Airweave...' phrasing.
- Rework 'first provider with an API key set that responds...' for
  clarity.
- Reword 'Misconfiguration is caught at startup...'.
- Capitalize 'Classic/Agentic' in the fallback-semantics bullet to
  match the rest of the doc.
@felixschmetz felixschmetz merged commit 32cfea0 into main Apr 23, 2026
15 of 16 checks passed