feat: lazy + configurable LLM fallback chain #1762
Merged
Conversation
Let deployments boot without any chain-provider API key: when the chain resolves to zero providers, inject an UnavailableLLM null-object instead of raising. Instant search keeps working; classic and agentic search surface LLMUnavailableError on first use, mapped to HTTP 503 via the existing handler. Also expose LLM_FALLBACK_CHAIN as an env var (format: "provider:model,provider:model") so private-cloud deployers can override the chain without forking. Unset falls back to the current hardcoded [together:zai-glm-5, anthropic:claude-sonnet-4.6] default.
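For illustration, a minimal sketch of the null-object shape this describes. The class and exception names come from the PR; the protocol methods (`complete`, `close`) and the message text are assumptions, not the actual implementation:

```python
class LLMUnavailableError(Exception):
    """Raised on first LLM use when the chain resolved to zero providers."""


class UnavailableLLM:
    """Null-object stand-in injected when no chain provider has an API key.

    It satisfies the LLM protocol, so search services keep their signatures
    and need no Optional/None checks; any real completion call raises
    LLMUnavailableError, which the existing handler maps to HTTP 503.
    """

    async def complete(self, prompt: str) -> str:
        # complete() is an assumed protocol method for illustration.
        raise LLMUnavailableError(
            "No LLM provider is configured. Set a provider API key or "
            "LLM_FALLBACK_CHAIN; see the Search docs."
        )

    async def close(self) -> None:
        # Safe no-op: there is no underlying client to shut down.
        return None
```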
- Derive the env-var list in `LLMUnavailableError` from `PROVIDER_API_KEY_SETTINGS` so the default message stays in sync when providers are added.
- Extract an `_unavailable()` helper in `_build_llm_chain` to dedupe the two `UnavailableLLM` return paths and their log lines (sketched after this list).
- Narrow `UnavailableLLM.model_spec`'s return type from `Any` to `LLMModelSpec`.
- Hoist provider/model lookup dicts to module scope in the search config, surface accepted values in enum-declaration order in error messages, and note that `SearchConfig.LLM_FALLBACK_CHAIN` is evaluated once at class-definition time.
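A rough sketch of the `_unavailable()` dedupe, reusing the `UnavailableLLM` class from the earlier sketch. The signature, the logger, and the two guard conditions are hypothetical; only the helper-extraction pattern is from the PR:

```python
import logging

logger = logging.getLogger(__name__)


def _build_llm_chain(chain: list[tuple[str, str]], api_keys: dict[str, str]):
    """Hypothetical signature; the real function reads the app settings."""

    def _unavailable(reason: str) -> "UnavailableLLM":
        # Single home for both null-object return paths and their log lines.
        logger.info("LLM chain unavailable (%s); injecting UnavailableLLM", reason)
        return UnavailableLLM()

    if not chain:
        return _unavailable("chain resolved to zero providers")
    with_keys = [(p, m) for p, m in chain if api_keys.get(p)]
    if not with_keys:
        return _unavailable("no provider in the chain has an API key set")
    ...  # construct and return the real fallback chain from with_keys
```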
`close()` is typed `-> None`, so `assert await llm.close() is None` trips mypy's `func-returns-value` check. The test's intent is that `close()` is a safe no-op (it does not raise), which a bare `await` already covers.
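For reference, the flagged pattern versus the fix (illustrative test code, not the exact file):

```python
async def test_close_is_safe_noop(llm) -> None:
    # Trips mypy's func-returns-value check: close() is annotated -> None,
    # so using its result ("does not return a value") is an error.
    # assert await llm.close() is None

    # Equivalent intent, mypy-clean: the test still fails if close() raises.
    await llm.close()
```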
Force-pushed from d754cf9 to 8c4bdd9
pyproject.toml declared both weaviate (^0.1.2) and weaviate-client (^4.10.2). The former is a PyPI placeholder ("A placeholder package for the Weaviate name") that ships its own weaviate/__init__.py, colliding with the real client. Poetry 2.x now fails the Docker image build with 'Installing weaviate/__init__.py over existing file'; 1.x silently tolerated the overwrite.

No code imports weaviate; the only repo reference is a URL string in a test. Removing the placeholder unbreaks the container build.
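For context, the colliding declarations in pyproject.toml looked roughly like this (version constraints from the message above; the table layout is assumed):

```toml
[tool.poetry.dependencies]
# The placeholder ships its own weaviate/__init__.py, which Poetry 2.x
# refuses to overwrite with the real client's module of the same name.
weaviate = "^0.1.2"          # PyPI placeholder; removed by this commit
weaviate-client = "^4.10.2"  # the real client
```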
- `parse_llm_fallback_chain` now validates (provider, model) pairs against `MODEL_REGISTRY`, so combos like `together:mistral-large` fail at startup with an actionable error instead of crashing later in the factory (see the sketch after this list).
- Move the detailed env-var message out of `core.exceptions` (which had to import from `adapters.llm.registry`) into `UnavailableLLM` itself. The exception keeps a generic default pointing at the docs.
- Drop the unreachable empty-chain guard in `_create_search_services`; the parser always returns a non-empty list.
- Test the env-var-name assertion dynamically against `PROVIDER_API_KEY_SETTINGS` so it stays in sync with the registry.
- Add a "Configuring the LLM provider chain" section to the Search docs covering API keys, `LLM_FALLBACK_CHAIN` format, and fallback semantics, so the exception's doc reference has a real destination.
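A minimal sketch of the parse-plus-validation behavior described in the first bullet, assuming a flat `MODEL_REGISTRY` mapping; the real registry shape and error wording will differ:

```python
MODEL_REGISTRY: dict[str, tuple[str, ...]] = {
    "together": ("zai-glm-5",),
    "anthropic": ("claude-sonnet-4.6",),
    "mistral": ("mistral-large",),
}

DEFAULT_CHAIN = [("together", "zai-glm-5"), ("anthropic", "claude-sonnet-4.6")]


def parse_llm_fallback_chain(raw: str | None) -> list[tuple[str, str]]:
    """Parse "provider:model,provider:model"; unset keeps the default chain."""
    if not raw or not raw.strip():
        return DEFAULT_CHAIN  # hence callers never see an empty chain
    pairs: list[tuple[str, str]] = []
    for entry in raw.split(","):
        provider, sep, model = entry.strip().partition(":")
        if not sep:
            raise ValueError(f"expected provider:model, got {entry.strip()!r}")
        if provider not in MODEL_REGISTRY:
            raise ValueError(
                f"unknown provider {provider!r}; accepted: {', '.join(MODEL_REGISTRY)}"
            )
        if model not in MODEL_REGISTRY[provider]:
            raise ValueError(
                f"unknown model {model!r} for provider {provider!r}; "
                f"accepted: {', '.join(MODEL_REGISTRY[provider])}"
            )
        pairs.append((provider, model))
    return pairs
```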
Force-pushed from 0803793 to f6511ad
hiddeco reviewed Apr 23, 2026
Apply @hiddeco's suggestions:
- Lift the "self-hosted only" caveat into a Fern Callout.
- Tighten the "Out of the box, Airweave..." phrasing.
- Rework "first provider with an API key set that responds..." for clarity.
- Reword "Misconfiguration is caught at startup...".
- Capitalize "Classic/Agentic" in the fallback-semantics bullet to match the rest of the doc.
hiddeco approved these changes Apr 23, 2026
Summary
- `_build_llm_chain()` returns an `UnavailableLLM` null-object instead of raising. Instant search works immediately; classic/agentic surface `LLMUnavailableError` on first use and return HTTP 503 via the existing exception handler.
- Expose `LLM_FALLBACK_CHAIN` as an env var (format: `provider:model,provider:model`) so private-cloud deployers can override the chain without forking. Unset keeps the current `[together:zai-glm-5, anthropic:claude-sonnet-4.6]` default; no change for Airweave self-serve.
- No changes to the `llm: LLMProtocol` signatures. The null-object carries the "unavailable" state, so no `Optional` threading or None-checks inside search services.
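The 503 mapping the first bullet refers to would look something like this in a FastAPI app. The framework and handler shape are assumptions, and the PR reuses an existing handler rather than adding one:

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse


class LLMUnavailableError(Exception):
    """Stand-in for the exception from adapters.llm (see sketch above)."""


app = FastAPI()


@app.exception_handler(LLMUnavailableError)
async def llm_unavailable(request: Request, exc: LLMUnavailableError) -> JSONResponse:
    # Classic/Agentic let the error propagate on first LLM use; this turns it
    # into a 503 so an instant-search-only deployment still boots and serves.
    return JSONResponse(status_code=503, content={"detail": str(exc)})
```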
Context

A deployer was blocked on v0.9.65 because only `MISTRAL_API_KEY` was set, and the hardcoded chain required a Together or Anthropic key. Their actual plan is to use Instant search (no LLM needed); the primary ask was "don't break deployment." A follow-up ask from another deployer: private-cloud customers don't always have the same providers as self-serve, so the chain should be configurable per deployment rather than via a fork.

Test plan
- New tests: `adapters/llm/tests/test_unavailable.py`, `domains/search/tests/test_config.py`, `core/container/tests/test_llm_chain_wiring.py` (condensed sketch after this list)
- `ruff check` clean on changed files; `ruff format --check` clean; `lint-imports` green; no new mypy errors
- Chain wiring resolves to `UnavailableLLM` with no keys set and logs the expected INFO message; parser accepts `mistral:mistral-large` and rejects unknown providers/models with the accepted-value list
- `docker compose up` with only `MISTRAL_API_KEY`: confirm `/search/instant` works, `/search/classic` and `/search/agentic` return 503
- `LLM_FALLBACK_CHAIN=mistral:mistral-large` with `MISTRAL_API_KEY`: confirm all three search tiers work
- `LLM_FALLBACK_CHAIN=bogus:foo`: confirm startup fails with an actionable parser error
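A condensed sketch of the first unit checks, with hypothetical import paths; the real tests live in the files listed above:

```python
import asyncio

import pytest

# Hypothetical import paths for illustration only.
from adapters.llm.unavailable import LLMUnavailableError, UnavailableLLM
from domains.search.config import parse_llm_fallback_chain


def test_parser_accepts_known_pair():
    assert parse_llm_fallback_chain("mistral:mistral-large") == [
        ("mistral", "mistral-large")
    ]


def test_parser_rejects_unknown_provider():
    # The error should list accepted values, per the PR description.
    with pytest.raises(ValueError, match="accepted"):
        parse_llm_fallback_chain("bogus:foo")


def test_unavailable_llm_raises_on_use_and_closes_quietly():
    llm = UnavailableLLM()
    with pytest.raises(LLMUnavailableError):
        asyncio.run(llm.complete("hello"))
    asyncio.run(llm.close())  # bare call: the no-op contract is "does not raise"
```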
Summary by cubic

Make the LLM fallback chain lazy and configurable so the app boots without LLM keys; instant search works, and classic/agentic return HTTP 503 until a key is set or a chain is configured. Also removed the unused `weaviate` and `weaviate-client` packages to unblock Poetry 2.x Docker builds.

New Features
- Inject `UnavailableLLM` when no providers resolve, so boot doesn't fail; classic/agentic raise `LLMUnavailableError` mapped to 503.
- Add an `LLM_FALLBACK_CHAIN` env var (`provider:model,provider:model`); the parser validates provider names, model names, and valid provider–model pairs at startup.

Migration
- Set `LLM_FALLBACK_CHAIN` (e.g., `mistral:mistral-large`) to enable classic/agentic; invalid values fail fast with accepted providers/models and valid combinations listed.

Written for commit ada1033. Summary will update on new commits.