fix(code-index): limit embedding batch size by item count to prevent provider 422 errors #430
…provider 422 errors

Embedders batched by token count only (MAX_BATCH_TOKENS=100000). When many small code segments are sent, they all fit in one token-based batch (e.g., 60 segments × ~100 tokens = 6000 < 100000), resulting in API calls with 60+ items. Providers like qwen3-embedding via LiteLLM enforce a maximum of 32 items per call, returning HTTP 422.

Added MAX_BATCH_ITEMS=32 constant and applied it as an additional batching constraint alongside the existing token limit in all four embedders:
- OpenAICompatibleEmbedder
- OpenAIEmbedder
- OpenRouterEmbedder
- BedrockEmbedder

The setting codeIndex.embeddingBatchSize (scanner-level) remains configurable and defaults to 60, but the embedder-level limit ensures no single API call exceeds 32 items regardless of scanner batch size.

Closes: Zoo-Code-Org#335
Walkthrough

A maximum embedding batch item limit of 32 is defined as a constant and enforced across Bedrock, OpenAI, OpenAI-compatible, and OpenRouter embedders. Each embedder's batch-building logic now checks both token capacity and item-count capacity before adding texts to a batch.

Estimated code review effort: 2 (Simple), ~10 minutes
Codecov Report

All modified and coverable lines are covered by tests.
Related GitHub Issue
Closes: #335
Description
Root cause: Embedders batched by token count only (`MAX_BATCH_TOKENS=100000`). When many small code segments are sent, they all fit in one token-based batch (e.g., 60 segments × ~100 tokens = 6000 < 100000), resulting in API calls with 60+ items. Providers like `qwen3-embedding` via LiteLLM enforce a maximum of 32 items per call, returning HTTP 422.

Fix:
- Added a `MAX_BATCH_ITEMS = 32` constant in `src/services/code-index/constants/index.ts`
- Applied it as an additional batching constraint alongside the existing token limit in all four embedders:
  - `OpenAICompatibleEmbedder` (the one reported in [BUG] Batch Size Exceeded #335)
  - `OpenAIEmbedder`
  - `OpenRouterEmbedder`
  - `BedrockEmbedder`

The batching condition changed from a token-only check to one that requires both remaining token capacity and remaining item capacity before a text is added to a batch.
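As a rough illustration of the two-constraint batching described in this PR (not the actual diff), the sketch below splits texts into batches that respect both a token limit and an item limit. The constant names and values come from the description; `buildBatches`, `estimateTokens`, and the 4-characters-per-token heuristic are hypothetical stand-ins, not the embedders' real code.

```typescript
// Hypothetical sketch: constant names mirror the PR description, but the
// functions and the token estimate below are illustrative assumptions.
const MAX_BATCH_TOKENS = 100000
const MAX_BATCH_ITEMS = 32

// Crude stand-in for a tokenizer: roughly 4 characters per token (assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Split texts into batches that respect both the token and the item limit.
// A single text larger than maxTokens still gets its own batch here; a real
// embedder may prefer to skip or truncate it instead.
function buildBatches(
  texts: string[],
  maxTokens: number = MAX_BATCH_TOKENS,
  maxItems: number = MAX_BATCH_ITEMS,
): string[][] {
  const batches: string[][] = []
  let current: string[] = []
  let currentTokens = 0

  for (const text of texts) {
    const tokens = estimateTokens(text)
    const fitsTokens = currentTokens + tokens <= maxTokens
    const fitsItems = current.length < maxItems

    // Close the current batch when either limit would be exceeded.
    if (current.length > 0 && !(fitsTokens && fitsItems)) {
      batches.push(current)
      current = []
      currentTokens = 0
    }
    current.push(text)
    currentTokens += tokens
  }

  if (current.length > 0) batches.push(current)
  return batches
}

// 60 small segments of ~100 tokens each (400 characters under the estimate).
const segments = Array.from({ length: 60 }, () => "x".repeat(400))
const batchSizes = buildBatches(segments).map((b) => b.length)
console.log(batchSizes) // two batches: 32 items, then 28
```

With a token-only check, all 60 segments would land in a single batch, since 6000 tokens is far below the 100000-token limit. Adding the item check is what caps each call at 32 items.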
The existing `codeIndex.embeddingBatchSize` setting (scanner-level, default 60) remains configurable. The embedder-level limit ensures no single API call exceeds 32 items regardless of scanner batch size.

Test Procedure
Pre-Submission Checklist