feat: add server-side speech-to-text with OpenAI provider #5596

Open
shatfield4 wants to merge 4 commits into master from feat/stt-server-foundation-openai

shatfield4 (Collaborator) commented May 8, 2026

Pull Request Type

  • ✨ feat (New feature)
  • 🐛 fix (Bug fix)
  • ♻️ refactor (Code refactoring without changing behavior)
  • 💄 style (UI style changes)
  • 🔨 chore (Build, CI, maintenance)
  • 📝 docs (Documentation updates)

Relevant Issues

resolves #4732
resolves #3885

Description

  • Adds support for server-side speech-to-text so users can transcribe through cloud providers
  • Lays the foundation for upcoming STT provider PRs (Lemonade, OpenAI Compatible, Deepgram)
  • Adds OpenAI as the first server-side STT provider with a dynamic model picker that filters for transcription models
  • Keeps the same mic button UX as before: recording auto-stops on silence and respects the auto-submit setting, but the audio is transcribed by the chosen provider instead of the browser
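
To illustrate the "dynamic model picker that filters for transcription models" point above, here is a minimal sketch of one way such a filter could work. It is not taken from this PR: the `ModelInfo` shape, the `filterTranscriptionModels` helper, and the id-substring heuristic are all assumptions made for illustration, and the actual implementation may select models differently.

```typescript
// Hypothetical shape of one entry from a provider's model listing.
interface ModelInfo {
  id: string;
}

// Assumed heuristic (not confirmed by the PR): a model is treated as
// transcription-capable when its id contains one of these substrings.
const TRANSCRIPTION_HINTS: string[] = ["whisper", "transcribe"];

// Returns only the models whose id matches one of the hints,
// preserving the original order of the input list.
function filterTranscriptionModels(models: ModelInfo[]): ModelInfo[] {
  return models.filter((model) => {
    const id = model.id.toLowerCase();
    return TRANSCRIPTION_HINTS.some((hint) => id.includes(hint));
  });
}

// Sample input mixing chat, embedding, and transcription model ids.
const sample: ModelInfo[] = [
  { id: "gpt-4o" },
  { id: "whisper-1" },
  { id: "gpt-4o-mini-transcribe" },
  { id: "text-embedding-3-small" },
];

// Logs the ids that would be offered in the picker.
console.log(filterTranscriptionModels(sample).map((m) => m.id));
```

A substring match keeps the picker working when a provider adds new transcription models, at the cost of possible false positives; an explicit allowlist would be the stricter alternative.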

Visuals (if applicable)

Additional Information

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated (if applicable)
  • I have tested my code functionality
  • Docker build succeeds locally

shatfield4 self-assigned this May 8, 2026
shatfield4 requested a review from angelplusultra May 9, 2026 00:22
shatfield4 marked this pull request as ready for review May 9, 2026 00:22

Development

Successfully merging this pull request may close these issues.

[FEAT]: Support Deepgram APIs for TTS and STT in AnythingLLM
[FEAT]: OpenAI Generic for Transcription, STT & TTS

2 participants