Feature request: low-latency raw audio stream for agent/action use cases #4

## Request

Please expose a supported low-latency audio stream from Bee, ideally as raw audio frames/chunks or an equivalent realtime websocket/webhook/API. This would make Bee usable as the front end for personal agents and action automation, not only retrospective notes/transcripts.

## Current limitation

Today the public CLI path appears to be transcript-first:

- `bee stream --json --types new-utterance` emits utterance events, but the realtime docs describe the stream as at-most-once delivery, so an event missed during a disconnect is not redelivered.
- `bee now --json` can backfill, but it is polling-oriented and can lag behind speech.
- In practice, action workflows that wait for processed utterances can miss commands or receive them too late for voice-assistant UX.
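
To illustrate the workaround this forces today, here is a minimal sketch of reconciling the live stream with a polled backfill on the client side. It assumes both commands emit one JSON object per line and that each event carries a stable `id` field; neither assumption is confirmed by the docs, and `parse_events` and `reconcile` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator, List, Set


def parse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield JSON objects from newline-delimited text, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial lines from a dropped connection
        if isinstance(event, dict):
            yield event


def reconcile(live: Iterable[dict], backfill: Iterable[dict], seen: Set[str]) -> List[dict]:
    """Return events not handled yet (live first, then backfill) and record their IDs in `seen`."""
    fresh = []
    for event in list(live) + list(backfill):
        uid = event.get("id")  # ASSUMPTION: hypothetical stable utterance ID field
        if uid is None or uid in seen:
            continue
        seen.add(uid)
        fresh.append(event)
    return fresh
```

Even with this in place, a command dropped by the stream is only recovered on the next poll, which is the latency problem described above.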

For my setup, Bee transcribes speech, a VPS ingests it, and a local agent routes explicit wake-word commands like "Hermes ..." to approved actions. This works for slow tasks, but the pipeline is too brittle for Jarvis-style voice control because the agent cannot get audio bytes or low-latency transcript deltas directly.
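
For context, the agent-side routing is roughly the following: a wake-word gate followed by an allowlist lookup. This is a simplified sketch rather than the actual agent code; the action names and the `route` function are hypothetical.

```python
import re
from typing import Optional

# ASSUMPTION: illustrative allowlist; the command phrases and action names are made up.
ALLOWED_ACTIONS = {
    "turn on the lights": "lights_on",
    "start a timer": "timer_start",
}

# The utterance must begin with the wake word as a whole word.
_WAKE = re.compile(r"^\s*hermes\b[\s,:.!-]*(.*)$", re.IGNORECASE | re.DOTALL)


def route(utterance: str) -> Optional[str]:
    """Return the approved action for a wake-word command, or None if it is not allowed."""
    match = _WAKE.match(utterance)
    if match is None:
        return None  # no wake word: ignore ordinary conversation
    # Normalize case, whitespace, and trailing punctuation before the lookup.
    command = " ".join(match.group(1).lower().split()).rstrip(".!?")
    return ALLOWED_ACTIONS.get(command)
```

Anything that is not an exact allowlisted command after the wake word is dropped, so a lower-latency input would not widen what the agent is permitted to do.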

## Desired API shape

Any one of these would help:

- WebSocket or webhook delivering PCM/Opus/AAC audio chunks with timestamps.
- Realtime transcript deltas with stable utterance IDs and delivery acknowledgements.
- A local/device stream from the Bee CLI that can be consumed by an agent process.
- A documented latency target and clear ordering/deduplication semantics.
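
As a concrete example of the ordering, deduplication, and acknowledgement semantics I have in mind, here is a sketch of a consumer for a hypothetical delta stream. The field names (`utterance_id`, `seq`, `captured_at_ms`) and the cumulative-ack scheme are my assumptions, not an existing Bee API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class TranscriptDelta:
    utterance_id: str    # ASSUMPTION: stable across redeliveries
    seq: int             # ASSUMPTION: per-stream counter starting at 0
    captured_at_ms: int  # capture time on the device
    text: str


@dataclass
class Consumer:
    next_seq: int = 0
    delivered: List[TranscriptDelta] = field(default_factory=list)
    _pending: Dict[int, TranscriptDelta] = field(default_factory=dict)

    def receive(self, delta: TranscriptDelta) -> int:
        """Buffer a delta and return the cumulative ack (highest contiguous seq, -1 if none)."""
        if delta.seq >= self.next_seq:
            # Keep the first copy of each seq; redeliveries are ignored.
            self._pending.setdefault(delta.seq, delta)
        # Release buffered deltas in order once the gap before them is filled.
        while self.next_seq in self._pending:
            self.delivered.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return self.next_seq - 1
```

With a cumulative ack like this, the server could resend everything after the last acknowledged `seq` on reconnect, which would turn at-most-once delivery into at-least-once delivery with client-side deduplication.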

Ideal target: sub-second to a few seconds end-to-end from speech to agent callback. Raw audio access would also allow users to run their own wake-word detection, ASR, latency measurement, and fallback routing without waiting for post-processing.
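
If events carried a capture timestamp, checking the latency target on my side would be straightforward. The sketch below assumes each sample is a `(captured_at_ms, received_at_ms)` pair with reasonably synchronized clocks; `latency_report` is a hypothetical helper, and the percentile uses the nearest-rank method.

```python
import math
from typing import Dict, Iterable, List, Tuple


def percentile(values: List[int], pct: float) -> int:
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_report(samples: Iterable[Tuple[int, int]]) -> Dict[str, int]:
    """Summarize speech-to-callback latency in milliseconds."""
    # ASSUMPTION: captured_at_ms comes from the device, received_at_ms from the agent.
    latencies = [received - captured for captured, received in samples]
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "max_ms": max(latencies),
    }
```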

## Safety/use case

This is for the owner’s own Bee device and opt-in personal automation. The agent side can keep a wake-word gate plus an allowlist of actions. I am not asking for other users’ data or for a way to bypass consent controls, just an official way to stream my own captured audio or lower-latency speech events into my own automation stack.

