OGG_OPUS (and other compressed formats) crash with RuntimeError — privHeader never set in AudioStreamFormatImpl constructor #981

@raghuram-rng

Bug Description

Using AudioFormatTag.OGG_OPUS (or any other compressed format: WEBM_OPUS, MP3, FLAC, etc.) with AudioInputStream.createPushStream() causes an immediate crash with AzureClientCancelled, reason=Error, code=RuntimeError before any audio is sent.

Root Cause

In the AudioStreamFormatImpl constructor (src/sdk/Audio/AudioStreamFormat.ts), isWavFormat stays true only for PCM, ALaw, and MuLaw. Every other format, including OGG_OPUS, falls into the default branch, which sets isWavFormat = false.

The privHeader field is only assigned inside if (isWavFormat), so for OGG_OPUS it is never set and remains undefined at runtime despite the TypeScript type declaration protected privHeader: ArrayBuffer (non-optional) promising otherwise.

switch (format) {
    case AudioFormatTag.PCM:   this.formatTag = 1; break;
    case AudioFormatTag.ALaw:  this.formatTag = 6; break;
    case AudioFormatTag.MuLaw: this.formatTag = 7; break;
    default:
        isWavFormat = false;   // ← OGG_OPUS, WEBM_OPUS, MP3, FLAC, etc. all land here
}
// ...
if (isWavFormat) {
    this.privHeader = new ArrayBuffer(44);  // ← never reached for OGG_OPUS
}
// privHeader === undefined for all compressed formats

When startContinuousRecognitionAsync() runs, ServiceRecognizerBase.sendWaveHeader() reads format.header, which returns the undefined privHeader, and tries to build a SpeechConnectionMessage from it. Internally, that message reads payload.byteLength, which throws:

TypeError: Cannot read properties of undefined (reading 'byteLength')
  → promise rejects
  → SDK fires canceled event: AzureClientCancelled, reason=Error, code=RuntimeError
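
The failure can be reproduced without the SDK. The sketch below is a minimal model of the mechanism described above, not the SDK source: FakeFormat, buildMessage, and trySendHeader are hypothetical stand-ins for AudioStreamFormatImpl, the SpeechConnectionMessage construction, and sendWaveHeader() respectively.

```typescript
// Hypothetical stand-in for AudioStreamFormatImpl (NOT the SDK's real class).
class FakeFormat {
    // Declared non-optional, but only assigned on the WAV path.
    protected privHeader!: ArrayBuffer;

    public constructor(isWavFormat: boolean) {
        if (isWavFormat) {
            this.privHeader = new ArrayBuffer(44);
        }
    }

    public get header(): ArrayBuffer {
        return this.privHeader;
    }
}

// Hypothetical stand-in for a message builder that inspects the payload size.
function buildMessage(payload: ArrayBuffer): { size: number } {
    return { size: payload.byteLength };
}

// Hypothetical stand-in for sendWaveHeader(): reports what happened instead of rejecting.
function trySendHeader(format: FakeFormat): string {
    try {
        const message = buildMessage(format.header);
        return `sent ${message.size} bytes`;
    } catch (error) {
        return error instanceof TypeError ? "TypeError" : "other error";
    }
}

const wavResult = trySendHeader(new FakeFormat(true));
const compressedResult = trySendHeader(new FakeFormat(false));
console.log(wavResult, compressedResult);
```

The WAV path sends a 44-byte header, while the compressed path fails on the byteLength read because the type annotation alone does not give the field a value at runtime.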

Reproduction

import * as sdk from 'microsoft-cognitiveservices-speech-sdk';
import { AudioFormatTag, AudioStreamFormatImpl } from 'microsoft-cognitiveservices-speech-sdk/distrib/lib/src/sdk/Audio/AudioStreamFormat';

const format = new AudioStreamFormatImpl(16000, 16, 1, AudioFormatTag.OGG_OPUS);
console.log(format.header); // undefined — should be ArrayBuffer

const pushStream = sdk.AudioInputStream.createPushStream(format);
const config = sdk.SpeechConfig.fromSubscription('KEY', 'REGION');
const audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
const recognizer = new sdk.SpeechRecognizer(config, audioConfig);

recognizer.canceled = (_sender, e) => console.log('CANCELLED:', e.errorDetails);
recognizer.startContinuousRecognitionAsync();
// → canceled fires with "RuntimeError" before any audio is sent

Expected Behaviour

AudioFormatTag.OGG_OPUS is defined in the enum and the constructor accepts it as a parameter. It should not crash. For compressed formats where no WAV header is needed, privHeader should be set to an empty ArrayBuffer(0) so the SDK can safely send a zero-byte header message (which Azure ignores) and then stream the compressed audio data directly.

Proposed Fix

In the constructor, change:

if (isWavFormat) {
    this.privHeader = new ArrayBuffer(44);
    // ... build WAV header
}

To:

if (isWavFormat) {
    this.privHeader = new ArrayBuffer(44);
    // ... build WAV header
} else {
    // Compressed formats (OGG_OPUS, WEBM_OPUS, MP3, FLAC, etc.) don't send a WAV header.
    // Set to empty ArrayBuffer so sendWaveHeader() doesn't crash on payload.byteLength.
    this.privHeader = new ArrayBuffer(0);
}

This is a one-statement fix. Alternatively, the privHeader field declaration could be changed to protected privHeader: ArrayBuffer = new ArrayBuffer(0), so that every instance starts with a safe default and the WAV branch overwrites it.
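
For illustration, here is a rough sketch of the field-initializer alternative on a simplified model of the class. SketchFormat and the FormatTag union are hypothetical names used only for this example; the real constructor takes more parameters and fills the 44 bytes with the WAV header fields.

```typescript
// Hypothetical simplified format tags; the SDK uses the AudioFormatTag enum instead.
type FormatTag = "PCM" | "ALaw" | "MuLaw" | "OGG_OPUS";

// Hypothetical simplified model of the class, not the SDK source.
class SketchFormat {
    // Safe default: every instance starts with a zero-length header.
    protected privHeader: ArrayBuffer = new ArrayBuffer(0);

    public constructor(format: FormatTag) {
        const isWavFormat =
            format === "PCM" || format === "ALaw" || format === "MuLaw";
        if (isWavFormat) {
            // The real constructor writes the WAV header fields into this buffer.
            this.privHeader = new ArrayBuffer(44);
        }
    }

    public get header(): ArrayBuffer {
        return this.privHeader;
    }
}

const pcmHeaderLength = new SketchFormat("PCM").header.byteLength;
const opusHeaderLength = new SketchFormat("OGG_OPUS").header.byteLength;
console.log(pcmHeaderLength, opusHeaderLength);
```

The field initializer runs before the constructor body, so the WAV formats still end up with a 44-byte buffer and every other format keeps the empty one. Compared with the else branch, this also covers any future code path that returns early from the constructor.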

Verified Workaround

We have verified that setting privHeader to an empty ArrayBuffer(0) after construction fixes the crash and Azure successfully receives and decodes OGG/Opus audio:

const format = new AudioStreamFormatImpl(48000, 16, 2, AudioFormatTag.OGG_OPUS);
(format as any).privHeader = new ArrayBuffer(0); // workaround for SDK bug
const pushStream = sdk.AudioInputStream.createPushStream(format);
// Works — Azure GStreamer auto-detects OGG container from stream data
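
If the workaround has to live in application code for a while, it may be worth guarding it so it becomes a no-op once the SDK sets the header itself. The helper below is only a sketch: ensureHeader, HeaderCarrier, and StubFormat are hypothetical names, and it assumes the SDK object exposes a header getter backed by a privHeader field as described in this issue. It has been exercised here only against the stub, not against the real AudioStreamFormatImpl.

```typescript
// Hypothetical minimal shape: anything exposing a `header` getter.
interface HeaderCarrier {
    readonly header: ArrayBuffer | undefined;
}

// Hypothetical helper: applies the workaround only when the header is missing.
// `privHeader` is an internal SDK detail and may change between versions.
function ensureHeader<T extends HeaderCarrier>(format: T): T {
    if (format.header === undefined) {
        (format as unknown as { privHeader: ArrayBuffer }).privHeader = new ArrayBuffer(0);
    }
    return format;
}

// Stand-in with the same header/privHeader relationship described in the issue.
class StubFormat {
    protected privHeader: ArrayBuffer | undefined;

    public constructor(initial?: ArrayBuffer) {
        this.privHeader = initial;
    }

    public get header(): ArrayBuffer | undefined {
        return this.privHeader;
    }
}

const patched = ensureHeader(new StubFormat());
const untouched = ensureHeader(new StubFormat(new ArrayBuffer(44)));
console.log(patched.header?.byteLength, untouched.header?.byteLength);
```

An existing header is left alone, so WAV formats and fixed SDK versions are unaffected by the guard.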

Environment

  • SDK version: 1.48.0 (also present in latest master)
  • Runtime: Node.js
  • Platform: macOS / Linux
