
Add CI test pipeline with Key Vault secrets and log redaction #165

Merged: rhurey merged 5 commits into microsoft:master from mohamed-zaki-coding:feature/ci-pipeline on Mar 10, 2026

mohamed-zaki-coding commented Feb 25, 2026

PR: Add CI Test Pipeline with Key Vault Secrets and Log Redaction

Summary

Adds an Azure DevOps CI pipeline step that runs Go tests against the Speech SDK, using subscription credentials securely fetched from Azure Key Vault. Implements multi-layered secret redaction to prevent credential leaks in CI build logs.

Motivation

The upstream Go SDK pipeline (azure-pipelines.yml) builds the SDK but does not run any tests. This PR extends the pipeline to execute go test ./speech with live service credentials, following the same Key Vault and redaction patterns used by the JavaScript SDK CI.

Architecture

Secret Flow

Azure Key Vault (CarbonSDK-CICD)
  └─ CarbonSubscriptionsJson (secret)
       │
       ▼
AzureKeyVault@2 task (generate-subscription-file.yml)
       │
       ▼
file-creator@6 → secrets/test.subscriptions.regions.json
       │
       ▼
load-build-secrets.sh (jq extraction)
       │
       ▼
SPEECH_SUBSCRIPTION_KEY / SPEECH_SUBSCRIPTION_REGION env vars
       │
       ▼
go test ./speech 2>&1 | global_redact (perl streaming filter)

Redaction Layers

| Layer | Mechanism | What It Catches |
|---|---|---|
| 1. set +x guard | Disables bash trace mode before any secret is assigned | Prevents + export SPEECH_SUBSCRIPTION_KEY=... from appearing in logs |
| 2. global_redact pipe | Perl streaming regex replaces known secrets with *** | Catches secrets in Go test output, SDK debug traces, error messages |
| 3. URL-encoded variant | Adds URL-encoded key to redaction array | Catches secrets embedded in HTTP URLs / query parameters |
| 4. Go-level redactSecrets() | In-process string replacement in test teardown | Catches secrets in Go's internal memory log buffer before t.Log() |

Files Changed (8 files)

New Files

| File | Description |
|---|---|
| ci/generate-subscription-file.yml | Azure DevOps template: fetches CarbonSubscriptionsJson from Key Vault and writes it to secrets/test.subscriptions.regions.json |
| ci/load-build-secrets.sh | Bash script: loads secrets from JSON via jq, exports env vars, defines redact_input_with (perl filter) and global_redact wrapper |

Modified Files

| File | Change |
|---|---|
| ci/azure-pipelines.yml | Added generate-subscription-file.yml template reference + "Run Go tests" step that sources load-build-secrets.sh and pipes test output through global_redact |
| speech/translation_recognizer_test.go | Added redactSecrets(s string) string helper; used in setup() teardown to sanitize memory log output. Fixed flaky Recognizing event assertions: buffered channels, non-fatal hypothesis checks, increased timeouts |
| speech/conversation_transcriber_test.go | Uses redactSecrets() in setupConversation() teardown; trailing whitespace normalization |
| speech/speech_recognizer_test.go | Fixed flaky Recognizing event assertions: buffered channels, non-fatal hypothesis checks, increased timeouts |
| .gitignore | Added secrets/, test.subscriptions.regions.json, test.certificates.json entries |

Upstream References

Key Design Decisions

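  1. source instead of bash: The test step uses source ci/load-build-secrets.sh (not bash ci/load-build-secrets.sh) so that the functions the script defines (global_redact, redact_input_with) and the variables it exports remain available in the current shell for the pipe. Running it with bash would start a child process, and its definitions would be gone when that process exits.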

  2. Perl streaming filter: Uses perl -lpe with IO::Handle autoflush for line-by-line real-time redaction, avoiding buffering delays in CI output.

  3. set +x before secrets: Placed before any secret access to prevent bash trace mode from echoing secret values, even if a developer has added set -x earlier in the script for debugging.

  4. Go-level redactSecrets(): Belt-and-suspenders approach. Even though global_redact handles stdout, t.Log() output is buffered by Go and flushed when the test ends. redactSecrets() scrubs the SDK memory log text in teardown before it is passed to t.Log(), so secrets never reach that buffer.

  5. URL-encoded key variant: Subscription keys embedded in WebSocket/HTTP URLs would bypass plain-text redaction. Adding the URL-encoded form to the redaction array covers this attack vector.

Pipeline Triggers

  • Push to master: Automatic build + test
  • Weekly schedule: Saturday 18:00 UTC (cron: "0 18 * * 6")

Testing

Local Verification

$ go build ./...                    # ✅ PASS (exit 0)
$ go test ./speech -run "TestConversation" -v
  TestConversationTranscriberCreation            ✅ PASS (0.03s)
  TestConversationTranscriberProperties          ✅ PASS (0.01s)
  TestConversationTranscriberEvents              ✅ PASS (11.01s)
  TestConversationTranscriberSingleSpeaker       ✅ PASS (1.00s)
  TestConversationTranscriberContinuousRecog     ✅ PASS (7.59s)

$ go test ./speech -run "TestTranslation" -v
  TestTranslationSessionEvents                   ✅ PASS (0.57s)
  TestTranslationRecognizeOnce                   ✅ PASS (0.78s)
  TestTranslationContinuousRecognition           ✅ PASS (1.72s)
  TestTranslationContinuousRecognitionEos        ✅ PASS (1.77s)
  TestTranslationSynthesis                       ✅ PASS (0.69s)

$ go test ./speech -v               # Full suite: 29/29 PASS (33.1s)

Two previously flaky translation tests (TestTranslationRecognizeOnce, TestTranslationContinuousRecognitionEos) were fixed by buffering event channels and making Recognizing (hypothesis) event assertions non-fatal, because the Speech service may skip hypothesis messages for short audio or under load. Verified stable with -count=10 (10/10 PASS).

YAML Validation

Both ci/azure-pipelines.yml and ci/generate-subscription-file.yml validated successfully with a Python YAML parser.

Bash Script Validation

ci/load-build-secrets.sh uses set -euo pipefail with proper error handling for missing files, missing jq, and null/empty secret values.

The Speech service may skip Translation.Hypothesis (Recognizing) events
for short audio or under service load, jumping straight to the final
Translation.Phrase (Recognized). Tests were treating these as mandatory,
causing intermittent failures.

Changes:
- Buffer event channels (cap 1 for RecognizeOnce, cap 10 for continuous)
- Wait for Recognized events first (guaranteed by the service)
- Check Recognizing events non-fatally via default/drain pattern
- Increase timeouts from 5s to 10-15s for service variability

Verified stable: 29/29 tests PASS, 10/10 on -count=10 for previously
flaky TestTranslationRecognizeOnce.
rhurey merged commit 79d0bc7 into microsoft:master on Mar 10, 2026
2 checks passed