STT (Speech-to-Text)

The technology that converts spoken audio into written text transcripts, either in real time or after a recording ends.

Speech-to-Text (STT), also called Automatic Speech Recognition (ASR), is the technology that converts spoken audio into written text transcripts. STT is the first critical stage of an AI voice agent pipeline: its accuracy and speed directly affect downstream intent detection and response quality.

Modern STT Engines

  • Google Cloud Speech-to-Text: 95%+ accuracy on good-quality audio; supports 125+ languages and dialects.
  • OpenAI Whisper: 95%+ accuracy on good-quality audio; open-source; handles accented speech and background noise well.
  • Amazon Transcribe: 95%+ accuracy on good-quality audio; integrates with AWS services.
  • Microsoft Azure Speech Services: 95%+ accuracy on good-quality audio; enterprise support.

Real-Time vs. Batch STT

  • Real-time (streaming): Transcription happens as the caller speaks, which enables immediate response generation. Accuracy can be slightly lower than batch because the engine has less context to work with, but streaming is essential for live conversation.
  • Batch: Transcription runs after the call completes. It is used for post-call analytics and quality assurance, not live conversation.
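
The difference between the two modes can be sketched in a few lines of Python. This is a toy illustration, not any provider's API: `fake_recognize`, `batch_transcribe`, and `streaming_transcribe` are hypothetical names, and each "audio chunk" is simply a list of words standing in for a slice of audio.

```python
from typing import Iterator, List


def fake_recognize(words: List[str]) -> str:
    """Hypothetical stand-in for an STT engine: joins already-known words."""
    return " ".join(words)


def batch_transcribe(chunks: List[List[str]]) -> str:
    """Batch mode: wait for the whole call, then transcribe once."""
    all_words = [word for chunk in chunks for word in chunk]
    return fake_recognize(all_words)


def streaming_transcribe(chunks: List[List[str]]) -> Iterator[str]:
    """Streaming mode: yield a growing partial transcript after each chunk."""
    heard: List[str] = []
    for chunk in chunks:
        heard.extend(chunk)
        yield fake_recognize(heard)


# Assumed example call, split into three chunks as the caller speaks.
call_audio = [["i", "need", "to"], ["reschedule", "my"], ["appointment"]]

partials = list(streaming_transcribe(call_audio))
final = batch_transcribe(call_audio)

for partial in partials:
    print("partial:", partial)
print("batch:  ", final)
```

In this sketch the streaming path produces usable text after the first chunk, while the batch path produces nothing until all chunks are available. Both end with the same final transcript.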

Accuracy Factors

  • Audio quality: Clear audio with no background noise can reach 99%+ accuracy; poor line quality can reduce accuracy to 85-90%.
  • Acoustic model: A model trained on speakers and accents similar to the callers performs better.
  • Language model: Custom language models that include domain-specific vocabulary (medical, legal, technical terms) improve accuracy.
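
Accuracy percentages like these are commonly derived from word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of reference words. The sketch below assumes that definition, with word accuracy taken as 1 minus WER; the glossary does not state which metric its figures use, and the function name and sample sentences are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Assumed example: one misrecognized word out of ten.
reference = "please schedule my appointment for next tuesday morning at ten"
hypothesis = "please schedule my appointment for next tuesday mourning at ten"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, word accuracy: {1 - wer:.1%}")
# prints "WER: 10.0%, word accuracy: 90.0%"
```

A single wrong word in a ten-word utterance already drops word accuracy to 90%, which is why domain vocabulary and audio quality matter so much for short caller turns.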

STT in Voice Agent Pipelines

Typical STT processing takes 100-300 ms. High-quality voice agents deploy streaming STT so that intent detection and response generation can begin before the caller finishes speaking, which reduces perceived latency.
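
The latency benefit can be illustrated with simple arithmetic. In the sketch below, only the 200 ms STT figure comes from the 100-300 ms range above; the intent and response timings, the `remaining_fraction` parameter, and the function name are assumptions chosen for illustration, not measured values.

```python
def delay_after_speech_ms(
    stt_ms: float,
    intent_ms: float,
    response_ms: float,
    remaining_fraction: float = 1.0,
) -> float:
    """Estimate the gap between the caller going silent and the agent replying.

    remaining_fraction is the share of STT and intent work still left to do
    when the caller stops speaking: 1.0 means nothing ran during speech
    (fully sequential), smaller values mean more work overlapped with speech.
    """
    if not 0.0 <= remaining_fraction <= 1.0:
        raise ValueError("remaining_fraction must be between 0 and 1")
    return (stt_ms + intent_ms) * remaining_fraction + response_ms


STT_MS = 200.0       # midpoint of the 100-300 ms range quoted above
INTENT_MS = 160.0    # hypothetical intent-detection time
RESPONSE_MS = 400.0  # hypothetical response-generation time

sequential = delay_after_speech_ms(STT_MS, INTENT_MS, RESPONSE_MS)
streaming = delay_after_speech_ms(
    STT_MS, INTENT_MS, RESPONSE_MS, remaining_fraction=0.25
)

print(f"sequential: {sequential:.0f} ms")  # prints "sequential: 760 ms"
print(f"streaming:  {streaming:.0f} ms")   # prints "streaming:  490 ms"
```

Under these assumed numbers, overlapping three quarters of the STT and intent work with the caller's speech cuts the perceived delay from 760 ms to 490 ms.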

Workforce Wave STT

Workforce Wave uses streaming STT from industry-leading providers, fine-tuned for voice agent conversations. Custom language models are available for industry-specific vocabulary (healthcare, finance, HR).
