Glossary

RTP (Real-time Transport Protocol)

A network protocol enabling real-time transmission of voice and video data with minimal latency.

Real-time Transport Protocol (RTP), standardized in RFC 3550, is an internet protocol that carries voice, video, and other time-sensitive media across IP networks with low latency. It usually runs over UDP and tolerates occasional packet loss rather than waiting for retransmissions. RTP is the foundation of VoIP (Voice over IP), softphones, and cloud-based voice platforms, including AI voice agents.

How RTP Works

RTP encapsulates audio or video data into packets whose headers carry timing information (timestamps), sequence numbers, and source identifiers. Receivers use these fields to put packets back in the correct order, detect losses, and synchronize playback, even if packets arrive out of order or with variable delays.
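
As an illustration of the header fields described above, here is a minimal Python sketch that decodes the 12-byte fixed RTP header defined in RFC 3550. The names `RtpHeader` and `parse_rtp_header` are made up for this example, and the sample packet is synthetic; header extensions and CSRC lists are not handled.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,              # top 2 bits
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,       # 0 = PCMU (G.711 mu-law) in RFC 3551
        sequence_number=seq,          # 16-bit, wraps at 65536
        timestamp=ts,                 # in units of the codec clock rate
        ssrc=ssrc,                    # synchronization source identifier
    )


# Synthetic packet: version 2, payload type 0, followed by 160 bytes of audio.
sample = struct.pack("!BBHII", 0x80, 0, 1000, 160, 0x12345678) + b"\x00" * 160
header = parse_rtp_header(sample)
print(header.sequence_number, header.timestamp, hex(header.ssrc))
# 1000 160 0x12345678
```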

RTP vs. TCP

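  • TCP: Reliable but slower for live media; it retransmits lost packets and delivers data strictly in order, so a single lost packet stalls everything behind it. Not suitable for real-time voice.
  • RTP: Usually carried over UDP, so delivery is fast but best-effort; it accepts some packet loss in exchange for low latency. Ideal for voice, where a few lost milliseconds of audio are barely perceptible but long delays disrupt conversation (ITU-T G.114 recommends keeping one-way delay under about 150 ms).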
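
To make the best-effort trade-off concrete, the sketch below shows how a receiver might count lost packets from RTP sequence numbers instead of waiting for retransmission. The function name `count_lost` is hypothetical, and the code assumes packets arrive in order; a real receiver would also have to handle reordering and duplicates.

```python
def count_lost(sequence_numbers):
    """Count packets missing from an in-order stream of 16-bit RTP sequence numbers."""
    lost = 0
    previous = None
    for seq in sequence_numbers:
        if previous is not None:
            # Modulo arithmetic handles the wrap from 65535 back to 0.
            gap = (seq - previous) % 65536
            if gap > 1:
                lost += gap - 1
        previous = seq
    return lost


# Packets 2 and 3 never arrive; the stream simply moves on.
print(count_lost([65534, 65535, 0, 1, 4]))  # 2
```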

Codec and Bandwidth

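RTP carries audio encoded by codecs such as Opus, G.711, and G.729, which reduce the bitrate of voice while preserving quality. Codec bitrates range from 8 kbps (G.729) to 64 kbps (G.711), with Opus configurable from roughly 6 kbps upward. Packet headers add overhead on top of the codec bitrate, so a G.711 stream sent in 20 ms packets over IPv4 uses about 80 kbps. For comparison, uncompressed CD-quality stereo audio runs at ~1.4 Mbps.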
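
A rough way to estimate the on-the-wire bandwidth of a voice stream is to add per-packet header overhead to the codec bitrate. The sketch below assumes 20 ms packets and 40 bytes of headers per packet (12 RTP + 8 UDP + 20 IPv4); link-layer framing and SRTP overhead are ignored, and `rtp_stream_kbps` is a hypothetical helper name.

```python
def rtp_stream_kbps(codec_kbps, packet_ms=20, header_bytes=40):
    """Estimate IP-level bandwidth of one RTP audio stream, in kbps."""
    packets_per_second = 1000 / packet_ms
    # Bytes of encoded audio carried in each packet.
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


print(rtp_stream_kbps(64))  # G.711: 80.0
print(rtp_stream_kbps(8))   # G.729: 24.0

# Uncompressed CD-quality stereo for comparison: 44.1 kHz x 16 bits x 2 channels.
print(44100 * 16 * 2 / 1000)  # 1411.2
```

Note how the fixed 40-byte header dominates at low codec bitrates: G.729 triples from 8 kbps to 24 kbps once headers are included.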

RTP in AI Voice Agent Calls

When a caller connects to an AI voice agent, the call audio is transmitted via RTP over the internet. The platform routes packets efficiently to minimize latency (sub-800 ms) and handles network impairments gracefully, using jitter buffers to smooth out variable arrival times and packet recovery to cover losses.
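
The jitter buffer mentioned above can be sketched as a small priority queue that holds a few packets before releasing them in sequence order. This is a simplified illustration, not any platform's actual implementation: the `JitterBuffer` class is hypothetical, it buffers by packet count rather than by time, and it ignores sequence-number wraparound and loss concealment.

```python
import heapq


class JitterBuffer:
    """Hold up to `depth` packets so late arrivals can be reordered."""

    def __init__(self, depth=3):
        self.depth = depth
        self._heap = []  # min-heap ordered by sequence number

    def push(self, seq, payload):
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self):
        """Release the oldest packets once more than `depth` are buffered."""
        ready = []
        while len(self._heap) > self.depth:
            ready.append(heapq.heappop(self._heap))
        return ready

    def flush(self):
        """Release everything that is left, in sequence order."""
        remaining = []
        while self._heap:
            remaining.append(heapq.heappop(self._heap))
        return remaining


# Packets arrive slightly out of order; playback order is restored.
played = []
buf = JitterBuffer(depth=2)
for seq in [1, 3, 2, 5, 4, 6]:
    buf.push(seq, b"audio")
    played.extend(s for s, _ in buf.pop_ready())
played.extend(s for s, _ in buf.flush())
print(played)  # [1, 2, 3, 4, 5, 6]
```

A deeper buffer tolerates more reordering and jitter but adds delay, which is the same latency-versus-quality trade-off that runs through the rest of this entry.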

Security: SRTP

Standard RTP does not encrypt its payload. SRTP (Secure RTP, RFC 3711) adds AES encryption, message authentication, and replay protection, protecting voice content from eavesdropping and tampering. Compliance-sensitive industries (healthcare, finance) typically require SRTP.
