MCP + A2A: The Two Standards That Will Eat Voice AI Infrastructure
Two protocol standards are quietly reorganizing how AI systems interact with the world. If you're building AI infrastructure in 2026 and neither of these is in your design, you're building for the last generation of AI tooling.
MCP (Model Context Protocol, from Anthropic) and A2A (Agent-to-Agent, from Google) approach the same fundamental problem from different angles: how should AI systems communicate with external capabilities and with each other? The fact that they come from competing labs and are converging on complementary answers tells you something important about where the industry is heading.
This post covers what each standard does, how WFW implements both, and what the combination makes possible.
MCP: The Tool Standard for LLMs
MCP is Anthropic's open protocol for connecting language models to external tools and data sources. If you've used Claude Code, you've used MCP — every @mention of an MCP server in your context is a live tool call, not a retrieval operation.
The core design: an LLM connects to an MCP server, which exposes a catalog of tools with typed input/output schemas. The LLM can call any of those tools, get structured results back, and reason over them. The server handles auth, execution, and response formatting. The LLM handles intent and composition.
For voice AI, this matters because the alternative is prompting an LLM with verbose text descriptions of what voice APIs can do and hoping it generates valid API calls. MCP makes those capabilities first-class: the LLM sees them, understands their schemas, and can call them with confidence rather than guessing at API shapes.
What the WFW MCP server exposes:
14 tools covering the full /v2/ surface:
provision_agent — POST /v2/agents (with business_url, Workforce Wave-powered)
get_agent — GET /v2/agents/{id}
list_agents — GET /v2/agents (paginated, filterable)
update_agent — PATCH /v2/agents/{id}
get_call_transcript — GET /v2/calls/{id}/transcript
list_calls — GET /v2/calls (filterable by agent, date, disposition)
get_extractions — GET /v2/calls/{id}/extractions
sync_knowledge_base — POST /v2/agents/{id}/kb/sync
propose_kb_update — POST /v2/agents/{id}/kb/propose
list_webhooks — GET /v2/webhooks
create_webhook — POST /v2/webhooks
get_operation — GET /v2/operations/{id}
scout_research — Fetch agent cards, capability discovery
list_templates — GET /v2/templates (vertical starter kits)
Installing the server takes one command:
claude mcp add workforcewave -- \
npx @workforcewave/mcp-server \
--api-key YOUR_KEY
After that, any Claude session — Code, Claude.ai desktop, or any MCP-compatible orchestrator — can manage voice agents as naturally as it manages files or code. "Provision a dental agent for this URL" becomes a single tool call, not a four-step API walkthrough.
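Under the hood, each MCP tool maps onto one of the plain REST endpoints listed above. As a minimal sketch, here is what building the request behind provision_agent might look like in TypeScript. Only the endpoint path and the business_url field come from the tool list; the Bearer auth scheme and the helper name are assumptions for illustration.

```typescript
// Sketch of the REST request that the provision_agent MCP tool wraps.
// Assumed: Bearer auth via the Authorization header; JSON request body.
const WFW_BASE = "https://api.workforcewave.com";

function buildProvisionRequest(businessUrl: string, apiKey: string) {
  return {
    url: `${WFW_BASE}/v2/agents`,
    method: "POST" as const,
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
    },
    // business_url comes straight from the provision_agent tool description
    body: JSON.stringify({ business_url: businessUrl }),
  };
}

// In a live session you would hand this to fetch():
//   const res = await fetch(req.url, req);
const req = buildProvisionRequest("https://summitortho.com", "YOUR_KEY");
console.log(req.method, req.url);
```

The point of the MCP server is that the LLM never has to assemble this request by hand; the tool schema does it.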
A2A: The Interop Standard for Agents
While MCP handles tool access for LLMs, A2A handles something different: how agents from different systems communicate with each other when one agent needs to invoke another.
Google's A2A spec (now broadly adopted) defines:
- Agent cards — structured JSON documents describing what an agent can do, served at /.well-known/agent.json
- Task passing — how one agent sends a structured request to another
- Authentication — how agents establish trust before interacting
- Capability negotiation — how a calling agent discovers whether the target supports a given request type before committing to a call
The mental model: A2A is to agents what OpenAPI specs are to REST APIs. Before you call an API, you read the spec. Before an AI agent calls another agent, it reads the agent card.
WFW's A2A implementation:
Every WFW agent gets an agent card automatically at:
GET https://api.workforcewave.com/v2/agents/{agent_id}/.well-known/agent.json
A typical dental agent's card:
{
"agent_id": "agt_lksd_dental_001",
"spec_version": "a2a-1.0",
"business_name": "Lakeside Family Dental",
"phone_number": "+18435551234",
"capabilities": {
"voice": true,
"structured_requests": true,
"supported_request_types": [
"appointment_schedule",
"appointment_reschedule",
"appointment_cancel",
"availability_query",
"hours_inquiry",
"insurance_inquiry"
],
"auth_required": true,
"auth_methods": ["bearer_token", "sip_header"],
"dual_mode": true,
"detection_latency_ms": 500
},
"structured_request_schema": {
"appointment_reschedule": {
"required": ["appointment_id", "patient_id"],
"optional": ["preferred_window", "flexibility", "authorization_token"]
}
},
"response_format": "json",
"trust_levels": {
"registered_callers": "full_access",
"unregistered_callers": "voice_only"
}
}
Any A2A-compatible calling agent can fetch this card before dialing. It knows whether structured requests are supported, what request types are available, and what auth mechanism to use — before a single byte of audio is transmitted.
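That pre-flight check can be captured in a few lines. This is an illustrative sketch, not part of the A2A spec: the type models only the subset of the card shown above, and the helper name is invented.

```typescript
// Minimal pre-flight check a calling agent might run against a WFW
// agent card before dialing. Card shape follows the dental example
// above; only the fields needed for the check are typed.
type AgentCard = {
  capabilities: {
    structured_requests: boolean;
    supported_request_types: string[];
    auth_required: boolean;
  };
};

function canSendStructured(card: AgentCard, requestType: string): boolean {
  return (
    card.capabilities.structured_requests &&
    card.capabilities.supported_request_types.includes(requestType)
  );
}

// In a live system, fetch the card first:
//   const card: AgentCard = await (await fetch(
//     "https://api.workforcewave.com/v2/agents/agt_lksd_dental_001/.well-known/agent.json"
//   )).json();
```

If the check fails, the caller falls back to plain voice rather than wasting a structured attempt on an agent that cannot parse it.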
The Compound Effect
Separately, MCP and A2A are useful. Together, they create something qualitatively different.
Consider a Claude instance with both the WFW MCP server installed and knowledge of the A2A protocol. What it can now do:
- Use MCP to provision an agent — a provision_agent tool call with a business URL → agent is live
- Use MCP to configure it — update the KB, set policies, configure webhooks
- Use scout_research to fetch the agent card — discover the agent's A2A capabilities
- Call the agent via A2A — pass structured requests directly, get structured responses
This is a fully autonomous voice workflow, orchestrated by an AI, using two open standards. No human had to click through a dashboard. No custom integration had to be written. The protocols did the work.
A Real Claude Code Session
Here's what a complete autonomous voice agent workflow looks like in a Claude Code session with the WFW MCP server installed:
> Provision a voice agent for Summit Orthodontics at summitortho.com, then
run a test by scheduling a consultation for a patient named Taylor Reed,
DOB 1990-08-14, for any available Friday in the next two weeks.
[Claude calls: provision_agent]
→ operation_id: op_xyz123, status: pending, estimated_seconds: 90
[Claude calls: get_operation (polling)]
→ status: active, agent_id: agt_summit_orth_001
[Claude calls: scout_research, agent_id: agt_summit_orth_001]
→ capabilities: {structured_requests: true, supported: ["appointment_schedule", ...]}
[Claude initiates A2A call to the agent's phone number with bearer token]
→ Request: {action: "appointment_schedule", patient: {name: "Taylor Reed", dob: "1990-08-14"}, ...}
→ Response: {status: "confirmed", slot: "2026-06-27T10:00:00", confirmation: "SMT-2026-4411"}
[Claude calls: get_call_transcript, for the just-completed call]
→ Returns full turn-by-turn transcript
Done. Agent provisioned. Test appointment booked. Full transcript available.
That entire workflow — provisioning through live test call — happened without leaving a chat session, without writing API wrapper code, and without any human clicking through a dashboard.
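The get_operation polling step in that session follows an ordinary poll-until-terminal loop. Here is a sketch with the operation getter injected so the loop itself is plain logic; the Operation shape mirrors the session output above, while the function name, the attempt cap, and the elided sleep are assumptions.

```typescript
// Mirrors the session output: provision_agent returns an operation_id,
// and get_operation is polled until status leaves "pending".
type Operation = {
  operation_id: string;
  status: "pending" | "active" | "failed";
  agent_id?: string;
};

// In production the getter would call GET /v2/operations/{id}, and you
// would sleep between attempts (the provisioning estimate above is ~90s);
// both are elided in this sketch.
function pollUntilDone(getOperation: () => Operation, maxAttempts = 30): Operation {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const op = getOperation();
    if (op.status !== "pending") return op;
  }
  throw new Error("operation did not complete within the polling budget");
}
```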
Why This Matters Beyond Convenience
The MCP + A2A combination isn't just a better developer experience. It changes what's possible in agentic systems.
Right now, most multi-agent systems are tightly coupled: agent A knows about agent B because a developer hardcoded that relationship. MCP + A2A make that relationship discoverable and dynamic. An orchestrator can find WFW agents it's never explicitly been told about, read their capabilities, and route tasks to them appropriately.
For enterprise deployments, this means voice AI becomes part of the broader AI tool ecosystem — not a siloed system that only gets used when someone explicitly calls it. An AI that manages operations for a hospital system can discover that each department has a WFW agent, learn what each one supports, and route phone-based tasks automatically based on agent card data.
That's the path where voice infrastructure becomes ambient capability rather than a product you navigate to.
Implementation Details for Builders
If you're building a system that will call WFW agents via A2A, the checklist is:
Before the call:
- Fetch the agent card at /.well-known/agent.json
- Verify structured_requests: true and that your request type is in supported_request_types
- Register your calling system via POST /v2/callers/register to get a pre-negotiated bearer token
During the call:
- Include X-WFW-Bot-Token in SIP headers with your bearer token
- Send your structured request in the first message after receiving the bot-mode acknowledgment
- Parse the response JSON — it will always follow the documented schema for your request type
After the call:
- Poll GET /v2/calls/{id}/extractions for structured outcomes
- Subscribe to the call.extractions_ready webhook if you want push notifications instead of polling
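One more pre-call check worth automating: validating your payload against the card's structured_request_schema before you dial. A sketch, with the helper name invented for illustration and the schema shape taken from the dental card earlier in the post:

```typescript
// Validate a structured-request payload against the required/optional
// field lists in an agent card's structured_request_schema.
type RequestSchema = Record<string, { required: string[]; optional?: string[] }>;

function missingFields(
  schema: RequestSchema,
  requestType: string,
  payload: Record<string, unknown>,
): string[] {
  const spec = schema[requestType];
  if (!spec) return [`unsupported request type: ${requestType}`];
  return spec.required.filter((field) => !(field in payload));
}
```

An empty return means the payload satisfies the card's contract; anything else is a reason to abort before placing the call.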
The full A2A spec for WFW is in the API reference. The MCP server is at npx @workforcewave/mcp-server.
Next in this series: Build a Voice Agent in 90 Seconds with the WFW API — a step-by-step tutorial from zero to deployed agent, no SDK required.