How-To

MCP + A2A: The Two Standards That Will Eat Voice AI Infrastructure

Workforce Wave

April 17, 2026 · 8 min read
#a2a #architecture #developers #mcp

Two protocol standards are quietly reorganizing how AI systems interact with the world. If you're building AI infrastructure in 2026 and neither of these is in your design, you're building for the last generation of AI tooling.

MCP (Model Context Protocol, from Anthropic) and A2A (Agent-to-Agent, from Google) approach the same fundamental problem from different angles: how should AI systems communicate with external capabilities and with each other? The fact that they come from competing labs and are converging on complementary answers tells you something important about where the industry is heading.

This post covers what each standard does, how WFW implements both, and what the combination makes possible.

MCP: The Tool Standard for LLMs

MCP is Anthropic's open protocol for connecting language models to external tools and data sources. If you've used Claude Code, you've used MCP — every @mention of an MCP server in your context is a live tool call, not a retrieval operation.

The core design: an LLM connects to an MCP server, which exposes a catalog of tools with typed input/output schemas. The LLM can call any of those tools, get structured results back, and reason over them. The server handles auth, execution, and response formatting. The LLM handles intent and composition.

For voice AI, this matters because the alternative is prompting an LLM with verbose text descriptions of what voice APIs can do and hoping it generates valid API calls. MCP makes those capabilities first-class: the LLM sees them, understands their schemas, and can call them with confidence rather than guessing at API shapes.
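
Concretely, each entry in an MCP tool catalog is a name, a description, and a JSON Schema for its input. Here's a rough sketch of how provision_agent might be described to the model (the business_url field comes from the catalog below; the exact description text is illustrative, not copied from the actual server):

{
  "name": "provision_agent",
  "description": "Create a new voice agent from a business URL (POST /v2/agents)",
  "inputSchema": {
    "type": "object",
    "properties": {
      "business_url": {
        "type": "string",
        "description": "Public website to research and build the agent from"
      }
    },
    "required": ["business_url"]
  }
}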

What the WFW MCP server exposes:

14 tools covering the full /v2/ surface:

provision_agent     — POST /v2/agents (with business_url, Workforce Wave-powered)
get_agent           — GET /v2/agents/{id}
list_agents         — GET /v2/agents (paginated, filterable)
update_agent        — PATCH /v2/agents/{id}
get_call_transcript — GET /v2/calls/{id}/transcript
list_calls          — GET /v2/calls (filterable by agent, date, disposition)
get_extractions     — GET /v2/calls/{id}/extractions
sync_knowledge_base — POST /v2/agents/{id}/kb/sync
propose_kb_update   — POST /v2/agents/{id}/kb/propose
list_webhooks       — GET /v2/webhooks
create_webhook      — POST /v2/webhooks
get_operation       — GET /v2/operations/{id}
scout_research      — Fetch agent cards, capability discovery
list_templates      — GET /v2/templates (vertical starter kits)

Installing the server takes one command:

claude mcp add workforcewave -- \
  npx @workforcewave/mcp-server \
  --api-key YOUR_KEY

After that, any MCP host (Claude Code, Claude.ai desktop, or any other MCP-compatible orchestrator) can manage voice agents as naturally as it manages files or code. "Provision a dental agent for this URL" becomes a single tool call, not a four-step API walkthrough.
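
The same tools are reachable from code as well as from a chat session, through any MCP client library. Here's a minimal sketch using the TypeScript MCP SDK; the tool name and business_url argument come from the catalog above, while the client name and environment variable are placeholders:

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the WFW MCP server over stdio, the same way `claude mcp add` wires it up.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["@workforcewave/mcp-server", "--api-key", process.env.WFW_API_KEY!],
});

const client = new Client({ name: "wfw-example", version: "0.1.0" });
await client.connect(transport);

// List the catalog: the 14 tools above, each with its typed input schema.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

// Provision an agent; the result references an operation to poll with get_operation.
const provision = await client.callTool({
  name: "provision_agent",
  arguments: { business_url: "https://summitortho.com" },
});
console.log(provision.content);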

A2A: The Interop Standard for Agents

While MCP handles tool access for LLMs, A2A handles something different: how agents from different systems communicate with each other when one agent needs to invoke another.

Google's A2A spec (now broadly adopted) defines:

  • Agent cards — structured JSON documents describing what an agent can do, at /.well-known/agent.json
  • Task passing — how one agent sends a structured request to another
  • Authentication — how agents establish trust before interacting
  • Capability negotiation — how a calling agent discovers whether the target supports a given request type before committing to a call

The mental model: A2A is to agents what OpenAPI specs are to REST APIs. Before you call an API, you read the spec. Before an AI agent calls another agent, it reads the agent card.

WFW's A2A implementation:

Every WFW agent gets an agent card automatically at:

GET https://api.workforcewave.com/v2/agents/{agent_id}/.well-known/agent.json

A typical dental agent's card:

{
  "agent_id": "agt_lksd_dental_001",
  "spec_version": "a2a-1.0",
  "business_name": "Lakeside Family Dental",
  "phone_number": "+18435551234",
  "capabilities": {
    "voice": true,
    "structured_requests": true,
    "supported_request_types": [
      "appointment_schedule",
      "appointment_reschedule",
      "appointment_cancel",
      "availability_query",
      "hours_inquiry",
      "insurance_inquiry"
    ],
    "auth_required": true,
    "auth_methods": ["bearer_token", "sip_header"],
    "dual_mode": true,
    "detection_latency_ms": 500
  },
  "structured_request_schema": {
    "appointment_reschedule": {
      "required": ["appointment_id", "patient_id"],
      "optional": ["preferred_window", "flexibility", "authorization_token"]
    }
  },
  "response_format": "json",
  "trust_levels": {
    "registered_callers": "full_access",
    "unregistered_callers": "voice_only"
  }
}

Any A2A-compatible calling agent can fetch this card before dialing. It knows whether structured requests are supported, what request types are available, and what auth mechanism to use — before a single byte of audio is transmitted.
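
In code, that pre-flight check is a single fetch. Here's a sketch of what a calling agent might run before dialing (field names follow the card above; the fallback behavior is up to the caller):

const cardUrl =
  "https://api.workforcewave.com/v2/agents/agt_lksd_dental_001/.well-known/agent.json";
const card = await (await fetch(cardUrl)).json();

const wanted = "appointment_reschedule";
const caps = card.capabilities ?? {};

// Only take the structured path if the card explicitly advertises it.
if (!caps.structured_requests || !caps.supported_request_types?.includes(wanted)) {
  throw new Error(`Agent does not support structured ${wanted} requests`);
}

// Pick an auth method the card lists before placing the call.
const useBearerToken = caps.auth_methods?.includes("bearer_token");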

The Compound Effect

Separately, MCP and A2A are useful. Together, they create something qualitatively different.

Consider a Claude instance with both the WFW MCP server installed and knowledge of the A2A protocol. What it can now do:

  1. Use MCP to provision an agent: a provision_agent tool call with a business URL → agent is live
  2. Use MCP to configure it — update the KB, set policies, configure webhooks
  3. Use scout_research to fetch the agent card — discover the agent's A2A capabilities
  4. Call the agent via A2A — pass structured requests directly, get structured responses

This is a fully autonomous voice workflow, orchestrated by an AI, using two open standards. No human had to click through a dashboard. No custom integration had to be written. The protocols did the work.

A Real Claude Code Session

Here's what a complete autonomous voice agent workflow looks like in a Claude Code session with the WFW MCP server installed:

> Provision a voice agent for Summit Orthodontics at summitortho.com, then
  run a test by scheduling a consultation for a patient named Taylor Reed,
  DOB 1990-08-14, for any available Friday in the next two weeks.

[Claude calls: provision_agent]
→ operation_id: op_xyz123, status: pending, estimated_seconds: 90

[Claude calls: get_operation (polling)]
→ status: active, agent_id: agt_summit_orth_001

[Claude calls: scout_research, agent_id: agt_summit_orth_001]
→ capabilities: {structured_requests: true, supported: ["appointment_schedule", ...]}

[Claude initiates A2A call to the agent's phone number with bearer token]
→ Request: {action: "appointment_schedule", patient: {name: "Taylor Reed", dob: "1990-08-14"}, ...}
→ Response: {status: "confirmed", slot: "2026-04-24T10:00:00", confirmation: "SMT-2026-4411"}

[Claude calls: get_call_transcript, for the just-completed call]
→ Returns full turn-by-turn transcript

Done. Agent provisioned. Test appointment booked. Full transcript available.

That entire workflow — provisioning through live test call — happened without leaving a chat session, without writing API wrapper code, and without any human clicking through a dashboard.

Why This Matters Beyond Convenience

The MCP + A2A combination isn't just a better developer experience. It changes what's possible in agentic systems.

Right now, most multi-agent systems are tightly coupled: agent A knows about agent B because a developer hardcoded that relationship. MCP + A2A make that relationship discoverable and dynamic. An orchestrator can find WFW agents it's never explicitly been told about, read their capabilities, and route tasks to them appropriately.

For enterprise deployments, this means voice AI becomes part of the broader AI tool ecosystem — not a siloed system that only gets used when someone explicitly calls it. An AI that manages operations for a hospital system can discover that each department has a WFW agent, learn what each one supports, and route phone-based tasks automatically based on agent card data.

That's the path where voice infrastructure becomes ambient capability rather than a product you navigate to.
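
Here's a hedged sketch of that discovery loop, using the documented list endpoint and card URL (the auth header and the shape of the /v2/agents response are assumptions):

const base = "https://api.workforcewave.com/v2";
const headers = { Authorization: `Bearer ${process.env.WFW_API_KEY}` };

// Assumed response shape: { agents: [{ id, ... }] }
const { agents } = await (await fetch(`${base}/agents`, { headers })).json();

// Build a routing table: request type -> agent ids whose cards advertise it.
const routes = new Map<string, string[]>();
for (const agent of agents) {
  const card = await (
    await fetch(`${base}/agents/${agent.id}/.well-known/agent.json`)
  ).json();
  for (const type of card.capabilities?.supported_request_types ?? []) {
    routes.set(type, [...(routes.get(type) ?? []), agent.id]);
  }
}

// Route any phone-based task to an agent that handles that request type.
console.log(routes.get("appointment_schedule"));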

Implementation Details for Builders

If you're building a system that will call WFW agents via A2A, the checklist is below, with a minimal end-to-end sketch after it:

Before the call:

  • Fetch the agent card at /.well-known/agent.json
  • Verify structured_requests: true and your request type is in supported_request_types
  • Register your calling system via POST /v2/callers/register to get a pre-negotiated bearer token

During the call:

  • Include X-WFW-Bot-Token in SIP headers with your bearer token
  • Send your structured request in the first message after receiving the bot-mode acknowledgment
  • Parse the response JSON — it will always follow the documented schema for your request type

After the call:

  • Poll GET /v2/calls/{id}/extractions for structured outcomes
  • Subscribe to the call.extractions_ready webhook if you want push notifications instead of polling
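
Here's that checklist as a minimal end-to-end sketch. The endpoints and field names from the checklist are used as documented; the registration payload, the response field for the token, and the call id are assumptions, and the telephony leg itself is out of scope:

const base = "https://api.workforcewave.com/v2";
const agentId = "agt_lksd_dental_001";
const apiKey = process.env.WFW_API_KEY!;

// Before the call: fetch the agent card and confirm the request type is supported.
const card = await (
  await fetch(`${base}/agents/${agentId}/.well-known/agent.json`)
).json();
if (!card.capabilities?.supported_request_types?.includes("appointment_reschedule")) {
  throw new Error("Agent does not accept structured reschedule requests");
}

// Register the calling system for a pre-negotiated bearer token.
// (Payload and response field names here are assumptions.)
const reg = await fetch(`${base}/callers/register`, {
  method: "POST",
  headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
  body: JSON.stringify({ caller_name: "my-orchestrator" }),
});
const { bearer_token } = await reg.json();

// During the call: place the call with X-WFW-Bot-Token set to bearer_token in the SIP
// headers and send the structured request after the bot-mode acknowledgment.
// (The telephony layer is not shown here.)

// After the call: poll for structured outcomes using the call id from your SIP stack.
const callId = "call_abc123"; // hypothetical id
const extractions = await (
  await fetch(`${base}/calls/${callId}/extractions`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  })
).json();
console.log(extractions);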

The full A2A spec for WFW is in the API reference. The MCP server is at npx @workforcewave/mcp-server.


Next in this series: Build a Voice Agent in 90 Seconds with the WFW API — a step-by-step tutorial from zero to deployed agent, no SDK required.

Ready to put AI voice agents to work in your business?

Get a Live Demo — It's Free