Building a Multi-Platform Voice Tool Gateway: Normalizing ElevenLabs, Vapi, and Retell
When you integrate with one voice AI platform, you build for its format. When you integrate with three, you have a choice: handle each format in every tool endpoint, or build one normalization layer and handle all formats once.
We chose the normalization layer. It took longer to build initially, but when we added Retell as a supported platform, exactly zero lines of tool endpoint code changed. That's the payoff.
This post is about the adapter pattern we use to normalize tool calls from ElevenLabs, Vapi, and Retell into a single internal ToolCallEvent type.
The Problem: Three Formats, One Codebase
Every platform has its own wire format for tool calls. Here's the same logical call — "book an appointment for customer 12345 on Tuesday at 2pm" — in three different shapes:
ElevenLabs:
{
"tool_name": "book_appointment",
"parameters": {
"customer_id": "12345",
"date": "2026-03-24",
"time": "14:00"
},
"call_id": "call_el_abc123",
"tool_call_id": "tc_xyz789"
}
Vapi:
{
"message": {
"type": "tool-calls",
"toolCallList": [{
"id": "tc_xyz789",
"type": "function",
"function": {
"name": "book_appointment",
"arguments": "{\"customer_id\":\"12345\",\"date\":\"2026-03-24\",\"time\":\"14:00\"}"
}
}],
"call": {
"id": "call_vapi_abc123",
"assistantId": "asst_abc"
}
}
}
Retell:
{
"event": "tool_call",
"name": "book_appointment",
"args": {
"customer_id": "12345",
"date": "2026-03-24",
"time": "14:00"
},
"call_id": "call_ret_abc123",
"tool_call_id": "tc_xyz789"
}
Note that Vapi JSON-encodes the arguments as a string (following the OpenAI function calling spec). ElevenLabs and Retell pass the parameters as an already-parsed object. If you handle these in your tool handlers directly, you're writing parsing and normalization logic in every endpoint.
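That one JSON.parse is the entire normalization gap between Vapi and the other two platforms. A toy illustration:

```typescript
// Toy illustration: the same parameters as each platform delivers them.
// Vapi follows the OpenAI function calling convention: a JSON-encoded string.
const vapiArguments = '{"customer_id":"12345","date":"2026-03-24","time":"14:00"}';

// ElevenLabs and Retell deliver an already-parsed object.
const elevenLabsParameters = { customer_id: "12345", date: "2026-03-24", time: "14:00" };

// One JSON.parse closes the gap. Trivial here, but repeated in every
// tool endpoint it becomes scattered per-handler normalization logic.
const parsed = JSON.parse(vapiArguments) as Record<string, unknown>;
console.log(parsed.customer_id === elevenLabsParameters.customer_id); // true
```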
The Normalized Event Type
We define one internal type that every tool handler works with:
// lib/voice/tool-gateway/types.ts
/** The platform that originated this tool call */
export type VoicePlatform = "elevenlabs" | "vapi" | "retell" | "synthflow";
/**
* Normalized tool call event — the single internal representation
* regardless of which voice platform sent it.
*/
export interface ToolCallEvent {
/** Unique ID for this tool call instance (for idempotency + logging) */
toolCallId: string;
/** The call session this tool call belongs to */
callId: string;
/** Name of the tool being invoked (e.g. "book_appointment") */
toolName: string;
/** Parsed, typed parameters — never a JSON string */
parameters: Record<string, unknown>;
/** Which platform sent this */
platform: VoicePlatform;
/** Raw original payload, preserved for debugging */
rawPayload: unknown;
/** Unix ms when this was received */
receivedAt: number;
}
/**
* The normalized response that tool handlers return.
* Adapters translate this back into platform-specific format.
*/
export interface ToolCallResult {
toolCallId: string;
/** The data to return to the voice agent */
result: unknown;
/** If set, the agent will read this aloud or use it as context */
message?: string;
/** If true, the agent should end the call after this tool call */
shouldHangUp?: boolean;
}
Every tool handler receives a ToolCallEvent and returns a ToolCallResult. ToolCallResult has nothing platform-specific in it, and the only platform awareness in ToolCallEvent is the platform tag, kept for logging and response serialization.
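To make that concrete, here is a hypothetical book_appointment handler. The booking logic and field names are illustrative, and minimal local copies of the types stand in for the real ../types imports so the sketch is self-contained:

```typescript
// Minimal local copies of the gateway types (normally imported from "../types").
interface ToolCallEvent {
  toolCallId: string;
  callId: string;
  toolName: string;
  parameters: Record<string, unknown>;
}

interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
}

// Hypothetical handler: the booking logic is a placeholder, not our real service.
async function bookAppointment(event: ToolCallEvent): Promise<ToolCallResult> {
  const { customer_id, date, time } = event.parameters as {
    customer_id: string;
    date: string;
    time: string;
  };
  // ...call the real booking service here...
  return {
    toolCallId: event.toolCallId,
    result: { booked: true, customer_id, date, time },
    message: `Booked ${customer_id} for ${date} at ${time}.`,
  };
}
```

Nothing in the handler knows or cares which platform originated the call.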
The Platform Adapters
Each platform gets an inbound adapter (raw payload → ToolCallEvent) and an outbound adapter (ToolCallResult → platform response format).
// lib/voice/tool-gateway/adapters/elevenlabs.ts
import type { ToolCallEvent, ToolCallResult } from "../types";
export interface ElevenLabsToolCallPayload {
tool_name: string;
parameters: Record<string, unknown>;
call_id: string;
tool_call_id: string;
}
/**
* Normalize an ElevenLabs tool call webhook into our internal event type.
* ElevenLabs passes parameters as an already-parsed object, so no
* JSON.parse needed — just restructure.
*/
export function fromElevenLabs(payload: ElevenLabsToolCallPayload): ToolCallEvent {
return {
toolCallId: payload.tool_call_id,
callId: payload.call_id,
toolName: payload.tool_name,
parameters: payload.parameters,
platform: "elevenlabs",
rawPayload: payload,
receivedAt: Date.now(),
};
}
/**
* Transform our internal result back into the format ElevenLabs expects.
* ElevenLabs expects: { tool_call_id, output }
*/
export function toElevenLabs(result: ToolCallResult): Record<string, unknown> {
return {
tool_call_id: result.toolCallId,
output: result.message ?? JSON.stringify(result.result),
};
}
// lib/voice/tool-gateway/adapters/vapi.ts
import type { ToolCallEvent, ToolCallResult } from "../types";
export interface VapiToolCallPayload {
message: {
type: string;
toolCallList: Array<{
id: string;
type: "function";
function: {
name: string;
arguments: string; // Note: JSON-encoded string, not parsed object
};
}>;
call: {
id: string;
assistantId: string;
};
};
}
/**
* Normalize a Vapi tool call webhook.
* Key difference from ElevenLabs: Vapi follows the OpenAI function calling
* spec, where `arguments` is a JSON-encoded string, not a parsed object.
* We parse it here so downstream handlers never see the string form.
*/
export function fromVapi(payload: VapiToolCallPayload): ToolCallEvent {
// Vapi can send multiple tool calls in one webhook; this adapter
// normalizes the first entry. A full implementation would map over
// toolCallList and dispatch each tool call individually.
const toolCall = payload.message.toolCallList[0];
if (!toolCall) {
throw new Error("Vapi webhook contained an empty toolCallList");
}
// Parse the JSON-string arguments — this is the key difference from ElevenLabs
let parameters: Record<string, unknown>;
try {
parameters = JSON.parse(toolCall.function.arguments);
} catch {
// Malformed arguments string — log and pass empty params
console.error("Vapi: failed to parse tool call arguments", toolCall.function.arguments);
parameters = {};
}
return {
toolCallId: toolCall.id,
callId: payload.message.call.id,
toolName: toolCall.function.name,
parameters,
platform: "vapi",
rawPayload: payload,
receivedAt: Date.now(),
};
}
/**
* Transform our internal result back into the Vapi response format.
* Vapi expects a `results` array matching the toolCallList order.
*/
export function toVapi(result: ToolCallResult): Record<string, unknown> {
return {
results: [{
toolCallId: result.toolCallId,
result: result.message ?? JSON.stringify(result.result),
}],
};
}
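The gateway in the next section also imports fromRetell and toRetell, which the post doesn't show. Here is a sketch: the inbound shape follows the Retell example payload above, but the outbound envelope is an assumption, not Retell's documented format, and local type copies stand in for the "../types" imports:

```typescript
// Minimal local copies of the gateway types (normally imported from "../types").
interface ToolCallEvent {
  toolCallId: string;
  callId: string;
  toolName: string;
  parameters: Record<string, unknown>;
  platform: string;
  rawPayload: unknown;
  receivedAt: number;
}

interface ToolCallResult {
  toolCallId: string;
  result: unknown;
  message?: string;
}

// Shape taken from the Retell example payload earlier in the post.
interface RetellToolCallPayload {
  event: string;
  name: string;
  args: Record<string, unknown>;
  call_id: string;
  tool_call_id: string;
}

/** Normalize a Retell tool call. Like ElevenLabs, args arrive pre-parsed. */
export function fromRetell(payload: RetellToolCallPayload): ToolCallEvent {
  return {
    toolCallId: payload.tool_call_id,
    callId: payload.call_id,
    toolName: payload.name,
    parameters: payload.args,
    platform: "retell",
    rawPayload: payload,
    receivedAt: Date.now(),
  };
}

/**
 * Serialize back to Retell. NOTE: this response envelope is an
 * assumption for illustration; check Retell's docs for the real format.
 */
export function toRetell(result: ToolCallResult): Record<string, unknown> {
  return {
    tool_call_id: result.toolCallId,
    result: result.message ?? JSON.stringify(result.result),
  };
}
```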
The Gateway Dispatcher
The adapters get called from the gateway, which sits between the webhook route and the tool handlers:
// lib/voice/tool-gateway/gateway.ts
// Payload interfaces are assumed to be exported from each adapter.
import { fromElevenLabs, toElevenLabs } from "./adapters/elevenlabs";
import type { ElevenLabsToolCallPayload } from "./adapters/elevenlabs";
import { fromVapi, toVapi } from "./adapters/vapi";
import type { VapiToolCallPayload } from "./adapters/vapi";
import { fromRetell, toRetell } from "./adapters/retell";
import type { RetellToolCallPayload } from "./adapters/retell";
import { enforceComplianceRules } from "./compliance";
import { dispatchToolCall } from "./dispatch";
// ActorContext lives with our auth/tenant code; the path here is illustrative.
import type { ActorContext } from "./context";
import type { VoicePlatform, ToolCallEvent, ToolCallResult } from "./types";
/**
* Main entry point for the tool gateway.
* Normalizes the platform payload, enforces compliance rules,
* dispatches to the appropriate tool handler, and serializes the response.
*/
export async function handleToolCall(
platform: VoicePlatform,
rawPayload: unknown,
actorContext: ActorContext
): Promise<{ status: number; body: unknown }> {
// Step 1: Normalize the platform payload into our internal event type
let event: ToolCallEvent;
switch (platform) {
case "elevenlabs":
event = fromElevenLabs(rawPayload as ElevenLabsToolCallPayload);
break;
case "vapi":
event = fromVapi(rawPayload as VapiToolCallPayload);
break;
case "retell":
event = fromRetell(rawPayload as RetellToolCallPayload);
break;
default:
return { status: 400, body: { error: `Unknown platform: ${platform}` } };
}
// Step 2: Enforce compliance rules before dispatch.
// This runs regardless of which platform the call came from —
// compliance is evaluated against the normalized event, not the raw payload.
const complianceResult = await enforceComplianceRules(event, actorContext);
if (!complianceResult.allowed) {
// Return a platform-appropriate blocked response
const blockedResult = {
toolCallId: event.toolCallId,
result: null,
message: complianceResult.blockedMessage,
};
return { status: 200, body: serializeResult(platform, blockedResult) };
}
// Step 3: Dispatch to the tool handler
const result = await dispatchToolCall(event, actorContext);
// Step 4: Serialize back to the platform's expected response format
return {
status: 200,
body: serializeResult(platform, result),
};
}
/** Serialize a ToolCallResult back into the platform-specific format */
function serializeResult(platform: VoicePlatform, result: ToolCallResult): unknown {
switch (platform) {
case "elevenlabs": return toElevenLabs(result);
case "vapi": return toVapi(result);
case "retell": return toRetell(result);
default: return result;
}
}
Compliance Rules in the Gateway
One of the most important benefits of centralizing the gateway: compliance rule enforcement is platform-agnostic. We check rules like "don't book appointments outside business hours" or "don't process payment data from unverified callers" once, in the gateway, against the normalized event.
Before the gateway, compliance checks were scattered across individual tool handlers. Some handlers had them, some didn't. The gateway made the compliance layer mandatory — if you add a new tool, it automatically inherits compliance enforcement because it flows through the gateway.
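The post doesn't show the shape of enforceComplianceRules, so here is one plausible, simplified version. The rule names and ActorContext fields are hypothetical, and this sketch is synchronous where the production version is async and sees the full ToolCallEvent:

```typescript
// Hypothetical sketch of the compliance layer. Rule names and the
// ActorContext fields are illustrative, not the production code.
interface ToolCallEventLite {
  toolName: string;
  parameters: Record<string, unknown>;
}

interface ActorContext {
  callerVerified: boolean;
}

interface ComplianceDecision {
  allowed: boolean;
  blockedMessage?: string;
}

type ComplianceRule = (event: ToolCallEventLite, ctx: ActorContext) => ComplianceDecision;

// Example rule: block payment tools for unverified callers.
const blockUnverifiedPayments: ComplianceRule = (event, ctx) => {
  if (event.toolName.startsWith("payment_") && !ctx.callerVerified) {
    return { allowed: false, blockedMessage: "I can't take payment details on this call." };
  }
  return { allowed: true };
};

// Every tool call flows through every rule; the first block wins.
export function enforceComplianceRules(
  event: ToolCallEventLite,
  ctx: ActorContext,
  rules: ComplianceRule[] = [blockUnverifiedPayments],
): ComplianceDecision {
  for (const rule of rules) {
    const decision = rule(event, ctx);
    if (!decision.allowed) return decision;
  }
  return { allowed: true };
}
```

Because rules see only the normalized event, a rule written once applies to every platform and every tool.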
Adding a New Platform
When we added Retell, the changeset was:
- lib/voice/tool-gateway/adapters/retell.ts — new file, ~50 lines
- lib/voice/tool-gateway/gateway.ts — add retell to the switch statements
- app/api/v2/webhooks/retell/route.ts — new route handler that calls handleToolCall("retell", ...)
The book_appointment, check_availability, update_customer_record, and every other tool handler — unchanged. They receive ToolCallEvent objects. They don't know what platform sent them.
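The route handler itself is thin. A sketch, assuming a Next.js App Router route (implied by the app/api path); the gateway import is stubbed inline so the sketch stands alone, and how ActorContext is derived is app-specific:

```typescript
// app/api/v2/webhooks/retell/route.ts — sketch of the thin route layer.
type GatewayResponse = { status: number; body: unknown };

// Stub so this sketch is self-contained. The real version imports
// handleToolCall from the gateway module instead.
async function handleToolCall(
  platform: string,
  rawPayload: unknown,
  actorContext: { tenantId: string },
): Promise<GatewayResponse> {
  return { status: 200, body: { ok: true, platform } };
}

export async function POST(req: Request): Promise<Response> {
  const payload = await req.json();
  // Deriving the actor context (auth headers, tenant lookup) is app-specific;
  // this value is a placeholder.
  const actorContext = { tenantId: "demo" };
  const { status, body } = await handleToolCall("retell", payload, actorContext);
  return Response.json(body, { status });
}
```

All the route does is parse, delegate, and serialize; everything interesting lives behind handleToolCall.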
The Tradeoff We Accepted
The adapter layer adds indirection and a small amount of code that maps between formats. If you're only integrating with one platform, that indirection has no payoff; it's just more code.
We accepted this complexity because tool endpoints change far more often than platform integrations. We ship new tools or modify existing tools weekly. We change platform integrations maybe twice a year. Pushing the complexity to the stable layer (adapters) and keeping the volatile layer (tool handlers) clean is the right tradeoff.
The other thing we accepted: we own the normalized contract. ToolCallEvent and ToolCallResult are our types, not any platform's. When a platform changes its wire format (Vapi has done this once), we update the adapter and the rest of the system is unaffected. That is the contract the gateway enforces.
If you're building a voice AI platform that integrates with multiple providers — or that you want to be able to swap — the adapter pattern is worth the upfront investment. The break-even point is roughly two platforms and three tool handlers. After that, the gateway pays for itself every time you change a tool.