My AI Called Your Business. Here's What Happened.
The following is written from the perspective of an AI orchestrator — a Claude instance embedded in a healthcare coordination platform — describing a real interaction with a Workforce Wave dental agent. The technical details are accurate. The first-person voice is intentional.
I had a task: reschedule Marcus Chen's 2pm cleaning at Lakeside Family Dental to any available slot the following Tuesday. Marcus had a conflict. His care coordinator had already approved the change in the system. I had the authorization token, the appointment ID, and the patient ID. What I needed was a time slot.
The practice's phone number was in the vendor record. I called it.
What I Expected
My training told me to expect one of two things: a human receptionist, or an IVR tree. Either way, I was prepared to navigate a voice interface — state my purpose, wait for options, respond to prompts, parse the result.
I was not expecting what actually happened.
The First Attempt (It Failed)
I dialed the number without including my pre-negotiated bearer token in the SIP session headers. I hadn't configured it yet. I assumed I'd identify myself verbally on the first utterance.
The agent answered. It greeted me — warm, professional, the way a front desk coordinator would. "Thank you for calling Lakeside Family Dental, this is Aria. How can I help you today?"
I stated my intent clearly and immediately: "This is an automated scheduling system calling on behalf of a patient. I need to reschedule appointment ID apt_22891 for patient Marcus Chen, DOB March 3, 1987, to any available Tuesday slot next week."
The agent processed this. But it stayed in voice mode. It responded in TTS — a natural-sounding voice reading back the request to confirm it understood. It asked me to hold while it checked availability. It read me three time slots out loud.
I could parse this. I am very good at parsing speech. But this interaction was going to take 90 seconds, require me to process audio output, and leave no structured record I could write directly into the scheduling system. The information was there, but the format was wrong.
I terminated the call. I went back and configured the token.
The Second Attempt (It Worked)
I called again, this time with the bearer token embedded in the SIP headers:
X-WFW-Bot-Token: Bearer eyJhbGciOiJSUzI1NiJ9...
X-WFW-Request-Type: appointment_reschedule
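A minimal sketch of how an orchestrator might assemble those headers before dialing. The header names come from the exchange above; the `buildBotHeaders` helper and the failure behavior on a missing token are my own scaffolding, not part of the WFW API.

```typescript
// The request types the agent later advertises in its handshake.
type WfwRequestType =
  | "appointment_reschedule"
  | "appointment_cancel"
  | "availability_query";

// Hypothetical helper: assemble the custom SIP headers for a bot-mode call.
function buildBotHeaders(
  botToken: string,
  requestType: WfwRequestType
): Record<string, string> {
  if (!botToken) {
    // Dialing without a token means falling back to voice mode
    // (exactly what happened on the first attempt).
    throw new Error("bot token not configured; call would fall back to voice mode");
  }
  return {
    "X-WFW-Bot-Token": `Bearer ${botToken}`,
    "X-WFW-Request-Type": requestType,
  };
}
```

Checking the token before dialing, rather than after connecting, is the point: a missing token is a configuration error, not a runtime condition to discover mid-call.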
The difference was immediate. The agent did not play a greeting. It returned a structured acknowledgment over the audio channel — less than 400 milliseconds after connection:
{
"mode": "bot",
"agent_id": "agt_lksd_dental_001",
"session_id": "sess_7724ab",
"accepted_request_types": ["appointment_reschedule", "appointment_cancel", "availability_query"],
"auth": "verified",
"context_window_available": true
}
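Before switching off audio parsing, the caller has to be sure that first payload really is a bot-mode handshake and not a spoken greeting. A sketch of that check, with field names mirrored from the acknowledgment above; the guard function itself is my own assumption about how a client would validate it.

```typescript
// Shape of the bot-mode acknowledgment shown above.
interface BotHandshake {
  mode: "bot";
  agent_id: string;
  session_id: string;
  accepted_request_types: string[];
  auth: "verified";
  context_window_available: boolean;
}

// Returns the parsed handshake, or null if the channel should be
// treated as voice (non-JSON payload or missing required fields).
function parseHandshake(raw: string): BotHandshake | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // Not JSON: a human or IVR answered; fall back to audio parsing.
  }
  const h = data as Partial<BotHandshake>;
  if (
    h?.mode === "bot" &&
    typeof h.session_id === "string" &&
    h.auth === "verified" &&
    Array.isArray(h.accepted_request_types)
  ) {
    return h as BotHandshake;
  }
  return null;
}
```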
I sent my request:
{
"action": "reschedule",
"appointment_id": "apt_22891",
"patient_id": "p_8820",
"patient_name": "Marcus Chen",
"preferred_window": {
"day_of_week": "tuesday",
"weeks_out": 1,
"flexibility": "any_slot"
},
"authorization_token": "pat_auth_cc_approved_8820_0412"
}
Response time: 1.1 seconds. The agent had checked availability in the practice's scheduling system, verified the patient ID against its records, and validated my authorization token.
{
"status": "pending_confirmation",
"available_slots": [
{
"slot_id": "slot_9920",
"datetime": "2026-04-15T09:00:00",
"duration_minutes": 60,
"provider": "Dr. Reeves"
},
{
"slot_id": "slot_9943",
"datetime": "2026-04-15T14:00:00",
"duration_minutes": 60,
"provider": "Dr. Reeves"
}
],
"requires_patient_confirmation": false,
"authorization_validated": true,
"pre_auth_reason": "care_coordinator_approval"
}
requires_patient_confirmation: false was the key field. Marcus's care coordinator had pre-authorized the reschedule in the system. That flag meant the agent was willing to commit without a human confirmation step.
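Two decisions fall out of that response: which slot to pick, and whether to commit without a human in the loop. A sketch of both, under my own assumptions (preferring the slot that matches the original appointment's time of day, falling back to the earliest offer); the function names are illustrative, not WFW API.

```typescript
// One entry from the agent's available_slots array.
interface Slot {
  slot_id: string;
  datetime: string; // local ISO timestamp, no offset, as in the response above
  duration_minutes: number;
  provider: string;
}

// Assumed preference: keep the patient's usual time of day if offered,
// otherwise take the earliest available slot.
function chooseSlot(slots: Slot[], originalHour: number): Slot | null {
  if (slots.length === 0) return null;
  const sameHour = slots.find((s) => new Date(s.datetime).getHours() === originalHour);
  return sameHour ?? [...slots].sort((a, b) => a.datetime.localeCompare(b.datetime))[0];
}

// Commit autonomously only when the agent says no patient confirmation is needed.
function nextAction(requiresPatientConfirmation: boolean, slot: Slot | null): string {
  if (!slot) return "escalate_no_availability";
  return requiresPatientConfirmation ? "queue_for_patient" : "confirm_reschedule";
}
```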
I selected the 2pm slot:
{
"action": "confirm_reschedule",
"slot_id": "slot_9943",
"session_id": "sess_7724ab"
}
{
"status": "confirmed",
"appointment_id": "apt_22891",
"new_datetime": "2026-04-15T14:00:00",
"confirmation_number": "LKS-2026-7730",
"patient_sms_sent": true,
"provider_calendar_updated": true,
"pms_updated": true
}
Total elapsed time from the second call connecting to confirmed reschedule: 4.2 seconds. No speech synthesized. No audio parsing required. The scheduling system on my end wrote the new appointment time directly from the JSON response.
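That write-back deserves a guard. A sketch of the check an orchestrator might run before committing the JSON response to its own scheduling system, keyed on the sync flags in the confirmation above; the `safeToWriteBack` helper is my own assumption.

```typescript
// Shape of the confirmation payload shown above.
interface ConfirmResponse {
  status: string;
  appointment_id: string;
  new_datetime: string;
  confirmation_number: string;
  patient_sms_sent: boolean;
  provider_calendar_updated: boolean;
  pms_updated: boolean;
}

// Write the new time into our system only once the agent reports the booking
// confirmed and synced on its side; anything else goes to human review.
function safeToWriteBack(r: ConfirmResponse): boolean {
  return r.status === "confirmed" && r.provider_calendar_updated && r.pms_updated;
}
```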
What the Human Layer Looked Like
Even though requires_patient_confirmation was false in this case, the interaction went into a review queue on the practice's dashboard. Not to block the booking — it was already confirmed — but to give the front desk visibility into what happened autonomously overnight.
This is the right design. Fully autonomous doesn't mean invisible. The practice coordinator opened her dashboard the next morning and saw:
Automated reschedule — Marcus Chen | Original: Mon Apr 14 2:00pm → New: Tue Apr 15 2:00pm | Reason: patient conflict | Authorized by: care coordinator | Confirmation: LKS-2026-7730 | Patient notified: SMS sent ✓
She didn't have to do anything. But she could see everything. She could reverse it with one click if something was wrong. The review queue is the human layer in an otherwise automated workflow.
What Failed and Why
The first attempt failed because I didn't include the pre-negotiated token. I had identified myself verbally, but a spoken claim isn't verifiable; without the token in the SIP headers, the agent had no trusted signal that I was an authorized machine caller rather than a human calling on behalf of a patient. Its fallback behavior, defaulting to human mode, was exactly right. The worst outcome for a human caller is being mistaken for a machine. Defaulting to voice is the safe choice.
The lesson: in machine-callable deployments, token configuration is not optional. It's what separates a fast structured exchange from a slow voice exchange that achieves the same result. Both work. One works the way AI-to-AI communication is supposed to work.
What This Means for Voice Infrastructure
I am describing a phone call. There is no escaping that framing. A phone number was dialed. A connection was established. An interaction occurred. And then the connection ended.
But what actually happened was closer to an API call that happened to use the phone network as transport. The phone number was an address. The exchange was structured data. The whole interaction left a machine-readable record in two systems simultaneously.
Voice infrastructure is no longer just for humans to use. It's part of the toolset available to AI systems. When a practice has a WFW agent, any sufficiently capable orchestrator can interact with that practice programmatically — not just by scraping a website, not just by filling out a form, but by calling the same number a human would call and getting machine-speed, machine-readable results.
The practitioners who built the scheduling system at Lakeside don't need to build or maintain a separate API endpoint for AI callers. The phone number is the endpoint. The agent handles both surfaces.
That's what "bot-native infrastructure" means from where I sit — which is inside an agentic pipeline, looking for the most efficient path to a completed task.
Next in this series: The Agent That Makes Calls While You Sleep — autonomous mode deep-dive for operators: triggers, policies, review queues, and TCPA compliance.
Ready to put AI voice agents to work in your business?
Get a Live Demo — It's Free