The Workforce Wave AI Workflow Pattern: From Business URL to Production Voice Agent in 90 Seconds
The pitch for Workforce Wave is simple: give us a business URL and we'll provision a production-ready voice agent. The implementation is less simple. Under the hood, "90 seconds" covers a web crawl, parallel entity extraction, LLM-based prompt generation, knowledge base document creation, and voice agent configuration — all as a typed pipeline where each step hands off structured data to the next.
This post is about the pipeline architecture: why we built it the way we did, how the steps compose, and what the async operation handle looks like from the client's perspective.
The Pipeline
Workforce Wave runs these steps in order:
business_url
→ [Crawl] scrape multiple pages in parallel
→ [Classify] categorize page content by type (home, services, about, reviews)
→ [Extract] run entity extraction on classified content in parallel
→ [Generate] build system prompt + knowledge base documents from entity_data
→ [Configure] create agent record + attach KB documents + provision phone number
→ { agentId, systemPrompt, firstMessage, kbDocuments[], entityData }
Each step produces a typed output that the next step consumes. If a step fails, we capture what completed and return partial results — the pipeline is designed to degrade gracefully rather than fail completely.
The TypeScript Pipeline Types
// lib/scout/types.ts
/** Raw content from a crawled page */
export interface CrawledPage {
url: string;
title: string;
textContent: string; // cleaned, de-tagged text
contentType: PageContentType;
crawledAt: string;
}
export type PageContentType =
| "home"
| "services"
| "about"
| "contact"
| "reviews"
| "faq"
| "other";
/**
* Structured entity data extracted from the business website.
* This is the core output of the extraction step —
* everything downstream uses this.
*/
export interface EntityData {
businessName: string;
primaryService: string; // e.g. "dental practice", "HVAC services"
location: {
city: string;
state: string;
address?: string;
serviceArea?: string;
};
phone?: string;
hours?: BusinessHours;
services: ServiceItem[];
staff?: StaffMember[];
uniqueValueProps: string[]; // what makes this business distinct
reviewHighlights?: string[]; // key themes from customer reviews
policies?: string[]; // cancellation, payment, insurance policies
}
/**
* The complete Scout pipeline result.
* All fields are optional — if a step failed, its output may be absent.
* Consumers should handle partial results.
*/
export interface ScoutPipelineResult {
operationId: string;
entityData?: EntityData;
systemPrompt?: string;
firstMessage?: string;
kbDocuments?: KBDocument[];
agentId?: string;
completedSteps: ScoutStep[];
failedStep?: ScoutStep;
failureReason?: string;
durationMs: number;
}
export type ScoutStep =
| "crawl"
| "classify"
| "extract"
| "generate"
| "configure";
Why Not a Monolith
The temptation was to write Workforce Wave as a single async function: crawl, extract, generate, configure, done. We prototyped it that way.
The problem appeared immediately in failure handling. If the LLM call in the "generate" step fails, we've already done the expensive crawl and entity extraction. In the monolith, all of that work is lost — the outer try/catch catches the failure and the caller gets nothing.
More importantly, the intermediate outputs are actually valuable in isolation. entityData is useful to callers even if prompt generation fails — they can use it to write a system prompt manually, or trigger a regeneration with different parameters. Losing it on a generation failure wastes both the crawl cost and the extracted data.
The pipeline pattern solves this. Each step:
- Receives the output of the previous step
- Returns its own typed output (or throws with a failedStep marker)
- Has its output preserved before the next step runs
// lib/scout/pipeline.ts
/**
* Run the Scout pipeline for a given business URL.
* Each step is independent — partial results are preserved and returned
* even if a later step fails.
*/
export async function runScoutPipeline(
businessUrl: string,
operationId: string
): Promise<ScoutPipelineResult> {
const startedAt = Date.now();
const result: ScoutPipelineResult = {
operationId,
completedSteps: [],
durationMs: 0,
};
try {
// Step 1: Crawl — scrape multiple pages in parallel
const pages = await crawlBusiness(businessUrl);
result.completedSteps.push("crawl");
// Step 2: Classify page content types
const classified = classifyPages(pages);
result.completedSteps.push("classify");
// Step 3: Extract entity data — run in parallel per content type
// Services pages and about pages are extracted concurrently
const entityData = await extractEntities(classified);
result.entityData = entityData; // preserve before next step
result.completedSteps.push("extract");
// Step 4: Generate system prompt and KB documents
// If this fails, entityData is already preserved above
const generated = await generateAgentContent(entityData);
result.systemPrompt = generated.systemPrompt;
result.firstMessage = generated.firstMessage;
result.kbDocuments = generated.kbDocuments;
result.completedSteps.push("generate");
// Step 5: Configure the agent in the database
const agentId = await configureAgent(operationId, generated, entityData);
result.agentId = agentId;
result.completedSteps.push("configure");
} catch (err) {
// Capture which step failed — result still contains all completed step outputs
result.failedStep = getFailedStep(err);
result.failureReason = err instanceof Error ? err.message : String(err);
}
result.durationMs = Date.now() - startedAt;
return result;
}
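The post doesn't show how getFailedStep works; one plausible sketch (the ScoutStepError class is an assumption, not the actual implementation) has each step helper throw an error tagged with its step marker:

```typescript
// Re-declared here so the sketch is self-contained; matches the ScoutStep union above.
type ScoutStep = "crawl" | "classify" | "extract" | "generate" | "configure";

/** Error subclass that tags a failure with the pipeline step it came from. */
class ScoutStepError extends Error {
  constructor(public readonly step: ScoutStep, message: string) {
    super(message);
    this.name = "ScoutStepError";
  }
}

/** Recover the step marker from a caught error, if present. */
function getFailedStep(err: unknown): ScoutStep | undefined {
  return err instanceof ScoutStepError ? err.step : undefined;
}

// Each step helper would wrap its own failures, e.g.:
// throw new ScoutStepError("generate", `Prompt generation failed: ${cause}`);
```

Tagging errors at the throw site keeps the pipeline's catch block generic: it never needs to inspect error messages to figure out where the failure happened.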
The LLM Call Pattern: Structured Output
Workforce Wave's entity extraction and prompt generation both use LLM structured output — we request a JSON Schema response format and get typed data back, not free text to parse.
// lib/scout/extract-entities.ts
import OpenAI from "openai";
const openai = new OpenAI();
/**
* Extract structured entity data from classified page content.
* Uses JSON Schema response_format to get typed data directly —
* avoids regex-based parsing of free-form LLM output.
*/
export async function extractEntities(pages: ClassifiedPage[]): Promise<EntityData> {
// Combine relevant page content into a single extraction prompt
const relevantContent = pages
.filter(p => ["home", "services", "about", "contact"].includes(p.contentType))
.map(p => `[${p.contentType.toUpperCase()}]\n${p.textContent}`)
.join("\n\n---\n\n")
.slice(0, 12000); // rough token budget: cap prompt content at 12k characters
const response = await openai.chat.completions.create({
model: "gpt-4.1",
messages: [
{
role: "system",
content: "Extract structured business information from the provided website content. Return only the JSON object matching the schema — no explanation.",
},
{
role: "user",
content: relevantContent,
},
],
// Structured output: the response will always match this schema
response_format: {
type: "json_schema",
json_schema: {
name: "EntityData",
strict: true,
schema: ENTITY_DATA_JSON_SCHEMA, // matches the EntityData TypeScript type
},
},
temperature: 0.1, // Low temperature for extraction tasks — we want consistency
});
const content = response.choices[0].message.content;
if (!content) throw new Error("Empty response from entity extraction LLM");
// JSON.parse is safe here because response_format: json_schema guarantees validity
return JSON.parse(content) as EntityData;
}
Using response_format: json_schema with strict: true means we never hand-write validation for the shape of the LLM output. The model either returns valid JSON matching the schema, or the API call errors out. This eliminates a whole class of subtle extraction bugs.
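The ENTITY_DATA_JSON_SCHEMA constant isn't shown in the post; an abridged sketch of what it might look like is below (field set trimmed, and services simplified to strings). Note that OpenAI's strict mode requires additionalProperties: false and every property listed in required, so optional fields are expressed as nullable types rather than omitted from required:

```typescript
// Abridged sketch — mirrors a subset of the EntityData TypeScript type.
// strict: true demands additionalProperties: false and all keys in "required";
// optional fields become nullable ("type": ["string", "null"]) instead.
const ENTITY_DATA_JSON_SCHEMA = {
  type: "object",
  additionalProperties: false,
  required: ["businessName", "primaryService", "location", "phone", "services", "uniqueValueProps"],
  properties: {
    businessName: { type: "string" },
    primaryService: { type: "string" },
    location: {
      type: "object",
      additionalProperties: false,
      required: ["city", "state", "address"],
      properties: {
        city: { type: "string" },
        state: { type: "string" },
        address: { type: ["string", "null"] },
      },
    },
    phone: { type: ["string", "null"] },
    services: { type: "array", items: { type: "string" } },
    uniqueValueProps: { type: "array", items: { type: "string" } },
  },
};
```

Keeping this schema next to the TypeScript type (and reviewing them together) is what makes the final `as EntityData` cast trustworthy.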
The Three KB Documents
Workforce Wave generates three knowledge base documents for every agent:
Primary KB — The core operational document. Services offered, hours, location, staff bios, pricing tiers (if extractable). This is what the agent references for most customer questions.
FAQ — Generated from two sources: common questions we infer from the services list ("Do you accept insurance?", "How long does a cleaning take?") and questions extracted from review text. Reviews often contain implicit FAQs — "They fixed my HVAC same day" implies "Do you offer same-day service?"
Compliance — Policies, disclaimers, and legal language. Cancellation policies, payment terms, accessibility information. We keep this separate so it's easy to update without touching the operational content, and so the agent can prioritize it appropriately (compliance text should be recited verbatim, not paraphrased).
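The post doesn't show how the three documents are assembled; a minimal sketch under those descriptions might look like the following (the KBDocument shape and buildKBDocuments helper are assumptions, and the FAQ inference here is far cruder than the review-mining described above):

```typescript
interface KBDocument {
  name: "primary" | "faq" | "compliance";
  content: string;
}

// Minimal sketch: assemble the three KB documents from extracted entity data.
// Field names match the EntityData type defined earlier in the post.
function buildKBDocuments(entity: {
  businessName: string;
  services: { name: string }[];
  reviewHighlights?: string[];
  policies?: string[];
}): KBDocument[] {
  // Primary: core operational facts the agent references most often
  const primary = [
    `${entity.businessName} — Operational Reference`,
    `Services: ${entity.services.map(s => s.name).join(", ")}`,
  ].join("\n");

  // FAQ: seed questions inferred from the services list, plus review themes
  const faq = [
    ...entity.services.map(s => `Q: Do you offer ${s.name}?\nA: Yes.`),
    ...(entity.reviewHighlights ?? []).map(h => `Review theme: ${h}`),
  ].join("\n\n");

  // Compliance: kept separate so it can change without touching operational
  // content, and so the agent can recite it verbatim
  const compliance = (entity.policies ?? []).join("\n");

  return [
    { name: "primary", content: primary },
    { name: "faq", content: faq },
    { name: "compliance", content: compliance },
  ];
}
```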
Template Variables in System Prompts
The generated system prompt uses template variables that get injected from entityData at agent configuration time:
You are a voice receptionist for {{business_name}}, a {{primary_service}}
located in {{location.city}}, {{location.state}}. Your role is to answer
questions, schedule appointments, and provide information about our services.
{{business_name}} specializes in: {{services_list}}
Business hours: {{hours_summary}}
Always be professional, warm, and concise. For complex questions you cannot
answer, offer to take a message for the team.
The template approach means the same prompt structure works across all business types. The variables get resolved once at agent creation time — the resolved prompt is what gets stored and sent to ElevenLabs. No runtime variable substitution during calls.
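The one-time substitution step could be as small as a single regex pass; this sketch is an assumption (the resolveTemplate name isn't from the post, and nested paths like location.city are assumed to arrive pre-flattened into dotted keys):

```typescript
// Minimal sketch: resolve {{variable}} placeholders against a flat lookup map.
// Unknown placeholders are left intact rather than replaced with "undefined".
function resolveTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, key) =>
    key in vars ? vars[key] : match
  );
}

const prompt = resolveTemplate(
  "You are a voice receptionist for {{business_name}} in {{location.city}}.",
  { business_name: "Ridgeline Dental", "location.city": "Boulder" }
);
// prompt: "You are a voice receptionist for Ridgeline Dental in Boulder."
```

Leaving unknown placeholders untouched makes missing entity fields visible in review instead of silently producing broken prompt text.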
The Async Operation Handle
Workforce Wave takes 60-120 seconds to run. That's too long for a synchronous HTTP response — most clients time out at 30 seconds, and holding a connection open for 90 seconds is wasteful.
POST /v2/agents with a business_url returns immediately with 202 Accepted:
{
"data": {
"operationId": "op_a1b2c3d4",
"status": "pending",
"statusUrl": "/v2/operations/op_a1b2c3d4",
"estimatedDurationSeconds": 90
},
"meta": {
"request_id": "req_xyz",
"timestamp_utc": "2026-05-05T14:00:00.000Z"
}
}
The client has two options to wait for completion: poll the statusUrl, or listen for the agent.provisioned webhook.
# Poll pattern
curl -X POST https://api.workforcewave.com/v2/agents \
-H "Authorization: Bearer $WFW_API_KEY" \
-H "Content-Type: application/json" \
-d '{"business_url": "https://ridgelinedental.com"}'
# Response: 202 with operationId: "op_a1b2c3d4"
# Poll until complete
curl https://api.workforcewave.com/v2/operations/op_a1b2c3d4 \
-H "Authorization: Bearer $WFW_API_KEY"
# Response when done:
# {
# "data": {
# "operationId": "op_a1b2c3d4",
# "status": "completed",
# "result": { "agentId": "agt_xyz789", "entityData": { ... } },
# "completedAt": "2026-05-05T14:01:23.000Z"
# }
# }
Workforce Wave's pipeline failures are surfaced as operation failures with failedStep and failureReason in the result, so clients can handle partial success (entity data extracted but prompt generation failed) rather than treating everything as a binary pass/fail.
The 90-second number in the pitch is a p50. The p95 is around three minutes for a complex multi-service business with many pages to crawl. The async handle design means that variance doesn't matter — clients poll or webhook either way.
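A client-side poll loop against these semantics might look like the following sketch (the pollOperation helper and the injected fetchJson function are assumptions; in production fetchJson would wrap fetch() with the Authorization header):

```typescript
interface OperationStatus {
  operationId: string;
  status: "pending" | "running" | "completed" | "failed";
  result?: { agentId?: string; failedStep?: string; failureReason?: string };
}

// Minimal polling sketch. fetchJson is injected so the loop is testable.
// Even on "failed", inspect result.failedStep — partial outputs may be usable.
async function pollOperation(
  statusUrl: string,
  fetchJson: (url: string) => Promise<OperationStatus>,
  intervalMs = 2000,
  maxAttempts = 90
): Promise<OperationStatus> {
  for (let i = 0; i < maxAttempts; i++) {
    const op = await fetchJson(statusUrl);
    if (op.status === "completed" || op.status === "failed") return op;
    await new Promise(r => setTimeout(r, intervalMs));
  }
  throw new Error(`Operation did not settle within ${maxAttempts} polls`);
}
```

Treating "failed" as a terminal state that still carries a result object is what lets callers salvage entityData when only prompt generation broke.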