UltisAI

Custom AI agents are business-specific—not generic. A plumbing company's AI agent needs to identify emergencies; a salon's needs to match stylist availability; a cleaning service's needs to handle scope variation. This guide walks through building a custom AI agent tailored to your business workflows, integrations, and customer expectations.

The Build vs. Buy Decision

You have two paths: buy a pre-built agent and adapt it to your business, or build a custom agent from the ground up.

Buy (SaaS): Faster to launch (days), lower technical friction, but limited customization. You're constrained by the vendor's features and roadmap. Good for: businesses that fit the vendor's template.

Build (Custom): Slower to launch (weeks), but fully customized to your workflows. You own the agent logic, integrations, and user experience. Good for: businesses with unique processes or compliance needs.

Most service businesses should start with Buy, then move to Build once you understand what you need. This guide assumes you're ready to Build.

Phase 1: Map Your Business Logic

Before writing code, document your business flow:

Inbound call entry: What happens when a customer calls? Do you answer immediately, route to a queue, or transfer to a human?
Qualification questions: What do you need to know about each call? Write out 5-10 questions in priority order.
Decision logic: Based on answers, what action happens next? (Schedule appointment, transfer to human, send follow-up email, etc.)
Integrations: Where does call data go? CRM? Calendar? Billing system? Payment processor?
Escalation rules: When does the AI hand off to a human? (Angry tone, complex question, customer requests it, etc.)
Fallback behavior: If integration fails, what's the agent's behavior? (Leave voicemail? Send email? Pause and retry?)

Phase 2: Choose Your Tech Stack

Voice provider: Twilio (most flexible), Vonage, OpenAI Realtime API, or others. These handle inbound calls and give you access to the audio stream.

LLM: Claude, GPT-4, or open-source Llama. Larger models are smarter but costlier. Streaming is critical—latency matters on calls.

Orchestration: Write a backend service (Node.js, Python, Go) that wires together voice input → LLM → integrations → voice output. Use request libraries (axios, requests) to hit your integrations.

Storage: Postgres (call logs, customer data), Redis (cache for real-time state). Store full call transcripts and summaries for audit trails and training.

Monitoring: Sentry (error tracking), Datadog (performance), CloudWatch (logs). On live calls, latency above 1 second is noticeable; above 2 seconds is bad.

Phase 3: Build the Agent Loop

The core loop is simple:

Receive audio chunk from caller
Transcribe to text (STT)
Pass to LLM with context (conversation history, business rules, customer data)
Get text response from LLM
Synthesize to speech (TTS)
Stream back to caller
Log everything (transcript, sentiment, decision, integrations called)

Latency optimizations:

Stream responses (don't wait for full LLM output)
Parallelize TTS while LLM is thinking
Pre-cache common responses (greeting, hold message, etc.)
Use smaller models for simple tasks (classification, extraction)

Phase 4: Integrations

Connect your agent to business systems:

CRM (HubSpot, Salesforce): Create contact, log call, attach notes. Use webhooks to push data in real-time.

Calendar (Calendly, Cal.com, Google Calendar): Query availability, create booking, send confirmation. Use APIs to check real-time slots.

Billing (Stripe, Square): Generate payment link, tokenize card (never store raw card data), log transaction. Use webhooks for payment confirmation.

FSM (Jobber, ServiceTitan): Create job/lead, attach call transcript, set priority. Use API docs to understand required fields.

Retry logic: If integration fails, queue the action and retry exponentially. Don't let a failed Stripe call break the whole agent.

Phase 5: Testing and Iteration

Test scenarios: Happy path (customer books appointment), edge cases (customer angry, AI can't understand, integration fails), escalation (customer asks for human).

Metrics to track: Call duration, customer satisfaction (post-call survey), booking conversion rate, integration success rate, LLM token cost per call.

Iteration feedback loop: Review recorded calls weekly. Listen for: unclear agent responses, missed qualification opportunities, integration failures, false escalations. Refine the system prompt based on patterns.

A/B testing: Try different prompts, question orders, or integration flows on 10% of calls. Measure impact on conversion or satisfaction.

Common Pitfalls

Mistake 1: No fallback for integration failures. Your CRM API goes down, and the agent silently fails. Build queues and retries.

Mistake 2: Latency kills the call. 2+ second pauses sound like the agent froze. Optimize and measure every millisecond.

Mistake 3: No call logging. You can't improve what you don't measure. Log transcripts, decisions, integrations, outcomes.

Mistake 4: Treating all customers the same. A repeat customer should have access to their history; a first-time caller shouldn't be asked the same questions twice.

Mistake 5: No escalation strategy. Some calls are better handled by humans. Define clear escalation triggers.

Example: Plumbing Service

Business flow: Customers call → AI answers → qualifies emergency vs. routine → if emergency, transfers to on-call tech → if routine, books appointment → sends confirmation and payment link.

Tech stack: Twilio for voice, Claude for LLM, Node.js backend, Postgres for logs, Stripe for payments, Jobber API for job creation.

Core logic: System prompt tells Claude to ask: "Is this an emergency?" → If yes, "Are you currently at home?" + transfer. If no, ask service type, location, preferred time → Create booking in Jobber → Send SMS with confirmation and deposit link.

Fallback: If Jobber is down, queue the job in Redis and retry every 5 min. If Stripe link fails, send SMS without payment link and email it 30 min later.

The Bottom Line

Building a custom AI agent is 70% business logic (understanding your flow, integrations, edge cases) and 30% code. Start by mapping your business, choosing tools, building the core loop, wiring integrations, and iterating on real calls. The result: an agent that feels native to your business, not bolted on.

Build Custom AI Agents for Your Service Business