Solution playbook · Customer support

Tier-1 support that escalates like a senior.

A complete implementation guide for shipping an AI agent that resolves the bulk of your support tickets, detects when to bring in a human, and keeps your audit trail intact. Built from how teams actually take agents from PoC to production.

12 min read
Industry fit: ecommerce · SaaS · marketplace
Stack: Any CRM with API access

The challenge

Why traditional chatbots stall in tier-1.

Most chatbot deployments solve the easy 20% of tickets and dump the rest on a frustrated human. The problem isn't the model — it's the wiring around it.

Flow-based, not context-aware

Decision trees can't read sentiment, can't reorder steps, can't recover from off-script questions. The moment a customer breaks the path, the bot loops or bails.

No real escalation, just a handoff

Most bots pass the ticket without context. The human starts from zero, asks the same five questions, and the customer escalates again. The transfer is the failure point.

Tools, not policies

Calling Zendesk's create_ticket API is easy. Knowing when not to is hard. Without human-in-the-loop (HITL) guardrails on refunds, cancellations, or PII, you ship a liability.

No audit trail anyone trusts

Logs in five places, no replay, no per-turn cost, no way to prove what the bot did or said. Compliance asks, and the team stalls.

The approach

Four moves that change tier-1 economics.

Thaliq wires four primitives behind one API: model routing, an MCP (Model Context Protocol) gateway, HITL, and observability. Customer support uses all four. Here's how they map to your day-to-day.

Detect intent and sentiment

Every incoming message gets typed (intent classification) and scored (sentiment). The agent decides whether to resolve, ask, or escalate before it picks a tool.
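As a rough illustration of that decision, here is a minimal sketch. The `Triage` fields, the score ranges, the cutoffs, and the list of resolvable intents are all assumptions made up for this example; they are not Thaliq's actual schema or defaults.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RESOLVE = "resolve"
    ASK = "ask"
    ESCALATE = "escalate"


@dataclass
class Triage:
    intent: str               # e.g. "refund_status"
    intent_confidence: float  # 0.0 to 1.0, from a hypothetical classifier
    sentiment: float          # -1.0 (angry) to 1.0 (happy), hypothetical scale


# Hypothetical policy values; you would tune these per tenant.
MIN_INTENT_CONFIDENCE = 0.7
FRUSTRATION_CUTOFF = -0.6
RESOLVABLE_INTENTS = {"refund_status", "order_status", "password_reset"}


def decide(t: Triage) -> Action:
    """Pick resolve / ask / escalate before any tool is selected."""
    # A frustrated customer goes to a human regardless of intent.
    if t.sentiment <= FRUSTRATION_CUTOFF:
        return Action.ESCALATE
    # If the intent is unclear, ask a clarifying question first.
    if t.intent_confidence < MIN_INTENT_CONFIDENCE:
        return Action.ASK
    # Known, in-policy intents are resolved; everything else escalates.
    if t.intent in RESOLVABLE_INTENTS:
        return Action.RESOLVE
    return Action.ESCALATE
```

For example, `decide(Triage("refund_status", 0.93, 0.0))` returns `Action.RESOLVE`, while the same intent with a sentiment of -0.8 returns `Action.ESCALATE`.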

Resolve within policy

Tools call your CRM, your refunds API, your knowledge base. Each call respects the user's permissions via token passthrough. The model never sees credentials.
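One way to picture token passthrough: the model only chooses the tool arguments, and the executor attaches the customer's session token when it builds the request. The sketch below assumes a hypothetical CRM endpoint and function name; it builds the request without sending it.

```python
import json
import urllib.request

# Hypothetical CRM endpoint, used only for illustration.
CRM_BASE_URL = "https://crm.example.com/api"


def build_tool_request(tool_args: dict, session_token: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for an order lookup tool call.

    tool_args comes from the model. session_token comes from the customer's
    session and is never placed in the prompt or in the tool arguments.
    """
    body = json.dumps(tool_args).encode("utf-8")
    return urllib.request.Request(
        url=f"{CRM_BASE_URL}/orders/lookup",
        data=body,
        method="POST",
        headers={
            # The executor injects the credential here, outside the model's view.
            "Authorization": f"Bearer {session_token}",
            "Content-Type": "application/json",
        },
    )


req = build_tool_request({"order_id": "A-48291"}, "customer-session-token")
```

Because the token lives only in the header built by the executor, the CRM applies the customer's own permissions and the model's context contains nothing worth leaking.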

Escalate with the full thread

When the agent hits a threshold (frustration, refund > X, missing data), it suspends the turn with a HITL card. The human sees what the agent tried, why it stopped, and the proposed action.
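A minimal sketch of that check, assuming hypothetical field names, a hypothetical sentiment scale, and placeholder limits (the refund limit stands in for the "X" above). The card carries the three things the human needs: why the agent stopped, what it tried, and what it proposes.

```python
from dataclasses import dataclass, field
from typing import Optional

# Placeholder thresholds; real values come from your own escalation policy.
REFUND_LIMIT = 200.00
FRUSTRATION_CUTOFF = -0.6


@dataclass
class TurnState:
    sentiment: float                      # -1.0 (angry) to 1.0 (happy)
    refund_amount: Optional[float] = None
    missing_fields: list = field(default_factory=list)
    tool_history: list = field(default_factory=list)


def escalation_reason(state: TurnState) -> Optional[str]:
    """Return the first threshold the turn crosses, or None if it can proceed."""
    if state.sentiment <= FRUSTRATION_CUTOFF:
        return "frustration"
    if state.refund_amount is not None and state.refund_amount > REFUND_LIMIT:
        return "refund_over_limit"
    if state.missing_fields:
        return "missing_data"
    return None


def build_hitl_card(state: TurnState, proposed_action: str) -> Optional[dict]:
    """Package what the human needs to decide; None means no escalation."""
    reason = escalation_reason(state)
    if reason is None:
        return None
    return {
        "reason": reason,                       # why the agent stopped
        "attempted": list(state.tool_history),  # what the agent tried
        "proposed_action": proposed_action,     # what it suggests doing next
    }
```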

Audit every step

Each turn writes to Langfuse (debug), DynamoDB (tenant metrics) and Postgres (billing). Replay any conversation. Filter by tool failure, escalation cause, or model used.
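To make the filtering concrete, here is a sketch over an in-memory list of turn records. The `TurnRecord` shape and the `filter_turns` helper are assumptions for illustration only; the real records live in the stores named above and their schema is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TurnRecord:
    conversation_id: str
    turn: int
    model: str
    tool: Optional[str] = None              # tool called on this turn, if any
    tool_ok: Optional[bool] = None          # None when no tool was called
    escalation_cause: Optional[str] = None  # None when the turn did not escalate


def filter_turns(
    records: Iterable[TurnRecord],
    *,
    model: Optional[str] = None,
    tool_failed: bool = False,
    escalation_cause: Optional[str] = None,
) -> List[TurnRecord]:
    """Keep only the turns that match every filter that was supplied."""
    kept = []
    for r in records:
        if model is not None and r.model != model:
            continue
        if tool_failed and r.tool_ok is not False:
            continue
        if escalation_cause is not None and r.escalation_cause != escalation_cause:
            continue
        kept.append(r)
    return kept


def replay(records: Iterable[TurnRecord], conversation_id: str) -> List[TurnRecord]:
    """Return one conversation's turns in order."""
    return sorted(
        (r for r in records if r.conversation_id == conversation_id),
        key=lambda r: r.turn,
    )
```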

Division of labor

What Thaliq handles. What you control.

Thaliq is infrastructure. The opinionated parts of your support — policies, voice, escalation rules — stay yours. Here's the line.

Thaliq handles

✓ Model routing across Claude / GPT / Llama / custom
✓ Streaming SSE with resumable HITL flows
✓ MCP gateway with typed tool access
✓ Per-tenant isolation and plan limits
✓ Audit logs, traces and per-turn metrics
✓ Multi-channel delivery (web, WhatsApp, Telegram, Slack)
✓ Versioning and rollback of agent specs

You control

◆ Tool inventory and their API contracts
◆ Refund thresholds, escalation triggers, brand voice
◆ Knowledge base content and retention
◆ Identity provider and RBAC
◆ Which channels you go live on and when
◆ Compliance scope (GDPR, HIPAA, SOC 2) and DPA terms

Implementation

From discovery to GA in four weeks.

A typical deployment for a team with one engineer on point. Faster if you already have a CRM with API access; slower if you need to spec a custom MCP server.

Week 1

Discovery and mapping

Map your ticket distribution. Identify the top 5 ticket types by volume. Decide which ones the agent should fully resolve and which ones must escalate.
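If your CRM can export one ticket type per ticket, the top-5 step is a simple count. This sketch assumes the export is already a list of type labels; the labels themselves are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_ticket_types(ticket_types: Iterable[str], n: int = 5) -> List[Tuple[str, int, float]]:
    """Return (type, count, share of all tickets) for the n most common types."""
    counts = Counter(ticket_types)
    total = sum(counts.values())
    return [(name, count, count / total) for name, count in counts.most_common(n)]


# Hypothetical export: one label per ticket.
sample = (
    ["refund_status"] * 4
    + ["order_status"] * 3
    + ["password_reset"] * 2
    + ["other"]
)
top_two = top_ticket_types(sample, n=2)
```

The share column matters as much as the rank: it tells you how much of your volume the agent could cover if it fully resolves those types.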

Ticket taxonomy · Tool inventory · Escalation policy v1

Week 2

Pilot configuration

Build the agent in Studio. Wire the first 2-3 tools (CRM read, knowledge base, refund check). Run a dry-run against last week's tickets to compare proposed vs actual resolutions.
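A dry-run report can start as an agreement rate per ticket type: how often the agent's proposed resolution matches what your team actually did. The row format and resolution labels below are assumptions for the sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def dry_run_report(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """rows are (ticket_type, proposed_resolution, actual_resolution).

    Returns the share of tickets per type where proposed == actual.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for ticket_type, proposed, actual in rows:
        totals[ticket_type] += 1
        if proposed == actual:
            matches[ticket_type] += 1
    return {t: matches[t] / totals[t] for t in totals}


# Hypothetical rows built from last week's tickets.
report = dry_run_report([
    ("refund_status", "refund", "refund"),
    ("refund_status", "refund", "escalate"),
    ("order_status", "send_tracking", "send_tracking"),
])
```

Types with low agreement are the ones to keep behind escalation until the policy or the tools are fixed.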

Agent v1 in staging · Tool integrations · Dry-run report

Week 3

HITL tuning

Tune escalation thresholds with real conversations. Add refund guards, sentiment cutoffs, and form-based confirmations for risky actions. Catch the edge cases before they hit production.

HITL policies live · Refund threshold set · Edge case playbook

Week 4

Channel-by-channel rollout

Go live on web widget first. Monitor 48 hours. Add WhatsApp. Monitor 48 hours. Add the rest. Each channel inherits the same agent, same policies, same audit log.

GA on web widget · WhatsApp + Telegram live · Slack handoff to your CX team

Scenario walkthrough

A real ticket, end to end.

What a conversation looks like when the agent does its job — including the moment it correctly bails to a human.

conversation · agent-7711

channel: web

Customer

Hi, I returned the headphones two weeks ago and still haven't seen the refund on my card. Order #A-48291.

intent: refund_status · sentiment: neutral · channel: web

Inbound message is typed and scored before the agent picks a tool.

Agent

Let me check the status of order A-48291. Looking it up now.

tool: get_order(A-48291) → ok

The agent calls the CRM tool. Token passthrough means it queries with the customer's session, not a service account.

Agent

I can see the return was received on May 14, but the refund hasn't been issued yet. The amount is $189.40 — under our policy I can issue this now. Want me to proceed?

tool: get_refund_eligibility → ok · amount: $189.40

The amount is within the auto-refund threshold ($200), but the agent still asks because policy says "confirm before any refund > $100".

Customer

Yes please, and also can you check on the second pair I ordered last Friday? It hasn't shipped.

Customer accepts and tacks on a second question — typical real-world ticket shape.

Agent

Refund of $189.40 issued — confirmation #R-7711. Now checking the second order.

tool: create_refund → ok · ref: R-7711

Agent

Order #A-48519 is showing as paid but our warehouse hasn't picked it. There's a flag on it I can't resolve from here — I'm going to bring in someone from the team.

escalation_trigger: warehouse_flag · context_packaged: true

Agent hits an escalation trigger (warehouse flag on a paid order). It packages the full thread + tool history and hands off.

Human · CX team

Hi, this is Sara from the team. I see the refund went through and order A-48519 has a hold from the warehouse — I'll get this cleared today and confirm by email.

The human gets the full audit trail in their inbox, not a blank ticket.
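The refund step in this walkthrough combines two rules: an auto-refund threshold of $200 and "confirm before any refund > $100". Here is a sketch of that policy as code. The function and return labels are hypothetical, and treating an amount of exactly $200 as within the threshold is an assumption.

```python
from decimal import Decimal

# Values taken from the walkthrough; yours will differ.
AUTO_REFUND_LIMIT = Decimal("200.00")
CONFIRM_ABOVE = Decimal("100.00")


def refund_decision(amount: Decimal) -> str:
    """Map a refund amount to what the agent is allowed to do."""
    # Above the auto-refund threshold: a human has to approve.
    if amount > AUTO_REFUND_LIMIT:
        return "escalate"
    # Within the threshold but above the confirmation line: ask the customer first.
    if amount > CONFIRM_ABOVE:
        return "confirm_then_refund"
    return "auto_refund"


decision = refund_decision(Decimal("189.40"))  # the amount on order A-48291
```

For $189.40 the result is "confirm_then_refund", which matches the agent asking "Want me to proceed?" before it calls create_refund.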

Outcomes

What teams typically see after 90 days.

Ranges, not promises. The variability comes from your ticket distribution, your CRM quality, and how aggressively you tune escalation thresholds.

60–75%

Auto-resolved without a human

Across the top 5 ticket types

<30s

Median first response

From ticket receipt to first agent message

>90%

Escalation accuracy

Share of tickets routed to humans that were correctly routed

100%

Audit coverage

Every turn, tool call and HITL decision logged

How we measure this is documented per metric in your tenant dashboard. We don't quote numbers we can't replay.
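As a rough guide to what those definitions could look like, here is a sketch that computes three of the metrics from ticket records. The field names and the idea of a human reviewer marking each escalation as correct are assumptions; the definitions in your tenant dashboard are the ones that count.

```python
from statistics import median
from typing import Dict, List, Optional


def outcome_metrics(tickets: List[dict]) -> Dict[str, Optional[float]]:
    """Compute auto-resolution rate, median first response, escalation accuracy.

    Each ticket is a dict with hypothetical keys: "resolved" (bool),
    "escalated" (bool), "first_response_s" (seconds), and, for escalated
    tickets only, "escalation_correct" (bool, set by a human reviewer).
    """
    if not tickets:
        raise ValueError("need at least one ticket")
    escalated = [t for t in tickets if t["escalated"]]
    auto_resolved = [t for t in tickets if t["resolved"] and not t["escalated"]]
    correct = sum(1 for t in escalated if t["escalation_correct"])
    return {
        "auto_resolved_rate": len(auto_resolved) / len(tickets),
        "median_first_response_s": median(t["first_response_s"] for t in tickets),
        # None when nothing was escalated, to avoid dividing by zero.
        "escalation_accuracy": correct / len(escalated) if escalated else None,
    }
```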

Side by side

What changes vs your current setup.

Comparing a typical flow-based chatbot deployment to a Thaliq agent in the same role. Numbers are illustrative — yours will vary.

Aspect | Legacy chatbot | Thaliq agent
Time to first production | 8–12 weeks | 2–4 weeks
Personalization | User-segment branches | Per-conversation context + RAG
Tool integration | Custom code per tool | HTTP tools + MCP gateway, declarative
Escalation handling | Blind handoff | Full thread + tool history in human's inbox
Audit trail | Logs spread across systems | Replay any turn in one place
Multi-language | Separate flows per language | Same agent, language detected per turn
Multi-channel | One bot per channel | One agent across all channels
Compliance posture | Manually documented | SOC2 / GDPR / DPA out of the box

Readiness check

Is your team ready for this?

Be honest with yourself. The teams that succeed with this playbook answer "fit" to most of the questions below. If you have more "consider" answers than "fit" answers, talk to us first: we'll either fix the gaps with you or recommend you wait a quarter.

Do you handle more than 100 tickets a day with a recognizable top-5 by volume?

fit

Strong fit — that's enough volume to justify the work and enough repetition for the agent to learn the shape of your tickets.

Does your CRM expose a documented HTTP API or have an MCP server?

fit

Strong fit — Thaliq plugs in directly. If your CRM is API-less, plan a Week 0 to spec a wrapper service.

Do you have a single owner for ticket policies (refunds, escalation, language)?

fit

Strong fit — agents amplify whatever policy you encode. A single owner means cleaner iteration.

Are you handling regulated data (health, financial, PII at scale)?

consider

Consider — fit, but plan the BAA / DPA conversation early. HIPAA is on request and requires an Enterprise plan.

Is your ticket volume below 50 a day?

consider

Consider — the ROI math gets thin. Start with our free tier (500 requests/mo) and measure before committing to Builder.

Are your current ticket categories undefined or constantly changing?

caution

Caution: agents need stable categories to learn. If you're mid-reorg on your CRM taxonomy, fix that first.

Ready to wire your tier-1?

We'll scope a pilot in one call. No deck, no boilerplate — just the shape of your tickets and what fits.
