Case Study · 7 min read

The AI Customer Support Agent Playbook (What Sierra, Decagon, and Klarna Got Right)

Klarna replaced 700 reps with one AI agent. Sierra and Decagon are powering some of the largest support orgs in tech. Here is the pattern they share — and how to apply it without the public stumbles others have had.

By James Perkins & Sean Boyce · Published May 7, 2026

Quick answer: The companies that ship customer support AI agents successfully share a four-part pattern: start in deflection mode (handle simple cases, escalate everything else), measure aggressively with real eval sets, integrate context deeply (CRM, order history, account state), and design escalation paths that preserve trust when the agent can't help. Klarna's $40M annual savings, Sierra's enterprise rollouts, and Decagon's customer base all demonstrate this works. The companies that botched their AI support deployments (Air Canada, DPD, etc.) skipped one or more of these steps.

Pattern #1: Start in deflection mode

The successful pattern: deploy the AI agent for simple, high-volume cases first (password resets, order status, return initiation, hours of operation). Escalate everything else to a human. Measure deflection rate (resolved without human) and customer satisfaction in parallel.
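The routing logic behind deflection mode is simple to sketch. Below is a minimal, illustrative version in Python; the intent labels, the `classify_intent` callable, and the return strings are assumptions for the example, not any vendor's actual API.

```python
# Deflection-mode router sketch: handle simple, high-volume intents,
# escalate everything else. All names here are illustrative.

SIMPLE_INTENTS = {
    "password_reset",
    "order_status",
    "return_initiation",
    "store_hours",
}

def route_ticket(ticket: dict, classify_intent) -> str:
    """Route a ticket: resolve simple intents, escalate the rest.

    classify_intent is any callable (an LLM call, a small classifier)
    that maps ticket text to an intent label.
    """
    intent = classify_intent(ticket["text"])
    if intent in SIMPLE_INTENTS:
        return f"handled_by_agent:{intent}"
    # Anything outside the whitelist goes to a human by default.
    return "escalated_to_human"
```

The important design choice is the default: unknown or ambiguous intents escalate, rather than the agent guessing.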

Klarna's first wave handled 2.3M conversations and resolved most of them. The cases that escalated were the same cases that would have escalated to senior support reps anyway. The AI didn't replace senior support — it freed them from tier-1 work.

Don't: deploy the agent for high-stakes scenarios first (refund disputes, fraud claims, account closures). The downside on a wrong answer is reputational and financial.

Pattern #2: Measure aggressively with real eval sets

The companies that ship build a 200-1,000 example eval set drawn from real production tickets. They score every prompt change against it. They regression-test every model upgrade. They surface failures visibly to the whole team so everyone can iterate.

The companies that fail rely on vibes ("the demo looked good") and customer complaints (a slow, lossy signal).

Don't: ship without an eval set. Even a 50-example set is better than none, but it has to come from real production tickets, not synthetic ones.
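An eval harness can start as a few dozen lines. Here is a minimal sketch: run the agent over labelled tickets, compare against expected resolutions with a crude string-match scorer, and gate the change on a pass-rate threshold. The scorer, field names, and threshold are assumptions for illustration; real scorers are usually richer (LLM-as-judge, semantic match).

```python
def run_eval(agent, eval_set, min_pass_rate=0.85):
    """Score an agent against labelled production tickets.

    agent: callable that maps ticket text to a response string.
    eval_set: list of {"id", "ticket", "expected_resolution"} dicts.
    Returns (passed_gate, pass_rate, failing_case_ids).
    """
    passed = 0
    failures = []
    for case in eval_set:
        answer = agent(case["ticket"])
        # Crude scorer: the expected resolution must appear in the answer.
        if case["expected_resolution"] in answer:
            passed += 1
        else:
            failures.append(case["id"])
    rate = passed / len(eval_set)
    return rate >= min_pass_rate, rate, failures
```

Run this on every prompt change and every model upgrade; a drop in pass rate blocks the deploy.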

Pattern #3: Integrate context deeply

The killer feature of an enterprise support AI is account-aware context. The agent knows the customer's order history, current subscription, support history, and product entitlements. It can take action: refund within policy, modify order, escalate to billing.

This requires real integration with your CRM, billing, fulfillment, and product systems. It's the bulk of the work in deploying a support AI. Vendors who present "drop-in" agents without integration time are selling chatbots, not agents.
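What "account-aware, can take action within policy" looks like in code is roughly this: the agent's tools receive the customer's context and enforce policy before acting. The data shapes, the $50 threshold, and the function names below are illustrative assumptions, not a real integration.

```python
from dataclasses import dataclass, field

@dataclass
class AccountContext:
    """Context the agent pulls from CRM/billing/fulfillment systems."""
    customer_id: str
    order_history: list = field(default_factory=list)
    subscription: str = "none"

REFUND_POLICY_LIMIT = 50.00  # illustrative per-refund policy threshold

def attempt_refund(ctx: AccountContext, order_id: str, amount: float) -> str:
    """Refund within policy; escalate anything outside it."""
    if order_id not in [order["id"] for order in ctx.order_history]:
        # The agent never acts on orders it can't verify in context.
        return "escalate:unknown_order"
    if amount <= REFUND_POLICY_LIMIT:
        return "refund_issued"
    return "escalate:billing"
```

The policy check lives in the tool, not in the prompt, so a hallucinated amount can't bypass it.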

Don't: deploy a context-less chatbot at the edge of your stack and call it an "AI agent." Customers will spot the difference within one conversation.

Pattern #4: Design escalation that preserves trust

When the agent can't help, the escalation has to feel seamless. The handoff to a human should:

  • Bring the full conversation context (no "please explain again to a person")
  • Tell the customer who they're being transferred to and approximate wait time
  • Hand off to a human qualified for the issue, not a generic queue
  • Mark the conversation in your CRM so trends in escalations are visible
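The four handoff requirements above amount to a payload contract between the agent and your ticketing system. A minimal sketch, with field names and the queue-routing callable as assumptions:

```python
def build_handoff(conversation: dict, customer: dict, queue_for, wait_minutes: int) -> dict:
    """Package an escalation so the customer never repeats themselves.

    queue_for maps an issue type to a skill-matched human queue.
    """
    return {
        # Full context travels with the ticket.
        "transcript": conversation["messages"],
        "agent_summary": conversation["agent_summary"],
        "customer_id": customer["id"],
        # Skill-matched queue, not a generic one.
        "target_queue": queue_for(conversation["issue_type"]),
        # Surfaced to the customer before the transfer.
        "estimated_wait_minutes": wait_minutes,
        # CRM tags make escalation trends queryable.
        "crm_tags": ["ai_escalation", conversation["issue_type"]],
    }
```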

The companies that botched their AI deployments (Air Canada's hallucinated bereavement policy, DPD's swearing chatbot) had brittle escalation. The AI was put in front of customers without a graceful "I cannot help, here's a human who can" path.

The math that actually works

Klarna's reported numbers: AI handles ~700 FTE of support load, equivalent in CSAT to humans, 25% reduction in repeat inquiries, $40M expected profit improvement in 2024. They published this because the math is real. Other companies are quietly seeing similar results without the press release.

For mid-market companies (1-10K support tickets/month), the math is similar in shape but smaller in scale. At the top of that range, a 70% deflection rate on 10,000 tickets with a $5/ticket human cost-to-serve and a $0.50/ticket agent-resolved cost means a $4.50 saving on each of 7,000 deflected tickets: about $31.5K/month, or ~$378K/year. Against a $200K-$300K year-one TCO for the agent build, payback comes inside year one.
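The back-of-envelope math is worth making explicit, since the inputs (volume, deflection rate, per-ticket costs) are the levers to sanity-check against your own numbers:

```python
# ROI sketch using the figures from the paragraph above.
tickets_per_month = 10_000
deflection_rate = 0.70
human_cost_per_ticket = 5.00   # cost-to-serve, human-handled
agent_cost_per_ticket = 0.50   # cost per agent-resolved ticket

deflected = tickets_per_month * deflection_rate                        # 7,000 tickets
monthly_savings = deflected * (human_cost_per_ticket - agent_cost_per_ticket)
annual_savings = monthly_savings * 12

year_one_tco = 250_000  # midpoint of the $200K-$300K build range
payback_months = year_one_tco / monthly_savings
```

With these inputs the model gives $31.5K/month, $378K/year, and payback in roughly eight months; halve the ticket volume and payback stretches past a year, which is why the 1K-ticket end of the range needs a cheaper build.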

If you're scoping a customer support AI agent, our Automation Build service ships these end-to-end with the patterns above baked in. Or talk through your specific scoping in an hourly call first.

Frequently asked questions

What did Klarna's AI customer support agent actually do?

Klarna deployed an OpenAI-powered agent in February 2024 that handled 2.3M conversations in its first month, equivalent to 700 full-time agents. They reported equal customer satisfaction to humans, 25% reduction in repeat inquiries, and projected $40M annual profit improvement. The agent runs in 23 markets and 35+ languages.

How do Sierra and Decagon differ as AI support vendors?

Sierra (founded by Bret Taylor) targets enterprise deployments with deep integration and an SLA-grounded outcomes model. Decagon (YC-backed) is broader-market, with a focus on rapid deployment and strong analytics. Both are credible; the right choice depends on your scale, integration needs, and how much customization you require.

What's the typical deflection rate for an AI support agent?

Well-deployed agents handle 50-80% of inbound tickets without human escalation. Klarna's published rate was around 70-80% on routine cases. Higher-stakes industries (financial services, healthcare) often see 30-50% as appropriate given regulatory friction. Lower than 30% suggests the agent is misconfigured.

Why did some AI support deployments backfire publicly?

The botched deployments (Air Canada's hallucinated bereavement policy, DPD's swearing chatbot, NYC MyCity's illegal advice) shared common failure modes: no eval set, no escalation path, deployed for high-stakes scenarios first, lacking grounded context from real policies and account data. The pattern is preventable with the four discipline points outlined above.

How long does it take to deploy a customer support AI agent?

Vendor solutions (Sierra, Decagon, Fin) can be in production in 4-8 weeks if your stack is integration-friendly. Custom builds typically run 12-24 weeks for enterprise deployments. The integration work — connecting the agent to CRM, order history, account state — is the bulk of the timeline, not the AI itself.

Ready to ship an AI agent that actually works?

We embed with your team, build the agent, and ship it to production. Founder-led, no slide decks.