April 2026

An SLA framework for AI-assisted support triage

When we shipped the AI triage system into our IT support organization in 2025, the first surprise was how quickly the response SLA stopped meaning anything useful.

Before triage, our first-response SLA was a real measurement of human attention. A ticket sat in a queue, and the time until someone acknowledged it told you something about the team's load and about the customer's experience. After triage, the bot acknowledges every ticket within seconds. The SLA is now trivially met on every ticket regardless of whether the work behind it ever moves. The number went green. The work did not get faster.

This is the first lesson of AI-assisted support. When the responder is automated, response time as a contract becomes ceremonial. The contract has to move to the unit of work that still requires human judgment, which is usually triage accuracy and the time until a human actually starts working the ticket.
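To make the distinction concrete, here is a minimal sketch of the two measurements side by side. The event model (`TicketEvent`, the `"bot"` and `"human"` actor labels, the event kinds) is hypothetical and not taken from our ticketing system. It only illustrates why first response goes green while time-to-engage does not.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TicketEvent:
    at: datetime
    actor: str  # "bot" or "human" (hypothetical labels)
    kind: str   # e.g. "ack", "assign", "comment" (hypothetical)


def first_response(opened: datetime, events: list[TicketEvent]) -> Optional[timedelta]:
    """Time until anyone, bot included, touches the ticket (the old SLA)."""
    times = [e.at for e in events if e.at >= opened]
    return min(times) - opened if times else None


def time_to_engage(opened: datetime, events: list[TicketEvent]) -> Optional[timedelta]:
    """Time until a human first acts on the ticket; None if nobody has yet."""
    times = [e.at for e in events if e.actor == "human" and e.at >= opened]
    return min(times) - opened if times else None


opened = datetime(2026, 4, 1, 9, 0, 0)
events = [
    TicketEvent(opened + timedelta(seconds=4), "bot", "ack"),
    TicketEvent(opened + timedelta(minutes=95), "human", "comment"),
]
print(first_response(opened, events))  # bot acknowledgement, seconds after open
print(time_to_engage(opened, events))  # first human action, much later
```

On this made-up ticket the first-response figure is four seconds and the time-to-engage figure is ninety-five minutes. Only the second one says anything about the work.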

Our framework rebuilds the SLA contract around six priority tiers rather than the typical four. The expansion is deliberate. A four-tier system collapses too much variance into the middle bands, and the pressure on triage staff to call something P2 or P3 produces a chronic inflation problem. A six-tier system gives the bot enough resolution to handle the full range from routine requests to systemic incidents without compressing the middle, and it lets the analyst hold a ticket at a lower tier when the impact does not justify a higher one.

P1 is reserved for a system-wide outage or a revenue-impacting incident, with a thirty-minute time-to-engage target for a human responder. P2 covers multiple-user impact or a blocked critical workflow. P3 is a single-user blocker. P4 is a single user with a workaround. P5 is a request or enhancement. P6 is information-only or a duplicate. The tier definitions are written so the bot can apply them deterministically, which means each tier has a list of qualifying conditions rather than a vague impact statement.
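As a rough illustration, the tiers can be held as data rather than prose. The sketch below is an assumption about shape, not our production schema. The names `Tier`, `DESCRIPTIONS`, `ENGAGE_TARGET`, and `engage_breached` are hypothetical. Only the P1 target appears above, so the other tiers are deliberately left without one, and treating exactly thirty minutes as still within target is also an assumption.

```python
from datetime import timedelta
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    P1 = 1
    P2 = 2
    P3 = 3
    P4 = 4
    P5 = 5
    P6 = 6


# One-line summaries taken from the tier definitions in the text.
DESCRIPTIONS = {
    Tier.P1: "system-wide outage or revenue-impacting incident",
    Tier.P2: "multiple-user impact or critical workflow blocker",
    Tier.P3: "single-user blocker",
    Tier.P4: "single user with a workaround",
    Tier.P5: "request or enhancement",
    Tier.P6: "information-only or duplicate",
}

# Only the P1 time-to-engage target is stated in the text; the rest are unset.
ENGAGE_TARGET = {Tier.P1: timedelta(minutes=30)}


def engage_breached(tier: Tier, waited: timedelta) -> Optional[bool]:
    """True if a human took longer than the tier's target; None if no target is set."""
    target = ENGAGE_TARGET.get(tier)
    if target is None:
        return None
    return waited > target


print(engage_breached(Tier.P1, timedelta(minutes=42)))
print(engage_breached(Tier.P4, timedelta(hours=6)))
```

Using an ordered enum keeps comparisons such as `Tier.P1 < Tier.P3` meaningful, which helps when a later rule needs to ask whether one tier is more severe than another.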

The framework only works if the intake form is redesigned to feed it. We replaced the priority dropdown with a set of structured impact questions covering who is affected, whether a workaround exists, whether revenue or compliance is at stake, and whether the system is fully down. The bot derives the priority from the answers rather than from a user selection. This shift mattered more than the model choice. Users systematically over-called priority when given a dropdown, and the dropdown was the largest single source of triage error in our pre-AI baseline.
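A minimal sketch of that derivation follows, assuming the intake answers arrive as a simple record. The field names, the `kind` field used to separate incidents from requests and informational tickets, and the rule order are all hypothetical. Mapping a compliance flag to P1 is also an assumption, since the tier definitions name only outages and revenue impact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntakeAnswers:
    kind: str                    # "incident", "request", or "info" (hypothetical)
    system_down: bool = False    # is the system fully down?
    revenue_or_compliance: bool = False
    users_affected: int = 1
    blocks_critical_workflow: bool = False
    workaround_exists: bool = False


def derive_priority(a: IntakeAnswers) -> str:
    """Apply the tier rules in a fixed order; the first match wins."""
    if a.kind == "info":
        return "P6"  # information-only or duplicate
    if a.kind == "request":
        return "P5"  # request or enhancement
    if a.system_down or a.revenue_or_compliance:
        return "P1"  # compliance -> P1 is an assumption, see lead-in
    if a.users_affected > 1 or a.blocks_critical_workflow:
        return "P2"
    if a.workaround_exists:
        return "P4"  # single user with a workaround
    return "P3"      # single user, blocked, no workaround


print(derive_priority(IntakeAnswers(kind="incident", users_affected=14)))
print(derive_priority(IntakeAnswers(kind="incident", workaround_exists=True)))
```

Because the function is a pure mapping from answers to a tier, the same answers always yield the same priority, and a disputed call can be traced to a specific answer rather than to a user's choice in a dropdown.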

The framework matters beyond support because the same operating logic now applies to QA. The internal AI tooling that supports my analysts during ticket review uses the same principle. The workflow is designed before the model is selected, and human judgment is reserved for what actually needs it. The QA review cycle compressed from seven days to two on the back of that discipline. The model is not what made the difference. The redesign of the workflow around the model is what made the difference.

What we still get wrong on the support side: the framework treats every ticket as a singleton. It does not yet understand that twelve P3 tickets opened against the same module within an hour are a P1 in disguise. The current iteration handles this through a human escalation path, which works but should not stay manual. The next iteration needs a clustering layer that promotes ticket priority based on volume and pattern rather than on the individual ticket's content. The model exists. The harder work is defining the tier transition rules in a way the operations team trusts.
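One possible shape for the simplest version of that layer is a sliding-window count per module. The sketch below is not the clustering model itself. The `ClusterPromoter` class is hypothetical, the window and threshold simply reuse the twelve-in-an-hour example above, and it assumes tickets are observed in order of open time.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class ClusterPromoter:
    """Promote P3 tickets to P1 when enough of them hit one module in a window."""

    def __init__(self, window: timedelta = timedelta(hours=1), threshold: int = 12):
        self.window = window        # taken from the example in the text
        self.threshold = threshold  # taken from the example in the text
        self._recent = defaultdict(deque)  # module -> open times of recent P3s

    def observe(self, module: str, opened_at: datetime, tier: str) -> str:
        """Return the tier this ticket should carry given recent volume."""
        if tier != "P3":
            return tier
        recent = self._recent[module]
        recent.append(opened_at)
        # Drop tickets that have aged out of the window.
        while recent and opened_at - recent[0] > self.window:
            recent.popleft()
        return "P1" if len(recent) >= self.threshold else tier


promoter = ClusterPromoter()
start = datetime(2026, 4, 1, 9, 0, 0)
tiers = [
    promoter.observe("billing", start + timedelta(minutes=5 * i), "P3")
    for i in range(12)
]
print(tiers[-1])  # the twelfth ticket inside the hour is promoted
```

This version promotes only the ticket that crosses the threshold and any that follow it. Going back to re-promote the earlier eleven, and deciding when the cluster is demoted again, is exactly the kind of transition rule the operations team would need to agree on.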

The other gap: the framework was designed before agentic resolution was viable in our environment. The current system triages and routes. It does not resolve. The next version of the contract has to account for tickets the bot can close without a human. That means the SLA needs new measurements that distinguish triage time from resolution time, with a separate window for tickets the bot resolves with no human involvement.
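A sketch of what those measurements might look like, assuming each ticket carries a handful of timestamps. The `TicketTimeline` record and its field names are hypothetical, and no window values are given here because the next version of the contract has not set them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TicketTimeline:
    opened: datetime
    triaged: datetime                          # bot assigned a tier and route
    human_engaged: Optional[datetime] = None   # None if no human touched it
    resolved: Optional[datetime] = None        # None while still open


def sla_measurements(t: TicketTimeline) -> dict:
    """Split one ticket into triage time, resolution time, and who resolved it."""
    if t.resolved is None:
        resolved_by = None
    elif t.human_engaged is None:
        resolved_by = "bot"    # closed with no human involvement
    else:
        resolved_by = "human"
    return {
        "triage_time": t.triaged - t.opened,
        "resolution_time": (t.resolved - t.opened) if t.resolved else None,
        "resolved_by": resolved_by,
    }


opened = datetime(2026, 4, 1, 9, 0, 0)
ticket = TicketTimeline(
    opened=opened,
    triaged=opened + timedelta(seconds=6),
    resolved=opened + timedelta(minutes=3),
)
print(sla_measurements(ticket))
```

Keeping `resolved_by` as its own field lets bot-resolved tickets be checked against their own window instead of flattering the averages for tickets that needed a person.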

The framework is a working document. It is published and reviewed quarterly against operational data. As the boundary between bot work and human work moves, the SLA contract has to move with it.