The speed is the headline. The lesson is everything underneath it. The difference between a script and an agent, and where AI actually pays for itself in the revenue cycle.
The headline number is real. A workflow that took our partner’s eligibility team a median of four minutes per case now takes eleven seconds. Same volume. Higher accuracy. If the only thing you take from this piece is the speed number, fine; it is a number worth taking. But if you stop there, you will misread what the project actually showed us, and you will deploy your next agent against the wrong problem.
Speed was not the lesson. The lesson is in the shape of the workflow we replaced, the shape of the workflow we built, and the five things we learned in the gap.
A note on terminology before going further. We did not build a script. We did not build an RPA bot. We built an agent: a system that decides at runtime which sources to query, reasons over the partial and sometimes contradictory data those sources return, recovers from the failures and timeouts that real-world payer portals produce, and writes a structured result with a confidence score. That distinction is not branding. It is the reason the project worked, and it is the reason previous attempts at this same workflow, the ones built as RPA, chronically broke.
The original 4-minute workflow
It is worth walking through what the human actually did, because the cost was not where most observers assumed it was. An eligibility specialist would receive a referral or appointment notification. They would log into the clearinghouse to pull a 270/271 transaction. They would log into the payer’s web portal to confirm benefit details the 271 did not include: copays, deductibles applied, prior auth requirements for the specific procedure. They would log into the practice’s internal eligibility cache to check whether the member had been seen before. For commercial plans with non-standard benefit structures they would frequently make a phone call. They would copy and paste the resulting fields into the EHR or scheduling system.
The median time was four minutes. The variance was enormous. A clean case (established member, common payer, standard procedure) could finish in well under a minute. A messy case (new member, secondary payer in play, nonstandard benefit) could run fifteen minutes or more. Across a high-volume practice or a payer’s eligibility operation, those variances aggregate into a workforce planning problem that no amount of headcount actually solves.
If you ask the specialists what was hard about the work, almost none of them will say the lookups were slow. They will point to the cognitive cost of toggling: the screen-switching, remembering which portal they were on, losing their place when an interruption came in. That answer is the entire thesis of this piece.
The 11-second agent flow
The agent we built does not look like a faster version of the human workflow. It looks like a workflow that no longer needs to be a workflow.
On referral creation or appointment scheduling, the system fires automatically. A single agent reasoning loop reaches into the clearinghouse, the payer portal, the practice’s internal eligibility cache, and the member’s prior visit history in parallel. It composes a structured eligibility packet: coverage status, plan, copay, deductible status, prior auth requirements for the scheduled procedure, secondary coverage if any, and member responsibility estimate. It attaches a confidence score. It writes the result back to the EHR field where the eligibility specialist would have pasted it. End to end: eleven seconds.
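The parallel fan-out described above can be sketched in a few lines. This is a minimal illustration, not our production code: the source names, the `EligibilityPacket` fields, and the `query_source` stand-in (which only sleeps to simulate latency) are all assumptions made for the example. It shows the fan-out and the timeout handling only, and leaves out the reasoning over contradictory data.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class EligibilityPacket:
    """Structured result. Field names are illustrative, not a real schema."""
    fields: dict = field(default_factory=dict)
    failed_sources: list = field(default_factory=list)


async def query_source(name: str, delay_s: float, payload: dict) -> dict:
    """Hypothetical stand-in for a real source call; sleeps to simulate latency."""
    await asyncio.sleep(delay_s)
    return payload


async def build_packet(sources: dict, timeout_s: float) -> EligibilityPacket:
    """Query every source concurrently and record timeouts instead of raising."""

    async def guarded(name, coro):
        try:
            return name, await asyncio.wait_for(coro, timeout=timeout_s)
        except asyncio.TimeoutError:
            return name, None  # a slow portal must not sink the whole packet

    results = await asyncio.gather(
        *(guarded(name, coro) for name, coro in sources.items())
    )
    packet = EligibilityPacket()
    for name, payload in results:
        if payload is None:
            packet.failed_sources.append(name)
        else:
            packet.fields.update(payload)
    return packet


def demo() -> EligibilityPacket:
    """Three simulated sources; the payer portal is slower than the timeout."""
    sources = {
        "clearinghouse": query_source("clearinghouse", 0.01, {"coverage_status": "active"}),
        "payer_portal": query_source("payer_portal", 0.5, {"copay": 30}),
        "eligibility_cache": query_source("eligibility_cache", 0.01, {"seen_before": True}),
    }
    return asyncio.run(build_packet(sources, timeout_s=0.1))
```

The design choice worth noticing is that a timed-out source is recorded in `failed_sources` rather than raised as an error. That record is what lets a later step lower the confidence score and route the case to a human with everything else already gathered.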
The vast majority of cases come back with high confidence and pass straight through. The remainder, typically because of a payer portal that timed out, an ambiguous secondary coverage situation, or a benefit detail the agent could not resolve, get routed to a human. When the human gets the case, they get it as a structured packet. They are not starting a fresh fire drill. They are reviewing what the agent already gathered, deciding the one or two ambiguous fields, and sending it back through.
Takeaway 1: The bottleneck wasn’t speed. It was state-switching.
If you put a stopwatch on each individual lookup in the human workflow, you would find that the lookups themselves were not slow. The clearinghouse responds in seconds. The payer portal responds in seconds. The phone call, when it happened, was the only step that consumed real time. What killed throughput was the cognitive cost of moving between systems. Every portal switch was a context switch. Every context switch had a tax: some seconds of re-orientation, some risk of losing place, some real fatigue accumulating over a shift.
This matters because most automation programs go after the wrong target. They try to speed up individual lookups by integrating directly with one or two sources, leaving the human to stitch together the rest. The state-switching cost is unchanged. The throughput gain is modest. The build is expensive. Going after the state-switching cost, by collapsing the workflow into a single reasoning loop that does not need to hand control back to a human between steps, is the move that actually compresses the cycle.
Takeaway 2: Agents are workflow collapse, not task acceleration.
Do not think of an agent as a faster human. Think of it as a workflow that no longer has to be a workflow. The reason the distinction matters: if you frame an agent as a faster human, you will design it to do the human steps in sequence. You will keep the same intermediate hand-offs, the same structured pause points, the same screens. You will get a modest speed gain and stop there.
When you frame an agent as workflow collapse, you ask different questions. Which of these intermediate steps existed only because the human needed to pause and think? Which existed only because the data had to be moved between two systems neither of which could see the other? Which existed only because the original designer assumed a person had to be in the loop? The answers shrink the workflow. In our case, what looked like an eight-step process became a one-step process with a confidence threshold.
This is also where the difference between an agent and an RPA bot becomes practically important. An RPA bot replicates the human steps and breaks when a portal updates its UI or returns an unexpected error. An agent reasons over what it has, decides what is missing, and chooses what to do next. RPA coverage scales only linearly with engineering investment, because every portal and every UI change needs its own scripted fix. An agent absorbs that variance at runtime. For a workflow like eligibility, where every payer has a slightly different portal and the portals change without notice, the difference is not academic.
Takeaway 3: Confidence scoring beats 100% automation.
We could push the auto-resolve rate higher. We have chosen not to. The reason: the marginal cost of an incorrect eligibility result downstream (a denied claim, a member billed unexpectedly, a scheduling conflict) is meaningfully higher than the marginal cost of routing an ambiguous case to a human.
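The trade-off in that paragraph can be written as simple arithmetic. The sketch below makes two assumptions that are ours for illustration: the confidence score is calibrated (a score of 0.96 means the result is right about 96% of the time), and a human review always produces the right answer. The dollar figures in the usage note are hypothetical, not numbers from the project.

```python
def breakeven_confidence(cost_of_error: float, cost_of_review: float) -> float:
    """Confidence above which auto-resolving is cheaper, in expectation, than review.

    Expected cost of auto-resolving at confidence p: (1 - p) * cost_of_error
    Cost of routing to a human:                      cost_of_review
    Auto-resolve wins when p > 1 - cost_of_review / cost_of_error.
    """
    if cost_of_error <= 0:
        raise ValueError("cost_of_error must be positive")
    if cost_of_review < 0:
        raise ValueError("cost_of_review must not be negative")
    # If review costs more than an error, any confidence level favors auto-resolve.
    return max(0.0, 1.0 - cost_of_review / cost_of_error)
```

With a hypothetical downstream error cost of 100 and a review cost of 4, `breakeven_confidence(100, 4)` comes out at roughly 0.96. The more expensive the downstream error is relative to a review, the higher the bar for letting a case pass straight through.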
The right design principle in agentic automation is rarely full automation. It is calibrated automation: a confidence score on every output, a written rationale showing the evidence the score is based on, and a routing rule that gets the gray-zone cases to a human with all the context the agent already gathered. Teams that try to push every case through the auto path will rediscover, painfully, why the gray zone exists. Teams that build a tight handoff between confident-auto and ambiguous-routed will compound the gains.
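A routing rule of this kind is small enough to show in full. This is a sketch under stated assumptions: the `ScoredResult` fields, the `Route` names, and the 0.95 default threshold are hypothetical choices for the example, not the values we run in production.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "auto"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class ScoredResult:
    case_id: str
    confidence: float             # assumed to lie in [0, 1]
    rationale: str                # the evidence the score is based on
    unresolved_fields: tuple = () # fields the agent could not settle


def route(result: ScoredResult, threshold: float = 0.95) -> Route:
    """Send gray-zone cases to a human; let confident, complete cases through."""
    if result.unresolved_fields or result.confidence < threshold:
        return Route.HUMAN_REVIEW
    return Route.AUTO


clean = ScoredResult("case-1", 0.99, "271 and portal agree on copay and deductible")
ambiguous = ScoredResult(
    "case-2", 0.97, "secondary payer found, order of benefits unclear",
    unresolved_fields=("secondary_coverage",),
)
```

Note that `ambiguous` is routed to a human despite a high score: an unresolved field overrides the threshold. The rationale travels with the case either way, so the reviewer starts from the agent's evidence rather than from scratch.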
Takeaway 4: The ROI is downstream, not upstream.
Here is the result we did not lead with, because it matters more than the speed number and we wanted you to see it once you were already paying attention. The agent’s downstream effect on first-pass payment rate, denial rate, and rework volume is where the project paid for itself.
CAQH’s annual Index has documented for years that eligibility and benefit verification is one of the highest-volume administrative transactions in healthcare, and that the cost gap between fully electronic and partially manual transactions is large. The savings show up not in the seconds you cut from a single lookup, but in the cleaner downstream claim. Eligibility errors cascade. A misread copay becomes a member billing dispute. A missed prior auth requirement becomes a denied claim, an appeal, and a rework cycle that takes weeks. A missed secondary coverage becomes a write-off. The 11-second cycle time is nice. The cleaner downstream claims are the line item that closed the business case.
If you are evaluating where to deploy agents, do not stop at the time-saved number on the upstream task. Trace the consequences of that task’s errors all the way through the revenue cycle. That is where the real ROI sits.
Takeaway 5: The human role evolves to exception handler and supervisor.
The eligibility specialists did not disappear. Their job changed. Their day is no longer a queue of repetitive lookups. It is a queue of the cases the agent flagged as ambiguous, plus a supervisory loop where they audit a sample of the agent’s confident outputs to catch drift, plus a feedback loop where their corrections become training signal for the next iteration of the agent.
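The supervisory loop can be sketched as a random audit sample plus a simple drift check. Everything here is an assumption for illustration: the function names, the idea of expressing the audit as a fixed sampling rate, and the error-rate tolerance are ours, not a description of the deployed system.

```python
import random


def sample_for_audit(case_ids, rate, seed=None):
    """Pick a random subset of auto-resolved cases for a specialist to re-check."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    ids = list(case_ids)
    if not ids or rate == 0.0:
        return []
    # Always audit at least one case when the rate is non-zero.
    k = min(len(ids), max(1, round(len(ids) * rate)))
    return random.Random(seed).sample(ids, k)


def observed_error_rate(audit_outcomes):
    """audit_outcomes: booleans, True when the specialist agreed with the agent."""
    outcomes = list(audit_outcomes)
    if not outcomes:
        raise ValueError("no audit outcomes to evaluate")
    return outcomes.count(False) / len(outcomes)


def drift_detected(audit_outcomes, max_error_rate):
    """Flag drift when audited disagreements exceed the tolerated rate."""
    return observed_error_rate(audit_outcomes) > max_error_rate
```

For example, `sample_for_audit(range(100), 0.05, seed=7)` returns five case IDs for review. If specialists disagree with the agent on two of twenty audited cases and the tolerance is 5%, `drift_detected` returns `True`, which is the signal to look at what changed before accuracy degrades further.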
This shift is not optional, and it is not free. It requires a real reskilling investment. Specialists who were measured on cases-per-hour are now measured on exception-resolution quality and audit accuracy. The compensation framework, the management structure, and the team’s daily ritual all have to change with the workflow. Teams that pretend an agent deployment is just a productivity tool, and leave the human roles unchanged, usually find that the agent’s accuracy degrades over time, because no one is closing the feedback loop.
How to find your next 4-minute-to-11-second candidate
This pattern is not eligibility-specific. It applies to any workflow with a particular shape: high frequency, multi-portal, deterministic-with-edge-cases, and downstream-consequential. Pre-authorization status checks. Coordination of benefits. Claims status follow-up. Demographic verification. Provider directory updates. All of them have the same structural pattern as the eligibility check we replaced.
A useful shortlist test: does the workflow involve more than three portal logins? Is it run many times per day? Do its errors show up downstream as rework or revenue leakage? If the answers are yes, yes, and yes, the workflow is a candidate. The order in which you tackle candidates should be a function of frequency, variance, and downstream cost, not just frequency. The highest-volume workflow is almost never the highest-leverage one.
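The shortlist test and the ordering rule can be sketched as follows. The text says only that priority should be a function of frequency, variance, and downstream cost; multiplying the three together is our simplifying assumption, as are the `min_runs_per_day` cutoff for "many times per day" and every number in the example data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    portal_logins: int
    runs_per_day: int
    time_variance: float              # e.g. slowest-to-median handling-time ratio
    downstream_cost_per_error: float  # rework or leakage per error, any currency


def is_candidate(w: Workflow, min_runs_per_day: int = 50) -> bool:
    """The three yes/no questions: portals, frequency, downstream consequence."""
    return (
        w.portal_logins > 3
        and w.runs_per_day >= min_runs_per_day
        and w.downstream_cost_per_error > 0
    )


def leverage(w: Workflow) -> float:
    """Assumed scoring: frequency x variance x downstream cost."""
    return w.runs_per_day * w.time_variance * w.downstream_cost_per_error


# Hypothetical workflows with made-up numbers.
workflows = [
    Workflow("claims_status_follow_up", 4, 900, 1.5, 10.0),
    Workflow("prior_auth_status", 5, 300, 4.0, 60.0),
    Workflow("demographic_verification", 2, 1200, 1.2, 5.0),
]
ranked = sorted((w for w in workflows if is_candidate(w)), key=leverage, reverse=True)
```

In this made-up data, `prior_auth_status` ranks first even though it runs a third as often as `claims_status_follow_up`, and the highest-volume workflow does not pass the shortlist test at all because it touches only two portals.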
What this project actually showed us
The 4-minute to 11-second number is the headline because it has to be. People remember numbers. The lesson is everything else: that bottlenecks in administrative healthcare are usually about state-switching, that agents are workflow collapse rather than task acceleration, that confidence scoring beats full automation, that the ROI is downstream, and that the human role has to be redesigned in lockstep with the agent.
Most teams will read this and remember the headline. The teams that win the next phase of healthcare automation will read this and remember the takeaways.