Plans have automated the front door and the mailbox. The work in between, where cost and clinical risk actually live, is still mostly manual. Here is what a complete sequence looks like, why language models and agents are what finally make the middle tractable, and where most automation programs stop short.
Walk into any health plan operations review and you will hear a familiar line: we have automated prior authorization. Sometimes the speaker means a new electronic intake portal. Sometimes a clearinghouse integration. Sometimes a vendor pilot. Almost always, when you sit down and map the actual flow of a prior auth from order to determination, two things become obvious. The first step is automated. The last step is automated. The middle is a relay race of human reviewers, faxes, EHR screen-scrapes, policy PDFs, and email threads.
This is not a controversial observation in the rooms where the work actually happens. Utilization management leaders and provider revenue cycle counterparts both know it. The American Medical Association’s annual Prior Authorization Physician Survey has documented for years that physician practices spend a meaningful share of every week on PA work, and that a large fraction of those requests delay or alter patient care. What is missing from the industry conversation is not awareness of the burden but a complete map of the sequence: a shared language for which step is which, and where automation is genuinely doing work versus where it is just decorating the edges.
We built that map. It has ten steps. Most plans have automated two of them. The six in the middle are where the next phase of the work has to happen, and they are the steps most automation programs skip. They are also the steps that, for the first time, are tractable, because the technology mix has changed. Rules engines and EDI plumbing have carried the front and back of the sequence for two decades. The middle requires reading policy prose, reasoning over clinical narrative, and reconciling evidence across half a dozen source systems. That work is what large language models and agentic AI are now competent at. The rest of this piece walks through the full sequence, names the six skipped steps, and offers a 90-day diagnostic any plan can run against itself.
The full prior authorization sequence
A prior authorization, end to end, is a ten-step workflow. The steps are not exotic. What is uncommon is seeing them named together.
1. Intent capture. The provider triggers a prior auth at the point of order, ideally inside the EHR, at the moment the clinical decision is made.
2. Pre-submission readiness check. Before the request is sent, the system confirms an auth is required for this code, this plan, this member, and that the minimum required documentation is attached.
3. Document retrieval and synthesis. The clinical evidence (chart notes, labs, imaging reports, prior treatments, care plans) is pulled and structured into a packet.
4. Policy and medical necessity reasoning. The case is matched to the right plan-specific policy, and the criteria are applied with a documented rationale.
5. Decision routing and auto-adjudication. Cases that clearly meet criteria are auto-approved. Cases that clearly fall outside are flagged. Cases in the gray zone are routed to a clinician.
6. Clinician-in-the-loop review with decision support. Reviewing physicians or nurses see the case, the evidence, and the policy delta in a single decision pane.
7. Determination, communication, and appeals readiness. The determination is issued, the rationale is captured in language that holds up on appeal, and both provider and member are notified.
8. Closed-loop feedback and continuous learning. Every reversal, peer-to-peer override, denial outcome, and provider abrasion event becomes a training signal.
9. Status transparency. Providers and members can see where a request stands without picking up the phone.
10. Audit trail and regulatory reporting. Every decision is reconstructable, every metric reportable, and CMS Interoperability and Prior Authorization Final Rule (CMS-0057-F) obligations are met by default.
The CMS rule has accelerated investment across the industry. Most of that investment has flowed into steps one, nine, and ten, the parts of the sequence the rule explicitly touches. Step one because intake had to become electronic. Steps nine and ten because providers and CMS now have to be able to see status and audit decisions. The pipes are getting built. Pipes are not decisions.
The six steps most plans skip
Steps two through six are the middle of the sequence, and step eight is the feedback loop that keeps them honest. Together they are where almost every plan we have looked at has gaps. Six skipped steps, in order.
1. Pre-submission readiness check
What it is: a check that runs before the request leaves the provider’s system, confirming an auth is even required, that the right CPT and ICD codes are attached, and that the minimum documentation set is present.
Why plans skip it: because it implicitly asks the plan to do work the plan believes is the provider’s responsibility.
Why it now becomes tractable: a language model can read the clinical narrative the provider has already written and check it against the plan’s documentation requirements in real time, before the request is ever submitted. The longstanding complaint that a PA is incomplete is almost always solvable with information the EHR already has. The blocker was never data availability. It was the absence of a reader that could turn unstructured clinical prose into a structured completeness check.
What it costs to skip: incomplete-on-first-pass requests are the single biggest source of pended PAs reported by both payer UM teams and provider RCM teams. Each one becomes a phone call, a fax, and a delay in care. The provider blames the plan. The plan blames the provider. The patient waits.
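The structured half of a readiness check can be sketched in a few lines. This is a minimal illustration, not any plan's actual rule set: the `REQUIREMENTS` table, the document labels, and the `readiness_check` function are all hypothetical, and the sketch assumes a language model upstream has already read the clinical narrative and produced the `attached_docs` set.

```python
from dataclasses import dataclass, field

# Hypothetical plan rules: CPT code -> documents required before submission.
# The codes are examples only and the required-document sets are invented.
# Codes absent from the table are treated as not requiring an auth.
REQUIREMENTS = {
    "72148": {"clinical_note", "conservative_therapy_record"},
    "27447": {"clinical_note", "imaging_report", "bmi"},
}

@dataclass
class ReadinessResult:
    auth_required: bool
    missing: list = field(default_factory=list)

    @property
    def ready(self) -> bool:
        # Ready when nothing is missing (including when no auth is needed).
        return not self.missing

def readiness_check(cpt_code: str, attached_docs: set) -> ReadinessResult:
    """Report whether an auth is required and which documents are absent."""
    required = REQUIREMENTS.get(cpt_code)
    if required is None:
        return ReadinessResult(auth_required=False)
    missing = sorted(required - attached_docs)
    return ReadinessResult(auth_required=True, missing=missing)

result = readiness_check("27447", {"clinical_note", "bmi"})
print(result.ready, result.missing)
```

The point of returning the `missing` list rather than a bare pass/fail is that the provider sees exactly what to attach before the request ever leaves the EHR.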
2. Document retrieval and synthesis
What it is: pulling the actual clinical evidence (not just what the provider attached, but what is needed to make a decision) and structuring it into a reviewable packet.
Why plans skip it: because most plan IT lives outside the clinical record, and pulling notes, labs, and imaging reports across hundreds of EHR instances is genuinely hard.
Why it now becomes tractable: retrieval pipelines combined with a model that can read unstructured chart notes and extract the specific findings a policy requires, such as a documented BMI, a failed conservative therapy, or a specific imaging finding, finally turn the EHR from a wall into a queryable source. With the FHIR APIs mandated by CMS-0057-F coming online, the retrieval surface is expanding. The reasoning surface is what AI brings to the other side of it.
What it costs to skip: in the reviews we have walked through with UM teams, clinician reviewers spend more of their case time hunting for documents than evaluating them. That is the single largest unit-cost lever in the sequence, and most plans have left it on the floor.
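To make the shape of a packet entry concrete, here is a deliberately simple sketch. It swaps the language model for a regular expression, which is far weaker than what the step actually needs, but it shows the output that matters: a finding paired with the document it came from. The note text, document ids, and `extract_bmi` function are all invented for illustration.

```python
import re
from typing import Optional

# Stand-in for model-based extraction: "BMI", up to 15 non-digit characters,
# then a two-digit number with optional decimals.
BMI_PATTERN = re.compile(r"\bBMI\b\D{0,15}(\d{2}(?:\.\d+)?)", re.IGNORECASE)

def extract_bmi(documents: dict) -> Optional[dict]:
    """Return the first documented BMI and the id of the note it came from."""
    for doc_id, text in documents.items():
        match = BMI_PATTERN.search(text)
        if match:
            return {"finding": "bmi", "value": float(match.group(1)), "source": doc_id}
    return None

# Hypothetical chart documents keyed by document id.
notes = {
    "imaging-884": "MRI lumbar spine: disc protrusion noted.",
    "note-2024-03-02": "Patient seen for follow-up. BMI of 41.2 kg/m2 recorded today.",
}
packet_entry = extract_bmi(notes)
print(packet_entry)
```

Keeping the `source` field on every extracted finding is what lets a later step cite its evidence instead of merely asserting it.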
3. Policy and medical necessity reasoning
What it is: matching the case to the right plan-specific policy and applying the criteria with the same logic, every time, with a written rationale attached.
Why plans skip it: because policies live in PDFs, criteria are written in clinical prose, and the prevailing assumption has been that clinical judgment cannot be encoded. The first half of that assumption is true. The second half is no longer true.
Why it now becomes tractable: a model that can read a policy document, extract its decision criteria, and check those criteria against an evidence packet does the same comprehension task a reviewer does. The difference is that the model produces a written audit trail showing which evidence supported which criterion. Every determination ships with its own citations. That is what makes the output appealable, defensible, and improvable.
What it costs to skip: variability. The same case, sent to two different reviewers, lands on two different determinations more often than any plan would like to admit. Inter-rater agreement in utilization review is a quietly painful number. That variability is what fuels appeals and what fuels provider abrasion.
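One way to picture a determination that ships with its own citations is a criterion-by-criterion evaluation that records which evidence item satisfied each test. The sketch below is an assumption-laden illustration: the `Criterion` type, the two example criteria, and their thresholds are invented, and plain Python functions stand in for the criteria a model would extract from policy prose.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Criterion:
    name: str
    # Returns the id of the supporting evidence item, or None if unmet.
    check: Callable[[dict], Optional[str]]

def evaluate(criteria, evidence):
    """Apply every criterion and record the evidence that satisfied it."""
    rationale = []
    for criterion in criteria:
        source = criterion.check(evidence)
        rationale.append(
            {"criterion": criterion.name, "met": source is not None, "evidence": source}
        )
    return all(entry["met"] for entry in rationale), rationale

# Hypothetical criteria; the thresholds are illustrative, not real policy.
def bmi_at_least_40(ev):
    item = ev.get("bmi")
    return item["source"] if item and item["value"] >= 40 else None

def six_weeks_conservative_therapy(ev):
    item = ev.get("conservative_therapy_weeks")
    return item["source"] if item and item["value"] >= 6 else None

evidence = {
    "bmi": {"value": 41.2, "source": "note-2024-03-02"},
    "conservative_therapy_weeks": {"value": 4, "source": "pt-summary-17"},
}
criteria = [
    Criterion("BMI >= 40", bmi_at_least_40),
    Criterion(">= 6 weeks conservative therapy", six_weeks_conservative_therapy),
]
meets, rationale = evaluate(criteria, evidence)
for entry in rationale:
    print(entry)
```

Because the same function runs on every case, two identical packets cannot land on two different determinations, and the `rationale` list is the audit trail.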
4. Decision routing and auto-adjudication
What it is: auto-approving the cases that clearly meet criteria, flagging the cases that clearly do not, and routing only the gray zone to a clinician.
Why plans skip it: because they are afraid of false-positive approvals and lack a confidence-scoring framework that would let them set a defensible threshold.
Why it now becomes tractable: modern AI systems can produce a calibrated confidence score alongside their determination, with the rationale and the supporting evidence both visible. The plan picks the threshold. The cases above it auto-approve, with audit. The cases below it route to a reviewer with the work already done. The right design principle is calibrated automation, not full automation.
What it costs to skip: in plans that have not invested here, a human is in the loop on cases where the determination was never in question, and the genuinely complex cases get the same thin slice of clinician attention as the easy ones. Clinical judgment is the scarcest resource in utilization management. Skipping this step misallocates it.
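The routing logic itself is small once a confidence score exists. The following is a sketch under stated assumptions: the `Route` names, the `route_case` function, and the 0.95 default threshold are hypothetical, the score is assumed to be already calibrated, and a case that confidently fails criteria is flagged for a person rather than auto-denied.

```python
from enum import Enum

class Route(Enum):
    AUTO_APPROVE = "auto_approve"          # clearly meets criteria
    FLAG_NOT_MET = "flag_not_met"          # clearly outside criteria, flagged for a person
    CLINICIAN_REVIEW = "clinician_review"  # gray zone

def route_case(meets_criteria: bool, confidence: float, threshold: float = 0.95) -> Route:
    """Route a case using a plan-chosen confidence threshold (illustrative default)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence < threshold:
        return Route.CLINICIAN_REVIEW
    return Route.AUTO_APPROVE if meets_criteria else Route.FLAG_NOT_MET

decisions = [
    route_case(True, 0.99),
    route_case(True, 0.80),
    route_case(False, 0.97),
]
print([d.value for d in decisions])
```

The threshold is a parameter, not a constant, because it is the plan's decision to make and to defend.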
5. Clinician-in-the-loop review with decision support
What it is: when a case does need a human, giving the reviewer the case, the evidence, the policy criteria, and the policy delta in a single decision pane.
Why plans skip it: because automation gets bolted onto the front of the same legacy reviewer interface. Every previous wave of automation pushed work toward the reviewer. Almost none of it made the reviewer’s actual screen better.
Why it now becomes tractable: the same model that drafted the rationale can pre-compose the case summary the reviewer reads. The reviewer’s job collapses from gathering and synthesizing to confirming, correcting, or overriding a structured proposal. AI here is not replacing the clinician. It is removing the parts of the work that were never clinical judgment in the first place.
What it costs to skip: per-case review time barely moves. Reviewer burnout climbs. The plans most aggressive about automation language sometimes have the worst clinician retention on their UM team. That is a tell.
6. Closed-loop feedback and continuous learning
What it is: every overturn, every peer-to-peer reversal, every denial that comes back through appeals, every provider abrasion signal, captured, structured, and used to improve the next decision.
Why plans skip it: because the data lives in different systems and no single owner is accountable for the loop.
Why it now becomes tractable: when each determination is produced by a model with a structured rationale, the corrections produced by reviewers, the peer-to-peer reversals, and the appeals overturns become labeled training data on the policies and case shapes where the system is consistently wrong. The corrective signal is no longer trapped in disconnected systems. It flows back into the model that made the call.
What it costs to skip: the same friction patterns repeat year over year. The plan never gets smarter at the things it is consistently wrong about, because the signal that it is wrong never travels back to the policy team or the model that made the decision.
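Even before any retraining, the loop can start as a simple aggregation: which policies see the system's call reversed most often? The sketch below assumes a hypothetical record shape (`policy_id`, `model_outcome`, `final_outcome`) and an invented `overturn_rates` helper; real data would have to be joined from the review, peer-to-peer, and appeals systems first.

```python
from collections import defaultdict

def overturn_rates(records, min_cases=2):
    """Rank policies by how often the final outcome differed from the model's call."""
    totals = defaultdict(int)
    overturned = defaultdict(int)
    for record in records:
        totals[record["policy_id"]] += 1
        if record["final_outcome"] != record["model_outcome"]:
            overturned[record["policy_id"]] += 1
    # Ignore policies with too few cases to say anything meaningful.
    rates = {
        policy: overturned[policy] / count
        for policy, count in totals.items()
        if count >= min_cases
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)

# Hypothetical determinations after reviewer corrections and appeals.
records = [
    {"policy_id": "pol-A", "model_outcome": "approve", "final_outcome": "approve"},
    {"policy_id": "pol-A", "model_outcome": "deny", "final_outcome": "approve"},
    {"policy_id": "pol-A", "model_outcome": "approve", "final_outcome": "approve"},
    {"policy_id": "pol-B", "model_outcome": "deny", "final_outcome": "approve"},
    {"policy_id": "pol-B", "model_outcome": "deny", "final_outcome": "approve"},
    {"policy_id": "pol-C", "model_outcome": "deny", "final_outcome": "approve"},
]
ranked = overturn_rates(records)
print(ranked)
```

A ranked list like this gives the loop an owner-friendly artifact: the policy team sees where to look first, and the same disagreeing records are the labeled examples the text describes.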
Six steps. Each one a discrete piece of work. Each one independently valuable. None of them, on its own, the thing a plan brags about in an annual report. Together, they are the difference between a plan that has electronic prior authorization and a plan that is actually automated.
What “automated end to end” looks like in practice
In a real, working sequence, the order placed in the EHR triggers a readiness check within ten seconds. The relevant clinical documents are retrieved and structured automatically. The policy reasoning layer applies the criteria and produces a determination with a confidence score and a written rationale that cites the evidence. If the confidence is high and the criteria are clearly met, the auth is approved and the response lands back in the EHR before the provider has finished closing the encounter note. If the confidence is below threshold, a structured packet (case, evidence, policy delta, suggested rationale) is on the reviewer’s screen, and the reviewer’s job is no longer to gather information but to apply judgment to the small subset of cases where judgment is the binding constraint.
The key design principle is not full automation. It is calibrated automation, with a confidence score on every output and a written rationale that holds up on appeal. Plans that try to push every case through the auto-adjudication layer will rediscover, painfully, why clinical judgment exists. Plans that pair confident auto-decisions with a tight clinician-in-the-loop layer get the compounding gains.
A 90-day diagnostic
A map, three measurements, then a sequencing decision. First, map the actual workflow. Walk a real prior auth from order to determination. Note every place a human re-keys data, every fax, every screen-scrape, every email. The map will be uncomfortable. That is the point.
Second, measure three numbers in your own operation: the share of prior auths that are auto-decided without a human touch, the median touch time per case for the ones that do go to a human, and the share of denials overturned on appeal. The first two tell you where you are in the sequence. The third tells you whether the decisions you are making, automated or not, are holding up.
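The three numbers are simple enough to compute from a case export. The sketch below assumes a hypothetical record shape with `touch_minutes`, `decision`, and `overturned_on_appeal` fields, and treats zero touch minutes as "auto-decided"; the field names and the `diagnostic` function are illustrative, not a standard.

```python
from statistics import median

def diagnostic(cases):
    """Compute the three diagnostic numbers from a list of case records."""
    total = len(cases)
    auto_decided = [c for c in cases if c["touch_minutes"] == 0]
    human_touch = [c["touch_minutes"] for c in cases if c["touch_minutes"] > 0]
    denials = [c for c in cases if c["decision"] == "denied"]
    overturned = [c for c in denials if c["overturned_on_appeal"]]
    return {
        "auto_decided_share": len(auto_decided) / total if total else 0.0,
        "median_touch_minutes": median(human_touch) if human_touch else 0.0,
        # Denominator is all denials, not only the denials that were appealed.
        "denial_overturn_share": len(overturned) / len(denials) if denials else 0.0,
    }

# Hypothetical case export.
cases = [
    {"touch_minutes": 0, "decision": "approved", "overturned_on_appeal": False},
    {"touch_minutes": 0, "decision": "approved", "overturned_on_appeal": False},
    {"touch_minutes": 12, "decision": "denied", "overturned_on_appeal": True},
    {"touch_minutes": 20, "decision": "denied", "overturned_on_appeal": False},
]
metrics = diagnostic(cases)
print(metrics)
```

Whichever denominators a plan chooses, the useful discipline is to fix them once and report the same three numbers every quarter.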
Third, sequence the work. Plans that try to fix the clinician-in-the-loop layer first, before fixing document retrieval, end up with a beautifully designed reviewer interface that still has nothing reliable to show the reviewer. The right order is roughly: readiness check, document retrieval, policy reasoning, then auto-adjudication, then the reviewer pane, with the feedback loop built across all of them. Skip the middle and the ends do not hold.
The next two years
Prior authorization is being automated, with or without any individual plan’s readiness. The CMS rule has set the floor. Provider organizations are tightening their own automation on the submission side, which means the plans that do not automate the middle will increasingly receive cleaner intake into a still-manual back office. That is the worst of all worlds: more volume, no leverage.
The plans that come out of this transition well will not be the ones with the most vendor logos on a slide. They will be the ones who treated the sequence as a sequence, who refused to skip the middle, and who used the new generation of language models and agents on the steps where rules engines never could. Healthcare automation has been a category waiting on a technology that could read. That technology now exists. The remaining question is how plans choose to use it.