For most of 2024 and 2025, the dominant frame for AI in healthcare was the copilot: AI suggested, a clinician acted. The framing was reassuring. It kept a human in every loop and let institutions defer the hard question of governance, because the AI never did anything a human hadn't already approved.
That framing is breaking down. At the APG Spring Meeting in San Diego this May, 4 organizations — an FQHC, a national VBC enablement company, a Southern California physician group, and UC San Diego Health — described deployments that have moved past copiloting into something more autonomous.
• • •
From in the loop to on the loop
Karandeep Singh, UC San Diego's Chief Health AI Officer, framed the shift as 3 generations of clinical AI, defined by where the human stands relative to the action:
- In the loop: the AI suggests, the clinician verifies, then acts.
- On the loop: the AI acts within defined bounds; the clinician reviews after the fact.
- Agentic: the AI acts, and the human sets guardrails and governs the system rather than each transaction.

As the human box shrinks, the AI's authority to act grows, until governance, not approval, is the human's job.
This isn't theoretical. UC San Diego's AI-led colonoscopy prep calls, launched in Dec. 2025, completed more than 4,000 calls by May. They produced an 84% net promoter score, with no-show rates down from 10-15% to roughly 6%.
Neighborhood Healthcare, an FQHC serving 110,000 patients on Medicaid margins, runs a homegrown portfolio. The work — chart review, lab-result messaging, mammography monitoring, malpractice-risk screening, a clinical-trial matching agent — wouldn't look out of place at an academic medical center.
On the agentic end, Lumeris' "Tom" triggers entire post-discharge workflows from an ADT notification, while an early Doctronic refill pilot reached 97% physician concordance on its renewal recommendations.
The point for executives isn't that AI can automate a single task. It's that end-to-end clinical and operational processes are already being delegated to agents under real conditions, and that the lift is now within reach of organizations that could never afford the EMR-era version of digital transformation.
The iceberg problem
Those sanctioned deployments are the visible tip. Singh's argument, published in NEJM AI, is that governance frameworks built for discrete, procured tools break for general-purpose generative AI, where the harm profile depends entirely on how an individual clinician uses it. Two-thirds of physicians now use AI daily — most of it on personal accounts, with copied text and unclear data retention. Banning these tools without an enterprise alternative doesn't reduce risk; it drives use underground.

The riskiest AI in your organization isn't the well-governed tool you bought. It's the unsupervised one your clinicians already use.
The era of AI as a tool clinicians occasionally pick up is ending. The era of AI as a colleague clinicians govern is beginning.
What the leaders do differently
The organizations ahead on this share 3 things:
1. Real testing infrastructure. UC San Diego ran 500-plus test calls, a failure-mode analysis, and nurse-led red-teaming before go-live. It's the rigor most institutions reserve for new equipment or drugs.
2. Enterprise tools that pull usage above the waterline. Neighborhood's NH-GPT isn't trying to beat consumer models. It's aiming to be good enough that clinicians choose the sanctioned, auditable option.
3. Teams, not pilots. Successful organizations are building permanent departments that pair engineers and data scientists with clinical, risk, and ethics leaders, rather than running projects that end.
Most regional systems, physician groups, and FQHCs have policies and committees, and maybe a vendor or two with AI in the EHR. They don't yet have the operating model to deploy and govern agentic AI at the speed it's arriving.
That gap is the central question for the next 2 years. Healthcare's last big transition, the EMR rollout, was poorly governed in ways the field is still recovering from. This time, the clinicians who lived through it can recognize the warning signs — thin testing, weak change management, training as an afterthought — and apply the lessons before the system is built, not after.
Lauren Patrick is President of Healthmonix, a healthcare analytics company focused on MIPS quality performance reporting, value-based care programs, and CMS compliance.
