Weekly Evidence Roundup · April 14, 2026
ChatGPT Agreed with Pharmacists 27.7% of the Time. That's Not the Whole Story.
What They Found
A Dutch research team at OLVG Hospital put ChatGPT-4 head-to-head against a geriatric internist and hospital pharmacist in 51 medication reviews for older adults averaging more than ten medications each. The results were humbling — and complicated.
ChatGPT matched the clinicians’ interventions only 27.7% of the time. Healthcare professionals proposed 183 interventions; ChatGPT proposed 202. But the overlap was thin. Among ChatGPT’s suggestions, 84 were outright incorrect — interventions that an expert panel flagged as inappropriate for the patient’s clinical context, a 33.7% error rate across the interventions assessed.
But here’s the number that keeps this study from becoming a simple cautionary tale: ChatGPT also identified 19 clinically valid interventions — 7.6% of the total — that the human reviewers had missed entirely. These weren’t trivial catches. They were drug-related problems in complex polypharmacy regimens that two experienced clinicians overlooked.
The model was strongest at structuring patient data: 86.4% of diagnoses were correctly linked to medications. But it struggled with the harder cognitive task — contextualizing laboratory targets to the specific characteristics of each geriatric patient (only 46.7% correct).
Why It Matters Now
Health systems are under enormous pressure to scale medication review for an aging population. The average 75-year-old takes seven medications; those in the study averaged over ten. There aren’t enough clinical pharmacists to review every regimen at the depth it deserves.
The temptation is to hand the task to an LLM. This study says: not yet — but also, not never. The 27.7% agreement rate rules out autonomous AI medication review. But the 19 missed-by-humans findings suggest a complementary role that no one should ignore. The question isn’t “can AI replace the pharmacist?” — it’s “can the pharmacist afford to work without a second set of (imperfect) eyes?”
That distinction matters for every CMIO writing a deployment plan this quarter.
What CarePathIQ Is Building in Response
CarePathIQ’s pathway architecture is designed precisely for this middle ground — not autonomous AI decision-making, but structured clinical reasoning that surfaces what a single clinician might miss while keeping the human in the loop. Every CarePathIQ pathway includes branching logic, embedded calculators, and disposition closure points because the evidence keeps showing the same thing: AI augments clinical thinking best when the scaffold constrains the chaos.
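To make the scaffold concrete, here is a minimal sketch of the three elements named above: branching logic, an embedded calculator, and a disposition closure point that keeps the human in the loop. All names and structures here are hypothetical illustrations, not CarePathIQ's actual API; the Cockcroft-Gault formula stands in for whichever calculators a real pathway embeds.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch -- not CarePathIQ's actual API. Shows how a pathway
# node can route on an embedded calculator while a disposition object
# refuses to close without clinician sign-off.

def creatinine_clearance(age: int, weight_kg: float,
                         scr_mg_dl: float, female: bool) -> float:
    """Embedded calculator: Cockcroft-Gault estimate in mL/min."""
    crcl = ((140 - age) * weight_kg) / (72 * scr_mg_dl)
    return crcl * 0.85 if female else crcl

def renal_branch(patient: dict) -> str:
    """Branching logic: route to a dose-adjustment review when CrCl < 30."""
    crcl = creatinine_clearance(patient["age"], patient["weight_kg"],
                                patient["scr_mg_dl"], patient["female"])
    return "dose_adjust_review" if crcl < 30 else "standard_review"

@dataclass
class PathwayNode:
    name: str
    branch: Callable[[dict], str]  # patient data -> next node name

@dataclass
class Disposition:
    """Closure point: an AI suggestion stays a draft until a clinician signs."""
    ai_suggestion: str
    clinician_signed: bool = False

    def close(self) -> str:
        if not self.clinician_signed:
            raise ValueError("pathway cannot close without clinician sign-off")
        return self.ai_suggestion

node = PathwayNode("renal_check", renal_branch)
patient = {"age": 82, "weight_kg": 60.0, "scr_mg_dl": 1.8, "female": True}
print(node.branch(patient))  # -> dose_adjust_review (CrCl ~= 22.8 mL/min)
```

The design point is the last class: the AI can propose, but the pathway cannot reach a terminal state without an explicit human decision, which is the complementary role the study's 19 missed-by-humans findings argue for.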
We’re building the tools so the next generation of medication review isn’t AI or pharmacist. It’s both — with guardrails.
The AI models are advancing fast. Are your systems ready to use them safely?
Try the CarePathIQ AI Agent → carepathiq.org/ai-studio
Source: Ten Hoope SMK et al. “Conducting Medication Reviews: A Comparative Study Between ChatGPT-4 and Healthcare Professionals.” J Am Geriatr Soc 2026. DOI: 10.1111/jgs.70415
