HeadlinesBriefing.com

OpenAI o1 beats ER doctors in diagnosis accuracy

Hacker News

A Harvard investigation published in *Science* found that OpenAI’s reasoning model o1 identified the correct or near‑correct diagnosis in 67% of 76 emergency‑room cases, eclipsing the 50–55% accuracy achieved by two triage physicians reviewing the same electronic records. The trial paired each AI assessment with a human duo, forcing a head‑to‑head comparison under identical data constraints.

When richer notes were supplied, o1’s accuracy climbed to 82%, narrowing the gap with experts who reached 70–79%, a difference that was not statistically significant. In a separate task, the model drafted long‑term care plans for five case studies, scoring 89% versus 34% for physicians relying on conventional search tools.

About 20% of U.S. physicians already tap AI for diagnostic aid, and a similar share in the U.K. uses it weekly, according to recent surveys. Researchers caution that the study excluded visual cues and patient demeanor, limiting claims of full clinical replacement. For now, the technology serves as a powerful second‑opinion assistant in emergency triage.

Hospital administrators see the results as justification to integrate LLMs into electronic‑health‑record workflows, hoping to reduce misdiagnosis costs that run into billions of dollars annually. Yet clinicians warn that without clear liability frameworks, reliance on algorithmic suggestions could expose providers to legal risk. The study therefore underscores the need for rigorous validation before wide‑scale deployment.