HeadlinesBriefing.com

AI Fails to Predict Study Replication, DARPA SCORE Project Finds

New York Times Top Stories

A major DARPA-funded project called SCORE found that artificial intelligence cannot reliably predict whether scientific studies will hold up under replication. Led by researchers including Brian Nosek of the Center for Open Science, the team aimed to build a "credit score for science" but concluded that AI accuracy remains insufficient. Only about half of the replicated studies matched their original results, underscoring the replication crisis.

The SCORE initiative involved 865 researchers who analyzed 3,900 papers from social sciences such as psychology and economics. They replicated 164 studies, with original psychology results matching just 39 percent of the time. When multiple teams analyzed the same data using different methods, consistent results emerged only 57 percent of the time, and exact matches occurred one-third of the time. Problems with data and code sharing further reduced reproducibility.

These findings reveal deep-seated problems in scientific validation, where journals