HeadlinesBriefing.com

UK Biobank Data Leaks to GitHub - 110 Takedown Notices Filed

Hacker News

UK Biobank, which holds genetic, health, and lifestyle data on 500,000 British volunteers, has filed 110 DMCA takedown notices with GitHub since July 2025. Researchers with access to the sensitive data keep accidentally uploading it to public repositories, exposing participant information worldwide. A tracker built by Luc Rocher at the Oxford Internet Institute identifies 197 repositories targeted so far, involving 170 developers across at least 14 countries.

The notices target developers primarily in the United States (24) and China (21), with others in the UK, Germany, Australia, and Spain. The Guardian demonstrated how easily participants can be re-identified using only an approximate date of birth and the date of a single major surgery. Nearly half the exposed files are Jupyter or R notebooks containing data rows, while a quarter are genetic and genomic data files, including PLINK and BGEN formats.

The takedown requests stopped without explanation between January and March 2026, then resumed after the Guardian's investigation revealed the ongoing exposure. Researchers writing for the BMJ argue that UK Biobank harms participants by downplaying re-identification risks while advising volunteers to limit their online presence. In their view, the institution must demonstrate greater humility and a commitment to listening to privacy experts.