HeadlinesBriefing.com

Anthropic's Mythos Security Claims Don't Add Up

Hacker News

Anthropic released a 244-page system card for Claude Mythos Preview, but critics argue the document fails to substantiate its safety claims. Only 7 of those pages address the model's supposed danger, and they never use the term "fuzzer", as if a Hawaii brochure omitted beaches. The word "thousands", used in the claim about zero-day vulnerabilities, appears only once in the document, where it refers to transcripts reviewed rather than to vulnerabilities. The disconnect between press releases and technical evidence is stark.

The flagship Firefox demonstration collapses under scrutiny. The testing was not run on Firefox itself; it was run on a SpiderMonkey JavaScript engine shell stripped of sandbox protections. The vulnerabilities were not discovered by Mythos either: they came from Claude Opus 4.6, and Firefox had already patched them in version 148 before the evaluation. With just 250 trials (AFL generates that many mutations in milliseconds), the claimed 72.4% full code execution rate rests on a small sample and looks generous.

When the two most exploitable bugs are removed, Mythos's success rate drops from 72.4% to 4.4%, which is indistinguishable from Sonnet 4.6. The system card admits that Sonnet has the same triage ability but cannot close the exploitation step. The "$100 million defensive initiative" amounts to $4 million in funding plus product credits. The security story is marketing with minimal evidence.
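To put the sample size in perspective, the two reported rates can be converted back into trial counts with a rough uncertainty range. The sketch below is illustrative only: it assumes both percentages were measured over the same 250 trials (the brief states the trial count only for the headline figure), and the Wilson score interval is a standard statistical choice made here, not something taken from the system card. The function and variable names are hypothetical.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return center - half, center + half

TRIALS = 250  # trial count quoted in the brief

# Assumption: both rates use the same 250-trial denominator.
full_successes = round(0.724 * TRIALS)     # headline rate: 72.4%
reduced_successes = round(0.044 * TRIALS)  # rate without the two easiest bugs: 4.4%

full_lo, full_hi = wilson_interval(full_successes, TRIALS)
reduced_lo, reduced_hi = wilson_interval(reduced_successes, TRIALS)

if __name__ == "__main__":
    print(f"72.4% of {TRIALS} trials = {full_successes} successes, "
          f"interval {full_lo:.3f} to {full_hi:.3f}")
    print(f" 4.4% of {TRIALS} trials = {reduced_successes} successes, "
          f"interval {reduced_lo:.3f} to {reduced_hi:.3f}")
```

Under that assumption, the headline figure corresponds to 181 successful trials and the reduced figure to 11, so the gap between the two numbers comes down to roughly 170 trials that depend on the two most exploitable bugs.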

The threat narrative appears to be all headline and no substance, leaving trust in Anthropic's claims significantly damaged.