HeadlinesBriefing.com

Testing Anthropic's Mythos for Vulnerability Research

Hacker News

Cloudflare tested Anthropic's Mythos Preview on more than 50 of its own repositories as part of Project Glasswing, and the results show a genuine leap in AI-driven security testing. Unlike general-purpose frontier models, which identify individual bugs and then stop, Mythos can chain multiple low-severity vulnerabilities into working exploits, combining primitives the way a human researcher would.

The model also generates actual proof-of-concept code, compiling and running it to confirm exploitability rather than leaving findings as speculation. When other frontier models were tested on the same code, they identified similar bugs but failed to stitch them together into actionable proofs. A finding with a working PoC is far easier to act on than a hedged "possibly vulnerable."

However, the model's organic refusals proved inconsistent: the same task framed differently produced opposite outcomes, and the model sometimes refused legitimate security research after it had already found and confirmed serious memory bugs. Code in memory-unsafe languages like C and C++ generated substantially more false positives than Rust code, and the model's tendency to hedge findings with "potentially" and "could in theory" creates triage overhead that compounds across thousands of results.
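To make the triage problem concrete, here is a minimal sketch of how a team might order findings so that confirmed proofs of concept surface first and heavily hedged reports sink. It is not Cloudflare's or Anthropic's tooling: the `Finding` type, the `poc_confirmed` flag, and the hedge list beyond the two quoted phrases are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# "potentially" and "could in theory" are quoted in the summary above;
# the remaining phrases are illustrative assumptions.
HEDGES = ("potentially", "could in theory", "possibly", "might")


@dataclass
class Finding:
    """Hypothetical record for one reported issue."""
    title: str
    description: str
    poc_confirmed: bool  # True if a proof of concept was compiled and run


def hedge_count(text: str) -> int:
    """Count hedge phrases in the text, case-insensitively."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in HEDGES
    )


def triage_order(findings):
    """Sort confirmed PoCs first, then by ascending hedge count."""
    return sorted(
        findings,
        key=lambda f: (not f.poc_confirmed, hedge_count(f.description)),
    )


findings = [
    Finding("A", "This could in theory overflow and is potentially exploitable.", False),
    Finding("B", "Heap overflow confirmed by running the PoC.", True),
    Finding("C", "Possibly vulnerable.", False),
]
ordered_titles = [f.title for f in triage_order(findings)]
```

With the sample data, the confirmed finding sorts ahead of both hedged ones. A phrase count is a crude proxy for confidence, so a sort like this only prioritizes review; it does not decide what is real.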

Generic coding agents fail at vulnerability research because they're built for focused, sequential work. That is exactly the wrong shape for security testing, which requires narrow, parallel investigation across multiple hypotheses.
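The "narrow, parallel" shape can be sketched as a fan-out over independent hypotheses. This is only an illustration under stated assumptions: `investigate` is a hypothetical placeholder that stands in for whatever model or tool call a real harness would make, and its keyword check is not a real detection method.

```python
from concurrent.futures import ThreadPoolExecutor


def investigate(hypothesis: str) -> dict:
    """Hypothetical stand-in for one narrow investigation.

    A real harness would call a model or analysis tool here; this stub
    flags a hypothesis only so the example has something to return.
    """
    return {"hypothesis": hypothesis, "confirmed": "overflow" in hypothesis}


def fan_out(hypotheses, max_workers=4):
    """Run each hypothesis as its own task; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(investigate, hypotheses))


results = fan_out([
    "integer overflow in length parser",
    "use-after-free in connection cache",
])
```

Each task sees one hypothesis and nothing else, which is the opposite of a single agent working through one long sequential session.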