HeadlinesBriefing.com

DeepMind Enhances AI Safety Framework

Google DeepMind Blog

Google DeepMind has released the third iteration of its Frontier Safety Framework, introducing a Critical Capability Level for harmful manipulation. This new designation addresses AI models capable of systematically altering beliefs and behaviors in high-stakes contexts, with the potential to cause severe harm at scale.

The Framework also expands its treatment of misalignment risks, covering scenarios in which an AI model might interfere with operators' ability to direct or shut it down. DeepMind has additionally introduced Tracked Capability Levels to identify emerging risks earlier. Together, these updates strengthen governance protocols for both external launches and internal deployments once specific capability thresholds are reached.

The changes reflect DeepMind's evidence-based approach to managing AI risk as capabilities advance. By defining concrete capability levels with corresponding governance measures, the company aims to ensure that transformative AI benefits humanity while minimizing potential harms through increasingly rigorous safety evaluations.