HeadlinesBriefing.com

Gemma 3n pushes mobile AI with elastic MatFormer depth

Google DeepMind Blog
Google DeepMind ships Gemma 3n after the thriving Gemmaverse topped 160 million downloads, proving demand for lean, capable models. Built for developers who stretched prior releases across robotics, vision and medical workloads, this mobile-first line brings frontier-grade multimodal understanding to devices without cloud calls, pairing native text, image, audio and video inputs with fast on-device tuning through familiar runtimes.

MatFormer drives elastic sizing by nesting a 2B-effective-parameter (E2B) sub-model inside the 4B (E4B) core, letting teams pick ready-made weights or slice custom sizes via Mix-n-Match. Per-Layer Embeddings keep most embedding weights in CPU memory so accelerators hold only the transformer core, KV Cache Sharing doubles prefill speed, and USM-based audio tokens enable speech-to-text and translation for long-tail languages within tight memory budgets.
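To make the Mix-n-Match idea concrete, here is a minimal toy sketch of how a nested (matryoshka-style) feed-forward layer can be sliced to a smaller width. The dimensions, weights, and `slice_ffn` helper are illustrative assumptions, not Gemma 3n's actual shapes or API; the point is only that keeping the leading hidden units of a MatFormer-trained layer yields a valid smaller sub-model.

```python
import numpy as np

def slice_ffn(w_in, w_out, keep):
    """Extract a nested sub-FFN by keeping the first `keep` hidden units.

    MatFormer-style training nests smaller FFNs inside larger ones, so
    slicing the leading units produces a coherent smaller model
    (hypothetical helper, for illustration only).
    """
    return w_in[:, :keep], w_out[:keep, :]

# Toy weights: model dim 8, full FFN hidden dim 16 (stand-in for the E4B core).
rng = np.random.default_rng(0)
w_in = rng.standard_normal((8, 16))
w_out = rng.standard_normal((16, 8))

# Mix-n-Match: pick a custom width between the nested sizes.
sub_in, sub_out = slice_ffn(w_in, w_out, keep=8)

def ffn(x, wi, wo):
    h = np.maximum(x @ wi, 0.0)  # ReLU stand-in for the real activation
    return h @ wo

x = rng.standard_normal(8)
full = ffn(x, w_in, w_out)     # full-width forward pass
small = ffn(x, sub_in, sub_out)  # sliced sub-model forward pass
print(sub_in.shape, sub_out.shape)  # (8, 8) (8, 8)
```

Because the sub-model reuses a contiguous slice of the parent's weights, no retraining or separate checkpoint is needed to trade quality for memory and latency.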
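The Per-Layer Embeddings trick can likewise be sketched in a few lines. This is a simplified assumption of the mechanism: the large embedding table lives in ordinary host RAM, and only the rows actually looked up are handed to the accelerator, so the table never consumes accelerator memory. The sizes and function name here are invented for illustration.

```python
import numpy as np

# Large embedding table stays in host (CPU) memory.
VOCAB, DIM = 1000, 64
cpu_embeddings = np.random.default_rng(1).standard_normal((VOCAB, DIM)).astype(np.float32)

def embed_on_cpu(token_ids):
    # Only the looked-up rows would be transferred to the accelerator;
    # the transformer core alone occupies accelerator memory.
    return cpu_embeddings[np.asarray(token_ids)]

tokens = [5, 42, 7]
activations = embed_on_cpu(tokens)
print(activations.shape)  # (3, 64)
```

The memory saving scales with vocabulary size: a multilingual vocabulary's embedding table can dwarf the transformer core, which is exactly the weight mass this scheme keeps off the accelerator.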

E4B tops 1300 on LMArena, the first sub-10B model to reach that score, with text support for 140 languages and multimodal understanding across 35 baked in. Developers can pull distilled checkpoints today and deploy vision, voice and reasoning pipelines that respect device constraints without surrendering quality.