HeadlinesBriefing.com

Google Unveils Efficient Gemini 3.5 Flash

Ars Technica

Google has announced Gemini 3.5 Flash, positioning the model as a more efficient alternative that maintains frontier-level intelligence. The new model generates nearly 300 tokens per second while matching the benchmark scores of larger models that produce output at a quarter of that speed. Google claims this efficiency makes complex agentic tasks viable at scale, and cites improvements in code generation and tool-use performance based on user feedback.

The tech giant also introduced Gemini Spark, which runs 3.5 Flash as a dedicated 24/7 agent in Google's cloud. Spark handles multiple parallel workflows across a user's Google footprint, processing emails, meetings, and files autonomously. Google launched a new $100-per-month Ultra tier for access, while maintaining the existing $200 tier for higher token limits.

Google further announced Gemini Omni Flash, which replaces its Veo video model and is designed as a multimodal system that accepts any input data and produces various outputs. Omni starts with video generation, and Google plans to expand its capabilities to include images, text, and audio as part of its push toward practical AI applications that balance performance with efficiency.