HeadlinesBriefing.com

Gemini 2.5 Model Updates: Faster, Cheaper AI Reasoning Tools

Google DeepMind Blog

Google DeepMind announced that Gemini 2.5 Pro and Gemini 2.5 Flash are now stable and generally available, and introduced a new cost-efficient variant, Gemini 2.5 Flash-Lite. Flash is now priced at $0.30 per 1 million input tokens (up from $0.15) and $2.50 per 1 million output tokens (down from $3.50). Flash pricing is also now the same whether or not thinking is enabled, which removes the earlier split between thinking and non-thinking rates.
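
As a rough illustration of how per-million-token pricing turns into a bill, here is a minimal sketch. The function name `estimate_cost_usd` is hypothetical, the $0.30 input price comes from the article, and the output price is an assumed value that should be checked against the current published rate.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_m, output_price_per_m):
    """Estimate cost in USD given token counts and per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


FLASH_INPUT_PRICE = 0.30      # USD per 1M input tokens, as reported above
ASSUMED_OUTPUT_PRICE = 2.50   # USD per 1M output tokens; an assumption, verify before use

if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500k output tokens.
    cost = estimate_cost_usd(2_000_000, 500_000,
                             FLASH_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)
    print(f"Estimated cost: ${cost:.2f}")
```

Because the price no longer depends on whether thinking is enabled, a single pair of rates is enough for this estimate.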

Gemini 2.5 Flash-Lite enters preview as a model aimed at high-throughput tasks such as classification. With lower latency and higher tokens-per-second decode rates, it targets cost-sensitive applications while retaining access to tools like Google Search grounding. Its reasoning runs only when developers enable a thinking budget through API parameters.
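
To make the thinking-budget idea concrete, the sketch below builds a JSON request body with an explicit budget. The helper `build_request_body` is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) reflect my understanding of the Gemini REST API rather than anything stated in the article, so check them against the official reference before relying on them.

```python
import json


def build_request_body(prompt, thinking_budget=0):
    """Build a request body dict; a budget of 0 is assumed to mean thinking stays off."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Assumed field names; verify against the current API reference.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Classification-style call with thinking left off.
    print(json.dumps(build_request_body("Label this review: 'Great phone.'"), indent=2))
    # Same call with a thinking budget of 512 tokens enabled.
    print(json.dumps(build_request_body("Plan a refactor.", thinking_budget=512), indent=2))
```

Keeping the budget an explicit argument makes the cost and latency trade-off visible at each call site.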

Gemini 2.5 Pro continues to see strong adoption in developer tools, and its pricing is unchanged from the latest preview. The stable release arrives amid rising demand for complex coding and agentic workflows. Google plans to phase out the preview versions by mid-2025 and urges developers to move to the stable endpoints.

The pricing changes position Flash as the balance of performance and value and Flash-Lite as the budget alternative. Developers using preview models must migrate before the deprecation dates: July 15 for Flash and June 19 for Pro. Together, the updates reflect Google’s focus on scalable, affordable AI reasoning across a range of use cases.