HeadlinesBriefing.com

Google Releases Gemini 2.5 Flash with Controllable Reasoning

Google DeepMind Blog

Google has launched Gemini 2.5 Flash in preview, expanding on its 2.0 Flash foundation with a hybrid reasoning model that lets developers toggle thinking on or off. The new version delivers sharper reasoning while retaining the speed and low cost that made 2.0 Flash popular.

The model introduces a thinking_budget parameter, capped at 24,576 tokens, that gives developers fine‑grained control over how much the model can deliberate before answering. The budget is an upper limit: developers can set it anywhere from zero, which turns thinking off, to the maximum, and within that limit the model scales its reasoning depth to the complexity of the task, which keeps costs predictable.
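As a rough illustration of how a budget might be prepared before a request is sent, the sketch below clamps a requested value into the 0 to 24,576 range described above and wraps it in a request fragment. The helper names (clamp_thinking_budget, build_generation_config) are hypothetical, and the nested key names are an assumption modelled on the Gemini REST API rather than something the article specifies; check the official API reference before relying on them.

```python
# Minimal sketch: clamp a thinking budget and place it in a request fragment.
# Helper names are hypothetical; the nested key names are an assumption
# modelled on the Gemini REST API, not confirmed by the article.

MAX_THINKING_BUDGET = 24_576  # cap stated in the article


def clamp_thinking_budget(requested: int) -> int:
    """Clamp a requested budget into the range 0..MAX_THINKING_BUDGET."""
    return max(0, min(requested, MAX_THINKING_BUDGET))


def build_generation_config(requested_budget: int) -> dict:
    """Return a request fragment carrying the clamped thinking budget.

    A budget of 0 corresponds to thinking switched off.
    """
    budget = clamp_thinking_budget(requested_budget)
    return {"generationConfig": {"thinkingConfig": {"thinkingBudget": budget}}}


if __name__ == "__main__":
    # Thinking off, a modest budget, and an over-limit request that gets clamped.
    for requested in (0, 1024, 100_000):
        print(requested, build_generation_config(requested))
```

Clamping on the client side is only a convenience in this sketch; how the service itself treats out-of-range values is not covered by the article.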

Benchmarks show Gemini 2.5 Flash ranks second only to 2.5 Pro on LMArena Hard Prompts, matching leading models’ performance at a fraction of the size and price. With this pricing edge, the model occupies a prominent spot on Google’s Pareto frontier of cost versus quality.

By making reasoning configurable, Google lets teams trade off latency, accuracy, and cost to suit each workload, so real‑time applications can add deliberation where it helps without giving up speed where it does not.