Gemini 3 Flash rolls out globally in Google Search, Gemini app and APIs

Google has expanded the Gemini 3 model family with the launch of Gemini 3 Flash, a new AI model designed to combine advanced reasoning with low latency and cost efficiency. The release follows the introduction of Gemini 3 Pro and Gemini 3 Deep Think and extends Gemini 3 capabilities across Search, consumer applications, developer tools, and enterprise services.

Gemini 3 Flash: Core capabilities

Gemini 3 Flash is built on the same reasoning foundation as Gemini 3 Pro and supports complex reasoning, multimodal understanding, vision tasks, tool use, and agentic workflows. The model dynamically adjusts its reasoning depth based on task complexity, enabling faster responses for simpler queries and deeper processing for advanced use cases.

It supports inputs across text, images, audio, and video, and adds advanced visual and spatial reasoning. The model also enables code execution for operations such as zooming, counting, and editing visual inputs.

Benchmarks and efficiency

Gemini 3 Flash delivers frontier-level performance across multiple reasoning and multimodal benchmarks:

  • GPQA Diamond: 90.4%
  • Humanity’s Last Exam: 33.7% (without tools)
  • MMMU Pro: 81.2%

Compared to Gemini 2.5 Pro, Gemini 3 Flash uses around 30% fewer tokens on average across typical workloads. In Artificial Analysis benchmarking, it delivers up to three times faster inference than Gemini 2.5 Pro while operating at a lower cost. Even at lower reasoning settings, the model often outperforms earlier versions running at higher reasoning settings.

Pricing and cost controls

Gemini 3 Flash pricing is set at:

  • $0.50 per 1 million input tokens
  • $3 per 1 million output tokens
  • $1 per 1 million audio input tokens

The model supports context caching, which can reduce costs by up to 90% in workloads with repeated token usage. It is also available through the Batch API, enabling up to 50% cost savings and higher rate limits for asynchronous processing. Paid API customers receive production-ready rate limits for synchronous and near real-time use cases.
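Putting the published rates and discounts together, a rough per-request cost can be worked out directly. The sketch below is purely illustrative and not part of any official SDK; the rates come from the pricing list above, and the cache and batch figures use the best-case "up to" discounts from the article.

```python
# Hypothetical cost estimator based on the published Gemini 3 Flash pricing.
# The rate constants come from the article; the helper itself is illustrative.

INPUT_PER_M = 0.50        # USD per 1M text/image/video input tokens
OUTPUT_PER_M = 3.00       # USD per 1M output tokens
AUDIO_INPUT_PER_M = 1.00  # USD per 1M audio input tokens

CACHE_DISCOUNT = 0.90  # context caching: up to 90% off cached input tokens
BATCH_DISCOUNT = 0.50  # Batch API: up to 50% off the whole request

def estimate_cost(input_tokens, output_tokens, audio_tokens=0,
                  cached_input_tokens=0, batch=False):
    """Rough per-request cost in USD, assuming best-case discounts apply."""
    fresh_input = input_tokens - cached_input_tokens
    cost = (fresh_input * INPUT_PER_M
            + cached_input_tokens * INPUT_PER_M * (1 - CACHE_DISCOUNT)
            + audio_tokens * AUDIO_INPUT_PER_M
            + output_tokens * OUTPUT_PER_M) / 1_000_000
    if batch:
        cost *= (1 - BATCH_DISCOUNT)
    return cost

# 100k input tokens (80k of them cached) plus 5k output, sent synchronously:
print(round(estimate_cost(100_000, 5_000, cached_input_tokens=80_000), 6))
```

For example, caching 80k of a 100k-token prompt cuts the input portion of that request from $0.05 to $0.014, which is where repeated-context workloads see most of their savings.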

Developer performance and use cases

For developers, Gemini 3 Flash is optimized for iterative, high-frequency workflows. It scores 78% on SWE-bench Verified, outperforming both the Gemini 2.5 series and Gemini 3 Pro while maintaining faster response times.

The model supports use cases including agentic coding, video analysis, visual question answering, data extraction, document analysis, and near real-time reasoning. It is being used in areas such as game development, deepfake detection, and legal document analysis, where both speed and accuracy are required.

For enterprises, the model supports production-scale deployments that require fast inference, consistent reasoning, and cost control.

Gemini 3 Flash in Search AI Mode

Gemini 3 Flash is rolling out globally as the default model for AI Mode in Google Search. With this update, AI Mode can better interpret nuanced queries, consider multiple constraints, and return structured responses while maintaining search-level speed.

AI Mode continues to provide real-time information and links from across the web, supporting research, planning, comparisons, and learning tasks.

Expanded Pro model access in Search (U.S.)

Google is expanding access to Gemini 3 Pro in Search for users in the U.S. By selecting “Thinking with 3 Pro” in AI Mode, users can access deeper reasoning, interactive visual layouts, simulations, and AI creation tools.

Access to Nano Banana Pro (Gemini 3 Pro Image) is also expanding in the U.S., enabling image generation and editing within Search. Higher usage limits apply to Google AI Pro and Ultra subscribers.

Gemini 3 Flash in the Gemini app

Gemini 3 Flash is rolling out globally in the Gemini app as the default model. The app provides a “Fast” mode for quick responses and a “Thinking” mode for more complex problem-solving. Gemini 3 Pro remains available in the model picker for advanced math and coding tasks.

Availability
  • Search: Rolling out globally as the default AI Mode model in Google Search
  • Gemini app: Rolling out globally as the default model
  • Developers: Available via Gemini API in Google AI Studio, Gemini CLI, Google Antigravity, and Android Studio
  • Enterprise: Available through Vertex AI and Gemini Enterprise
  • U.S.-only (for now): Gemini 3 Pro and Nano Banana Pro access in Search, with higher usage limits for Google AI Pro and Ultra subscribers
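For developers calling the model through the Gemini API, a single-turn text request follows the standard `generateContent` REST shape. The sketch below only constructs the request rather than sending it; the model identifier `gemini-3-flash` is an assumption here and should be checked against the model list in Google AI Studio.

```python
# Minimal sketch of a Gemini API text request over REST.
# The endpoint shape follows the public generativelanguage.googleapis.com API;
# the model id below is an assumed placeholder, and the request is only
# constructed here, not sent.
import json

MODEL = "gemini-3-flash"  # assumed identifier; verify in Google AI Studio
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

def build_request(prompt: str) -> dict:
    """Return the JSON body for a single-turn text prompt."""
    return {"contents": [{"parts": [{"text": prompt}]}]}

body = build_request("Summarize this release in one sentence.")
print(json.dumps(body))
```

The same request body works unchanged through the Batch API for asynchronous workloads, where the article's 50% cost savings and higher rate limits apply.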

