Gemini 2.5 Flash

Name: Gemini 2.5 Flash
Author: Google

Google · Geminireleased 2025-06-17proprietaryupdated 24 days ago

Google's cost-and-latency-optimized Gemini 2.5 variant. Generally available June 2025 — designed for high-throughput agentic workflows with sub-second first-token latency.

Specifications

Context window: 1,000,000 tokens
Modality: text, vision, audio
License: proprietary
Family: Gemini
Release date: 2025-06-17

Provider status

Google APIAll Gemini / Vertex AI services operational

Last incident: Vertex AI Gemini API customers experienced increased error rates when accessing the global endpoint.4 months ago

30-day uptime100.00%

View Google's status page →

Timeline

Released · 2025-06-17
Initial public availability.
Pricing changes, lineage updates, and new benchmark results appear here as they happen. See the releases feed for the latest vendor activity.

API pricing

Tier	Input / 1M tokens	Output / 1M tokens	Cache read	As of
standard	$0.30	$2.50	—	2025-06-17 · source ↗

Looking for consumer subscriptions? See Google's plans →

Benchmarks

Model Index →

No benchmark scores recorded yet for Gemini 2.5 Flash. We surface official vendor + Artificial Analysis numbers as soon as a model ships them.

Infrastructure context

All intelligence →

Compute, silicon, and capex events that shape Gemini 2.5 Flash's economics.

PowerFeb 4, 2026
Google Stone Mountain GA campus expanding to 2GW with on-site solar+gas hybrid
Google's Stone Mountain, Georgia data center cluster — built around Gemini training — disclosed a 2GW expansion plan in early 2026 backed by a 1.6GW solar+gas hybrid generation deal with Southern Company. Site will host the second-largest TPU v7 deployment after Council Bluffs, IA.
Power: 2 GW
Location: Stone Mountain, GA
Southern Company Q4 2025 earnings ↗
SiliconApr 9, 2026
Google TPU v7 "Ironwood": 9,216-chip pods, 42.5 exaflops, dedicated to inference
Announced at Google Cloud Next April 2026, TPU v7 (Ironwood) is the first Google TPU generation purpose-built for inference rather than training. Each pod scales to 9,216 chips delivering 42.5 exaflops of FP8 compute, with 192GB HBM3e per chip. Powers Gemini 3 Pro / 3.5 inference at Google scale. Annual TPU spend ramped to over $40B for FY26.
Accelerators: TPU v7 Ironwood
Google Cloud Next 2026 keynote ↗

Compare Gemini 2.5 Flash with…

Gemini 3.5Google Gemini OmniGoogle Claude 4.8 OpusAnthropic + Pick others →

Related news

No tagged articles yet. The aggregator surfaces mentions every 15 minutes.