gpt.buzz
Sign in

Models/Google

Google logoGemini 2.5 Flash

Google · Geminireleased 2025-06-17proprietaryupdated 24 days ago

Google's cost-and-latency-optimized Gemini 2.5 variant. Generally available June 2025 — designed for high-throughput agentic workflows with sub-second first-token latency.

Specifications

Context window
1,000,000 tokens
Modality
text, vision, audio
License
proprietary
Family
Gemini
Release date
2025-06-17

Provider status

Google APIAll Gemini / Vertex AI services operational
Last incident: Vertex AI Gemini API customers experienced increased error rates when accessing the global endpoint.4 months ago
30-day uptime100.00%
View Google's status page →

Timeline

  1. Released

    Initial public availability.

  2. Pricing changes, lineage updates, and new benchmark results appear here as they happen. See the releases feed for the latest vendor activity.

API pricing

TierInput / 1M tokensOutput / 1M tokensCache readAs of
standard$0.30$2.502025-06-17 · source ↗

Looking for consumer subscriptions? See Google's plans →

Benchmarks

Model Index →

No benchmark scores recorded yet for Gemini 2.5 Flash. We surface official vendor + Artificial Analysis numbers as soon as a model ships them.

Infrastructure context

All intelligence →

Compute, silicon, and capex events that shape Gemini 2.5 Flash's economics.

  • Power

    Google Stone Mountain GA campus expanding to 2GW with on-site solar+gas hybrid

    Google's Stone Mountain, Georgia data center cluster — built around Gemini training — disclosed a 2GW expansion plan in early 2026 backed by a 1.6GW solar+gas hybrid generation deal with Southern Company. Site will host the second-largest TPU v7 deployment after Council Bluffs, IA.

    Power: 2 GW
    Location: Stone Mountain, GA
    Southern Company Q4 2025 earnings
  • Silicon

    Google TPU v7 "Ironwood": 9,216-chip pods, 42.5 exaflops, dedicated to inference

    Announced at Google Cloud Next April 2026, TPU v7 (Ironwood) is the first Google TPU generation purpose-built for inference rather than training. Each pod scales to 9,216 chips delivering 42.5 exaflops of FP8 compute, with 192GB HBM3e per chip. Powers Gemini 3 Pro / 3.5 inference at Google scale. Annual TPU spend ramped to over $40B for FY26.

    Accelerators: TPU v7 Ironwood
    Google Cloud Next 2026 keynote

Compare Gemini 2.5 Flash with…

Related news

No tagged articles yet. The aggregator surfaces mentions every 15 minutes.