gpt.buzz

New ways to balance cost and reliability in the Gemini API

April 3, 2026

Google is adding two new inference tiers to the Gemini API, Flex and Priority, to give developers more options for balancing cost, latency, and reliability. The tiers are meant to make it easier to choose a service level that fits each application's needs.

Source: blog.google
