Google Gemini API Pricing 2026
Complete pricing guide with plans, hidden costs, and cost analysis
Google Gemini API pricing ranges from $0 to $18 per million tokens.
As of April 2026, Google Gemini API costs from free to $18 per million tokens, with 4 plans available, including a free tier. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes
Google Gemini API offers 4 pricing tiers: Free, Flash-Lite (Paid), Flash (Paid), and Pro (Paid). The Flash-Lite (Paid) plan is designed for high-volume, cost-sensitive production workloads.
Compared to other LLM API providers, Google Gemini API is positioned at a budget-friendly price point.
- 4 documented hidden costs beyond list price
How much does Google Gemini API cost?
Google Gemini API Pricing Overview
Google Gemini API has 4 pricing plans, including a free tier. Rates range from $0 to $18 per million tokens. The Free plan costs nothing and is best for prototyping and evaluation. The three paid plans are billed per token with no monthly minimum: Flash-Lite (Paid) is designed for high-volume, cost-sensitive production workloads, Flash (Paid) for production apps balancing cost and capability, and Pro (Paid) for complex reasoning, long-context, and multimodal tasks.
There are at least 4 documented hidden costs beyond Google Gemini API's list price, including implementation, training, and add-on fees.
This pricing was last verified on April 15, 2026 from 1 independent source.
All Google Gemini API Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Free (rate-limited) | Free | Free | Prototyping and evaluation |
| Flash-Lite (Paid) | Pay per token | Pay per token | High-volume, cost-sensitive production workloads |
| Flash (Paid) | Pay per token | Pay per token | Production apps balancing cost and capability |
| Pro (Paid) | Pay per token | Pay per token | Complex reasoning, long-context, and multimodal tasks |
Features by Plan
Free
- Free API key via Google AI Studio
- Gemini 2.5 Flash-Lite: free input & output
- Gemini 3 Flash Preview: free input & output
- Gemini 3.1 Flash-Lite Preview: free input & output
- Rate-limited for prototyping
- Content used to improve Google products
Flash-Lite (Paid)
- Gemini 2.5 Flash-Lite: $0.10 input / $0.40 output per M tokens
- Gemini 3.1 Flash-Lite Preview: $0.25 input / $1.50 output per M tokens
- Most cost-efficient Gemini models
- Batch API: 50% cost reduction
- Great for high-volume, cost-sensitive workloads
Flash (Paid)
- Gemini 2.5 Flash: $0.30 input / $2.50 output per M tokens
- Gemini 3 Flash Preview: $0.50 input / $3.00 output per M tokens
- Balanced speed and capability
- Multimodal: text, image, video, audio
- Audio input: $1.00/M tokens
Pro (Paid)
- Gemini 2.5 Pro (prompts ≤200K tokens): $1.25 input / $10.00 output per M tokens
- Gemini 2.5 Pro (prompts >200K tokens): $2.50 input / $15.00 output per M tokens
- Gemini 3.1 Pro Preview (prompts ≤200K tokens): $2.00 input / $12.00 output per M tokens
- Gemini 3.1 Pro Preview (prompts >200K tokens): $4.00 input / $18.00 output per M tokens
- Google Search grounding: $14/1,000 queries (5,000/mo free)
- Context caching available (up to 90% input cost reduction)
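The Search grounding fee in the list above can be estimated with a few lines of Python. This is a minimal sketch, not Google's billing logic: the function name `grounding_cost_usd` is hypothetical, and it assumes the 5,000 free queries reset each month and that queries beyond the allowance are billed pro rata at $14 per 1,000.

```python
def grounding_cost_usd(queries_per_month: int,
                       free_queries: int = 5_000,
                       price_per_1000: float = 14.0) -> float:
    """Estimate the monthly Google Search grounding charge in USD.

    Assumption: only queries above the free monthly allowance are billed,
    and they are billed per query rather than in blocks of 1,000.
    """
    billable = max(0, queries_per_month - free_queries)
    return billable * price_per_1000 / 1000


print(grounding_cost_usd(3_000))   # within the free allowance -> 0.0
print(grounding_cost_usd(25_000))  # 20,000 billable queries -> 280.0
```

Under these assumptions, grounding can outweigh token costs for search-heavy workloads, so it is worth budgeting separately.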
Usage-Based Rates
Per-unit pricing for Google Gemini API usage.
Flash-Lite (Paid)
| Model | Unit | Rate |
|---|---|---|
| Gemini 2.5 Flash-Lite | 1M input tokens | $0.10 |
| Gemini 2.5 Flash-Lite | 1M output tokens | $0.40 |
| Gemini 3.1 Flash-Lite Preview | 1M input tokens | $0.25 |
| Gemini 3.1 Flash-Lite Preview | 1M output tokens | $1.50 |
| Gemini 3.1 Flash-Lite Preview | 1M cached input tokens | $0.025 |
- Same rate regardless of context length
- Audio input at $0.50/M tokens for 3.1 Flash-Lite
- Context caching storage: $1.00/M tokens per hour
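To turn the per-million-token rates above into a monthly estimate, multiply each token count by its rate and divide by one million. The sketch below does this for the two Flash-Lite models. The dictionary keys are illustrative labels rather than confirmed API model IDs, and the function name `token_cost_usd` is hypothetical.

```python
# USD per 1M tokens, copied from the Flash-Lite table above.
# The keys are illustrative labels, not necessarily the official model IDs.
FLASH_LITE_RATES = {
    "gemini-2.5-flash-lite": {"input": 0.10, "output": 0.40},
    "gemini-3.1-flash-lite-preview": {"input": 0.25, "output": 1.50},
}


def token_cost_usd(model: str, input_tokens: int, output_tokens: int,
                   rates: dict = FLASH_LITE_RATES) -> float:
    """Return the token cost in USD for the given input and output volume."""
    r = rates[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000


# Example workload: 1M requests per month, each with 500 input and 200 output tokens.
requests = 1_000_000
for model in FLASH_LITE_RATES:
    cost = token_cost_usd(model, requests * 500, requests * 200)
    print(f"{model}: ${cost:.2f} per month")
```

For this example workload, output tokens account for more of the bill than input tokens on both models, even though there are fewer of them.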
Flash (Paid)
| Model | Unit | Rate |
|---|---|---|
| Gemini 2.5 Flash | 1M input tokens | $0.30 |
| Gemini 2.5 Flash | 1M output tokens | $2.50 |
| Gemini 2.5 Flash (thinking) | 1M output tokens | $3.50 |
| Gemini 3 Flash Preview | 1M input tokens | $0.50 |
| Gemini 3 Flash Preview | 1M output tokens | $3.00 |
| Gemini 3 Flash Preview | 1M cached input tokens | $0.05 |
- Thinking/reasoning output billed at higher rate for 2.5 Flash
- 3 Flash Preview output price includes thinking tokens
- Audio input at $1.00/M tokens
- Context caching storage: $1.00/M tokens per hour
Pro (Paid)
| Model | Unit | Rate |
|---|---|---|
| Gemini 2.5 Pro (≤200K ctx) | 1M input tokens | $1.25 |
| Gemini 2.5 Pro (>200K ctx) | 1M input tokens | $2.50 |
| Gemini 2.5 Pro (≤200K ctx) | 1M output tokens | $10.00 |
| Gemini 2.5 Pro (>200K ctx) | 1M output tokens | $15.00 |
| Gemini 3.1 Pro Preview (≤200K ctx) | 1M input tokens | $2.00 |
| Gemini 3.1 Pro Preview (>200K ctx) | 1M input tokens | $4.00 |
| Gemini 3.1 Pro Preview (≤200K ctx) | 1M output tokens | $12.00 |
| Gemini 3.1 Pro Preview (>200K ctx) | 1M output tokens | $18.00 |
| Gemini 3.1 Pro Preview (≤200K ctx) | 1M cached input tokens | $0.20 |
| Gemini 3.1 Pro Preview (>200K ctx) | 1M cached input tokens | $0.40 |
- Input price doubles for prompts above 200K tokens on both models
- Output prices include thinking tokens on both models; the higher output rate applies to prompts above 200K tokens
- Context caching storage: $4.50/M tokens per hour for 3.1 Pro
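The 200K-token threshold makes Pro pricing a step function of prompt length. The sketch below illustrates this with the Gemini 3.1 Pro Preview rates from the table above. It assumes that once a prompt exceeds 200K tokens, the whole request (input and output) is billed at the higher rates, rather than only the tokens past the threshold. The function and constant names are hypothetical.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # prompt tokens

# USD per 1M tokens for Gemini 3.1 Pro Preview, from the table above.
PRO_31_RATES = {
    "short": {"input": 2.00, "output": 12.00},  # prompt <= 200K tokens
    "long": {"input": 4.00, "output": 18.00},   # prompt > 200K tokens
}


def pro_request_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request under tiered long-context pricing.

    Assumption: the tier is chosen by prompt length alone and applies to
    every token in the request.
    """
    tier = "long" if prompt_tokens > LONG_CONTEXT_THRESHOLD else "short"
    r = PRO_31_RATES[tier]
    return (prompt_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000


# One extra prompt token moves the request into the higher tier.
print(f"{pro_request_cost_usd(200_000, 2_000):.4f}")
print(f"{pro_request_cost_usd(200_001, 2_000):.4f}")
```

If this assumption holds, trimming a prompt to stay at or below 200K tokens roughly halves the cost of a request that would otherwise sit just over the line.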
How Google Gemini API Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Google Gemini API | Free | $18 per million tokens |
| Groq | Free | $3 per million tokens |
| Together AI | $0.03 per million tokens / hour | $9.95 per million tokens / hour |
| Fireworks AI | Free | $9 per million tokens / hour |
| Mistral AI API | $0.10 per million tokens | $6 per million tokens |
| Perplexity API | $1 per million tokens + per-request fee | $15 per million tokens + per-request fee |
Google Gemini API Pricing FAQ
01 How much does the Google Gemini API cost?
Gemini API pricing varies by model. The cheapest option is Gemini 2.5 Flash-Lite at $0.10 per million input tokens and $0.40 per million output tokens. Gemini 2.5 Pro costs $1.25/$10.00 per million tokens (≤200K context). A free tier is available with up to 1,500 requests/day on Flash models via Google AI Studio.
02 Is the Gemini API free?
Yes, Google offers a free tier for the Gemini API through Google AI Studio. The free tier provides access to Flash models with up to 1,500 requests/day and free input/output tokens. Pro models also have a free tier but are rate-limited. For production use, you pay per token on the paid tier with no monthly minimum.
03 Gemini API vs OpenAI API: which is cheaper?
Gemini is generally cheaper than OpenAI for comparable models. Gemini 2.5 Flash at $0.30/$2.50 per million tokens is significantly cheaper than GPT-4o. Gemini 2.5 Pro at $1.25/$10.00 per million tokens undercuts GPT-4o pricing. For budget workloads, Gemini Flash-Lite at $0.10/$0.40 per million tokens has no OpenAI equivalent at that price.
04 What is context caching in the Gemini API?
Context caching lets you store repeated prompt content (such as system instructions or documents) and reuse it across multiple requests. Cached tokens are billed at roughly a 90% discount compared to fresh input tokens, plus an hourly storage fee for as long as the cache is kept. This is highly cost-effective for applications that repeatedly process the same large documents or instructions.
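Whether caching pays off depends on how often the cached content is reused while it is stored. The sketch below compares the two options using the Gemini 3 Flash Preview rates listed earlier ($0.50 fresh input, $0.05 cached input, $1.00 per M tokens per hour of storage). It is a simplified model: it ignores any cost of creating the cache and assumes storage is billed for the full number of hours. The function names are hypothetical.

```python
def context_cost_usd(context_tokens: int, requests: int, hours_cached: float,
                     input_rate: float = 0.50, cached_rate: float = 0.05,
                     storage_rate_per_hour: float = 1.00) -> tuple[float, float]:
    """Return (uncached_cost, cached_cost) in USD for a repeated context.

    Rates are USD per 1M tokens; storage is USD per 1M tokens per hour.
    Assumption: cache creation itself is not charged separately.
    """
    millions = context_tokens / 1_000_000
    uncached = requests * millions * input_rate
    cached = requests * millions * cached_rate + millions * hours_cached * storage_rate_per_hour
    return uncached, cached


def break_even_requests_per_hour(input_rate: float = 0.50, cached_rate: float = 0.05,
                                 storage_rate_per_hour: float = 1.00) -> float:
    """Requests per hour above which caching is cheaper in this model."""
    return storage_rate_per_hour / (input_rate - cached_rate)


# A 100K-token document reused 50 times within one hour.
uncached, cached = context_cost_usd(100_000, requests=50, hours_cached=1)
print(f"uncached: ${uncached:.2f}, cached: ${cached:.2f}")
print(f"break-even: {break_even_requests_per_hour():.2f} requests per hour")
```

In this model the break-even point does not depend on the size of the context, only on the ratio of the storage rate to the per-token saving. Below it, the storage fee costs more than the discount saves.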
05 What is the Batch API discount on Gemini?
The Gemini Batch API offers a 50% cost reduction on token pricing for asynchronous workloads. Batch requests are processed within 24 hours. This is ideal for offline data processing, bulk classification, or any task that doesn't require real-time responses.
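As a rough illustration of the discount, the sketch below prices a bulk job at the Gemini 2.5 Flash rates listed earlier ($0.30 input, $2.50 output per M tokens) with and without batching. It assumes the 50% reduction applies equally to input and output tokens. The function name `job_cost_usd` is hypothetical.

```python
BATCH_DISCOUNT = 0.50  # 50% reduction on token pricing, per the FAQ answer above


def job_cost_usd(documents: int, input_tokens_each: int, output_tokens_each: int,
                 input_rate: float = 0.30, output_rate: float = 2.50,
                 batch: bool = False) -> float:
    """Estimate the token cost in USD of processing a set of documents.

    Rates are USD per 1M tokens. Assumption: the batch discount applies
    uniformly to both input and output tokens.
    """
    input_cost = documents * input_tokens_each * input_rate / 1_000_000
    output_cost = documents * output_tokens_each * output_rate / 1_000_000
    total = input_cost + output_cost
    return total * (1 - BATCH_DISCOUNT) if batch else total


# 10,000 documents, each with 2,000 input tokens and 300 output tokens.
print(f"real-time: ${job_cost_usd(10_000, 2_000, 300):.2f}")
print(f"batch:     ${job_cost_usd(10_000, 2_000, 300, batch=True):.2f}")
```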