Cohere API Pricing 2026
Complete pricing guide with plans, hidden costs, and cost analysis
Cohere API pricing ranges from $0.037 to $10 per million tokens.
As of April 2026, Cohere API costs $0.037 to $10 per million tokens across 4 plans, including a free Trial tier. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes
Cohere API offers 4 pricing tiers: Trial (Free), Command R (Pay-as-you-go), Command R+ / Command A (Pay-as-you-go), and Embed & Rerank. The Command R (Pay-as-you-go) plan is best for RAG pipelines and cost-efficient enterprise chat.
Compared to other LLM API providers, Cohere API sits at the budget-friendly end of the price range.
- 4 documented hidden costs beyond list price
How much does Cohere API cost?
Cohere API Pricing Overview
Cohere API has 4 pricing plans, including a free tier. Paid usage ranges from $0.037 to $10 per million tokens. The Trial (Free) plan is free and is best for evaluation and prototyping. The Command R (Pay-as-you-go) plan is billed per token and is designed for RAG pipelines and cost-efficient enterprise chat. The Command R+ / Command A (Pay-as-you-go) plan is billed per token and is designed for high-accuracy enterprise RAG and agentic applications. The Embed & Rerank plan is billed per token (Embed) or per query (Rerank) and is designed for semantic search, RAG retrieval, and result re-ranking. Custom enterprise quotes are available from sales.
There are at least 4 documented hidden costs beyond Cohere API's list price, including implementation, training, and add-on fees.
This pricing was last verified on April 13, 2026.
All Cohere API Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Trial (Free) | Free | Free | Evaluation and prototyping (non-commercial, non-production use only) |
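| Command R (Pay-as-you-go) | Usage-based (contact sales for custom quotes) | Usage-based (contact sales for custom quotes) | RAG pipelines and cost-efficient enterprise chat |
| Command R+ / Command A (Pay-as-you-go) | Usage-based (contact sales for custom quotes) | Usage-based (contact sales for custom quotes) | High-accuracy enterprise RAG and agentic applications |
| Embed & Rerank | Usage-based (contact sales for custom quotes) | Usage-based (contact sales for custom quotes) | Semantic search, RAG retrieval, and result re-ranking |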
Trial (Free)
- Free trial API key
- Rate-limited access to all models
- Not permitted for commercial or production use
- No credit card required
Command R (Pay-as-you-go)
- Command R7B: $0.037 input / $0.15 output per M tokens
- Command R: $0.15 input / $0.60 output per M tokens
- 128K context window
- RAG-optimized with grounded generation
- Multi-step tool use for agentic workflows
Command R+ / Command A (Pay-as-you-go)
- Command R+ (08-2024): $2.50 input / $10.00 output per M tokens
- Command A: $2.50 input / $10.00 output per M tokens
- Command A: 256K context window
- Advanced tool use and agentic reasoning
- Enterprise-grade accuracy and reliability
Embed & Rerank
- Embed v3: $0.10 per million tokens
- Embed 4: $0.12 per million text tokens, $0.47 per million image tokens
- Rerank 3.5: $2.00 per 1,000 queries
- Model Vault: hourly ($4-5/hr) or monthly ($2,500-3,250/mo) pricing for dedicated capacity
- Multilingual embedding support
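To turn the per-token rates above into a monthly estimate, a minimal Python sketch follows. The prices are copied from the plan lists above; the dictionary keys are illustrative labels chosen for this example (not official API model identifiers), and the function names are hypothetical.

```python
# USD per million tokens (input, output), copied from the plan lists above.
# Keys are illustrative labels, not official API model identifiers.
GENERATION_PRICES = {
    "command-r7b": (0.037, 0.15),
    "command-r": (0.15, 0.60),
    "command-r-plus": (2.50, 10.00),
    "command-a": (2.50, 10.00),
}

RERANK_PRICE_PER_1K_QUERIES = 2.00  # Rerank 3.5, USD per 1,000 queries


def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for a generative model at list price."""
    input_price, output_price = GENERATION_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rerank_cost(queries: int) -> float:
    """Estimated USD cost for Rerank 3.5 at list price."""
    return queries / 1_000 * RERANK_PRICE_PER_1K_QUERIES


if __name__ == "__main__":
    # Hypothetical workload: 50M input tokens and 10M output tokens per month.
    for name in ("command-r", "command-a"):
        print(name, round(generation_cost(name, 50_000_000, 10_000_000), 2))
```

For the hypothetical workload in the example, Command R comes to about $13.50 per month and Command A to about $225, before any negotiated discounts or the hidden costs noted earlier.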
How Cohere API Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Cohere API | $0.037 per million tokens | $10 per million tokens |
| Groq | Free | $3 per million tokens |
| Together AI | $0.03 per million tokens / hour | $9.95 per million tokens / hour |
| Fireworks AI | Free | $9 per million tokens / hour |
| Google Gemini API | Free | $18 per million tokens |
| Mistral AI API | $0.10 per million tokens | $6 per million tokens |
Cohere API Pricing FAQ
01 How much does the Cohere API cost?
Cohere API pricing varies by model type. Command R7B (the cheapest generative model) costs $0.037 per million input tokens and $0.15 per million output tokens. Command R costs $0.15/$0.60 per million tokens. Command R+ and Command A cost $2.50/$10.00 per million tokens. Embed v3 costs $0.10 per million tokens. Rerank 3.5 costs $2.00 per 1,000 queries.
02 What is Command R and when should I use it?
Command R is Cohere's retrieval-augmented generation (RAG) model, optimized for grounded responses with cited sources. Use Command R ($0.15/$0.60 per million tokens) for everyday RAG pipelines. Use Command R+ or Command A ($2.50/$10.00 per million tokens) when you need higher accuracy on complex multi-step tasks. Command A features a 256K context window — the largest in Cohere's lineup.
03 What is Cohere Rerank and how is it priced?
Cohere Rerank re-orders search results by relevance using a cross-encoder model, which can substantially improve RAG retrieval quality. Rerank 3.5 costs $2.00 per 1,000 queries. For high-volume applications, Cohere also offers dedicated Model Vault capacity billed hourly (up to $5/hr) or monthly (up to $3,250/mo). Rerank is often used alongside Embed in a two-stage retrieval pipeline.
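A rough way to judge when dedicated capacity starts to pay off is to compute the break-even query volume against the per-query rate. The sketch below uses the $2.00 per 1,000 queries and $3,250 monthly figures quoted above; the 730 hours-per-month conversion is an assumption, and the sketch assumes a dedicated instance can actually serve the volume in question, which the pricing above does not state.

```python
RERANK_PRICE_PER_1K_QUERIES = 2.00  # USD, Rerank 3.5 list price
HOURS_PER_MONTH = 730               # assumption: average hours in a month


def breakeven_queries(fixed_monthly_cost: float) -> float:
    """Monthly query count at which per-query billing equals a fixed monthly cost."""
    return fixed_monthly_cost / RERANK_PRICE_PER_1K_QUERIES * 1_000


def monthly_cost_from_hourly(hourly_rate: float) -> float:
    """Approximate monthly cost of running an hourly-billed instance nonstop."""
    return hourly_rate * HOURS_PER_MONTH


if __name__ == "__main__":
    print(breakeven_queries(3_250))                         # monthly plan
    print(breakeven_queries(monthly_cost_from_hourly(5)))   # hourly plan, always on
```

At the $3,250 monthly rate, the break-even point is 1,625,000 queries per month; below that volume, per-query billing is cheaper at list price.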
04 Does Cohere have a free tier?
Yes, Cohere offers a free Trial API key with rate-limited access to all models. However, the Trial key is explicitly not permitted for commercial or production use. For production applications, you need a Production API key, which uses pay-as-you-go billing charged monthly or when you reach $250 in outstanding usage.
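To see how often a Production key might be charged under the rule described above, the following Python sketch simulates one month of daily spend. It is an interpretation, not Cohere's documented billing logic: it assumes a charge fires as soon as outstanding usage reaches $250, that the balance then resets to zero, and that any remainder is charged on the last day. The function name and inputs are hypothetical.

```python
def simulate_charges(daily_spend, threshold=250.0):
    """Return a list of (day, amount) charges for one billing month.

    Assumptions (not confirmed by the source): a charge is triggered once
    outstanding usage reaches the threshold, the balance resets after each
    charge, and any remaining balance is charged on the final day.
    """
    charges = []
    outstanding = 0.0
    for day, spend in enumerate(daily_spend, start=1):
        outstanding += spend
        if outstanding >= threshold:
            charges.append((day, outstanding))
            outstanding = 0.0
    if outstanding > 0:
        charges.append((len(daily_spend), outstanding))
    return charges


if __name__ == "__main__":
    # Hypothetical usage: $100 per day for 30 days.
    for day, amount in simulate_charges([100.0] * 30):
        print(f"day {day}: ${amount:.2f}")
```

Under these assumptions, a steady $100 per day triggers a charge every third day, while a steady $5 per day never reaches the threshold and is charged once at month end.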
05 Cohere vs OpenAI for enterprise RAG: which is better?
Cohere is purpose-built for enterprise RAG and search use cases, while OpenAI is more general-purpose. Cohere's Command R models include native grounding, citation generation, and multi-step tool use optimized for RAG pipelines. Cohere's Embed and Rerank models are best-in-class for retrieval. For pure RAG performance, Cohere often wins. For general-purpose chat or coding tasks, OpenAI or Anthropic may be better choices.