Quick Answer
Last verified:
High confidence

Cohere API costs $0.04 to $10 per per million tokens as of April 2026, with 4 plans available including a free tier. Plan: Trial (Free) (free). Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.

Use the interactive pricing calculator to estimate your exact cost based on team size and requirements.

  • Free tier: Yes

Cohere API offers 4 pricing tiers: Trial (Free), Command R (Pay-as-you-go), Command R+ / Command A (Pay-as-you-go), Embed & Rerank. The Command R (Pay-as-you-go) plan is rag pipelines and cost-efficient enterprise chat.

Compared to other llm api providers software, Cohere API is positioned at the budget-friendly price point.

  • 4 documented hidden costs beyond list price

How much does Cohere API cost?

Cohere API offers 4 pricing plans, starting with a free tier and scaling to custom enterprise pricing. Plans include Trial (Free) (free), Command R (Pay-as-you-go) (custom pricing), Command R+ / Command A (Pay-as-you-go) (custom pricing), Embed & Rerank (custom pricing).

Cohere API Pricing Overview

Cohere API has 4 pricing plans, including a free tier. Paid plans range from $0.04 to $10/per million tokens. The Trial (Free) plan is free and is best for evaluation and prototyping. The Command R (Pay-as-you-go) plan requires contacting sales for a custom quote and is designed for rag pipelines and cost-efficient enterprise chat. The Command R+ / Command A (Pay-as-you-go) plan requires contacting sales for a custom quote and is designed for high-accuracy enterprise rag and agentic applications. The Embed & Rerank plan requires contacting sales for a custom quote and is designed for semantic search, rag retrieval, and result re-ranking.

There are at least 4 documented hidden costs beyond Cohere API's list price, including implementation, training, and add-on fees.

This pricing was last verified in April 13, 2026.

Cohere is purpose-built for enterprise retrieval-augmented generation (RAG). The Command R7B model starts at just $0.037 per million input tokens, making it the most affordable in Cohere's lineup for high-volume RAG tasks. Command R runs $0.15/$0.60 per million tokens with native grounding and citation. Command R+ and Command A cost $2.50/$10.00 per million tokens for complex agentic workflows. Cohere's Embed and Rerank APIs round out a complete RAG stack — Embed v3 at $0.10/M tokens and Rerank 3.5 at $2.00 per 1,000 queries. A free Trial key is available for non-commercial evaluation.

All Cohere API Plans & Pricing

Plan Monthly Annual Best For
Trial (Free) usage: Non-commercial/non-production only Free Free Evaluation and prototyping
Command R (Pay-as-you-go) Contact Sales Contact Sales RAG pipelines and cost-efficient enterprise chat
Command R+ / Command A (Pay-as-you-go) Contact Sales Contact Sales High-accuracy enterprise RAG and agentic applications
Embed & Rerank Contact Sales Contact Sales Semantic search, RAG retrieval, and result re-ranking
View all features by plan

Trial (Free)

  • Free trial API key
  • Rate-limited access to all models
  • Not permitted for commercial or production use
  • No credit card required

Command R (Pay-as-you-go)

  • Command R7B: $0.037 input / $0.15 output per M tokens
  • Command R: $0.15 input / $0.60 output per M tokens
  • 128K context window
  • RAG-optimized with grounded generation
  • Multi-step tool use for agentic workflows

Command R+ / Command A (Pay-as-you-go)

  • Command R+ (08-2024): $2.50 input / $10.00 output per M tokens
  • Command A: $2.50 input / $10.00 output per M tokens
  • Command A: 256K context window
  • Advanced tool use and agentic reasoning
  • Enterprise-grade accuracy and reliability

Embed & Rerank

  • Embed v3: $0.10 per million tokens
  • Embed 4: $0.12 per million text tokens, $0.47 per million image tokens
  • Rerank 3.5: $2.00 per 1,000 queries
  • Model Vault: hourly ($4-5/hr) or monthly ($2,500-3,250/mo) for dedicated
  • Multilingual embedding support

How Cohere API Pricing Compares

Software Starting Price Top Price
Cohere API $0.037/per million tokens $10/per million tokens
Groq Free $3/per million tokens
Together AI $0.03/per million tokens / hour $9.95/per million tokens / hour
Fireworks AI Free $9/per million tokens / hour
Google Gemini API Free $18/per million tokens
Mistral AI API $0.1/per million tokens $6/per million tokens

Detailed pricing comparisons:

4 Cohere API Hidden Costs Beyond the List Price

Beyond the listed price, Cohere API has at least 4 documented hidden costs that can significantly increase total cost of ownership.

Watch for 4 hidden costs
  • Trial API keys are free but cannot be used for commercial or production workloads
  • Rerank billed per query (not per token) — high-volume search can get expensive quickly
  • Embed 4 image tokens ($0.47/M) billed at 4x the rate of text tokens
  • Model Vault hourly/monthly pricing for dedicated Embed/Rerank capacity adds fixed costs
Tip

Ask your Cohere API sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Full hidden costs breakdown →

Cohere API Pricing FAQ

01 How much does the Cohere API cost?

Cohere API pricing varies by model type. Command R7B (the cheapest generative model) costs $0.037 per million input tokens and $0.15 per million output tokens. Command R costs $0.15/$0.60 per million tokens. Command R+ and Command A cost $2.50/$10.00 per million tokens. Embed v3 costs $0.10 per million tokens. Rerank 3.5 costs $2.00 per 1,000 queries.

02 What is Command R and when should I use it?

Command R is Cohere's retrieval-augmented generation (RAG) model, optimized for grounded responses with cited sources. Use Command R ($0.15/$0.60 per million tokens) for everyday RAG pipelines. Use Command R+ or Command A ($2.50/$10.00 per million tokens) when you need higher accuracy on complex multi-step tasks. Command A features a 256K context window — the largest in Cohere's lineup.

03 What is Cohere Rerank and how is it priced?

Cohere Rerank re-orders search results by relevance using a cross-encoder model, dramatically improving RAG retrieval quality. Rerank 3.5 costs $2.00 per 1,000 queries. For high-volume applications, Cohere also offers hourly ($5/hr) or monthly ($3,250/mo) dedicated Model Vault capacity. Rerank is often used alongside Embed in a two-stage retrieval pipeline.

04 Does Cohere have a free tier?

Yes, Cohere offers a free Trial API key with rate-limited access to all models. However, the Trial key is explicitly not permitted for commercial or production use. For production applications, you need a Production API key, which uses pay-as-you-go billing charged monthly or when you reach $250 in outstanding usage.

05 Cohere vs OpenAI for enterprise RAG: which is better?

Cohere is purpose-built for enterprise RAG and search use cases, while OpenAI is more general-purpose. Cohere's Command R models include native grounding, citation generation, and multi-step tool use optimized for RAG pipelines. Cohere's Embed and Rerank models are best-in-class for retrieval. For pure RAG performance, Cohere often wins. For general-purpose chat or coding tasks, OpenAI or Anthropic may be better choices.

Is this pricing incorrect? — we'll verify and update it.