Quick Answer

Groq pricing ranges from free to $0.79 per million tokens as of April 2026, with 3 plans available, including a free tier. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.


  • Free tier: Yes

Groq offers 3 pricing tiers: Free, Developer, and Enterprise. The Developer plan is intended for production API usage.

Compared to other LLM API providers, Groq sits at the budget-friendly end of the price range.

How much does Groq cost?

Groq offers 3 pricing plans, starting with a free tier and scaling to custom enterprise pricing. The plans are Free (no charge), Developer (pay-as-you-go, per-token pricing), and Enterprise (custom pricing).

Groq Pricing Overview

Groq has 3 pricing plans, including a free tier. Listed per-token rates run from $0.05 to $0.79 per million tokens, depending on the model and on whether tokens are input or output. The Free plan costs nothing and is best for prototyping and evaluation. The Developer plan is billed pay-as-you-go per token and is designed for production API usage. The Enterprise plan requires contacting sales for a custom quote and is designed for high-volume enterprise deployments.

This pricing was last verified on April 1, 2026.

Groq offers ultra-fast LLM inference powered by custom LPU hardware, with a free tier for getting started and pay-as-you-go Developer pricing starting at $0.05 per million input tokens for Llama 3.1 8B. Larger models like Llama 3.3 70B cost $0.59/$0.79 per million tokens in/out. Groq achieves speeds of 500–1,000+ tokens per second, making it one of the fastest inference providers available.
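The per-token arithmetic described above can be sketched in a few lines. This is a minimal illustration, not an official calculator: the function name `request_cost` and the token counts are assumptions made for the example, and only the Llama 3.3 70B rates ($0.59 input, $0.79 output per million tokens) come from the text.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request, given prices per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Llama 3.3 70B rates quoted above: $0.59 in, $0.79 out per million tokens.
# The token counts (2,000 in, 500 out) are made-up example values.
cost = request_cost(2_000, 500, 0.59, 0.79)
print(f"${cost:.6f}")  # prints $0.001575
```

At these rates a single mid-sized request costs a fraction of a cent, so totals only become meaningful once multiplied by request volume.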

All Groq Plans & Pricing

Plan | Monthly | Annual | Best For
Free | Free | Free | Prototyping and evaluation
Developer | Pay-as-you-go (per token) | Pay-as-you-go (per token) | Production API usage
Enterprise | Contact Sales | Contact Sales | High-volume enterprise deployments

Free plan rate limit: limited requests per minute.

Free

  • Free API key
  • Pay-as-you-go access
  • All models available
  • Rate-limited

Developer

  • Llama 3.1 8B at $0.05/$0.08 per M tokens (in/out)
  • Llama 4 Scout at $0.11/$0.34 per M tokens (in/out)
  • Qwen3 32B at $0.29/$0.59 per M tokens (in/out)
  • Llama 3.3 70B at $0.59/$0.79 per M tokens (in/out)
  • GPT OSS 20B at $0.075/$0.30 per M tokens (in/out)
  • Up to 1,000 tokens/second
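To see how the Developer rates listed above translate into a monthly bill, the following sketch ranks the models by estimated cost for one workload. The rates are copied from the list; everything else is an assumption for illustration: the dictionary keys are display names rather than API model identifiers, and the workload (10,000 requests per day, 1,000 input and 300 output tokens each, 30 days) is hypothetical.

```python
# Developer-plan rates from the list above: USD per million tokens (input, output).
# Keys are display names, not API model identifiers.
PRICES_PER_M = {
    "Llama 3.1 8B": (0.05, 0.08),
    "Llama 4 Scout": (0.11, 0.34),
    "Qwen3 32B": (0.29, 0.59),
    "Llama 3.3 70B": (0.59, 0.79),
    "GPT OSS 20B": (0.075, 0.30),
}


def monthly_cost(model, requests_per_day, input_tokens, output_tokens, days=30):
    """Estimate the USD cost of a steady workload over `days` days."""
    in_price, out_price = PRICES_PER_M[model]
    total_in_m = requests_per_day * input_tokens * days / 1_000_000
    total_out_m = requests_per_day * output_tokens * days / 1_000_000
    return total_in_m * in_price + total_out_m * out_price


def rank_models(requests_per_day, input_tokens, output_tokens, days=30):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (model, monthly_cost(model, requests_per_day, input_tokens, output_tokens, days))
        for model in PRICES_PER_M
    ]
    return sorted(costs, key=lambda pair: pair[1])


# Hypothetical workload: 10,000 requests/day, 1,000 tokens in, 300 tokens out.
for model, cost in rank_models(10_000, 1_000, 300):
    print(f"{model:<14} ${cost:,.2f}")
```

For this workload the estimate runs from about $22 per month on Llama 3.1 8B to about $248 per month on Llama 3.3 70B. The sketch ignores rate limits and any negotiated discounts.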

Enterprise

  • Dedicated support
  • Custom rate limits
  • Large-scale solutions
  • SLA guarantees

How Groq Pricing Compares

Software | Starting Price | Top Price
Groq | Free | $0.79 per million tokens
Together AI | $0.10 per million tokens / hour | $9.95 per million tokens / hour
Fireworks AI | Free | $9 per million tokens / hour


Groq Pricing FAQ

01 How much does the Groq API cost?

Groq API pricing is per-token and varies by model. The cheapest option is Llama 3.1 8B at $0.05 per million input tokens and $0.08 per million output tokens. Larger models like Llama 3.3 70B cost $0.59/$0.79 per million tokens. Groq offers a free API key with rate limits for getting started.

02 Does Groq have a free tier?

Yes, Groq offers a free API key with access to all models. The Free tier has rate limits on requests per minute. You can upgrade to the Developer plan for higher limits with pay-as-you-go token pricing.
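Because the Free tier limits requests per minute, a client may want to pace its own calls rather than rely on rejected requests. The sketch below is one possible client-side approach, a sliding-window limiter. It is not part of any Groq SDK: the class name `MinuteRateLimiter` is hypothetical, and the limit of 30 requests per minute is a placeholder, since the actual limits are not stated here.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding window: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self._sent = deque()    # timestamps of requests inside the current window

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self.clock())


# Placeholder limit: 30 requests per minute is an assumption, not a published figure.
limiter = MinuteRateLimiter(max_requests=30)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()  # call this right after sending each API request
```

The limiter only tracks what this one process has sent, so it should be treated as a courtesy throttle; the server-side limit remains the authority.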

03 Why is Groq so fast?

Groq uses custom LPU (Language Processing Unit) hardware designed specifically for AI inference, achieving speeds of 500–1,000+ tokens per second. This makes it one of the fastest LLM inference providers, particularly for real-time applications.
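As a rough illustration of what those throughput figures mean for response time, the sketch below divides a response length by the generation speed. The function name `generation_seconds` and the 800-token response are assumptions for the example, and the estimate deliberately ignores network latency, queueing, and time to first token.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Estimate the time to generate `output_tokens` at a steady decode speed."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Hypothetical 800-token response at the low and high ends of the quoted range.
for speed in (500, 1_000):
    print(f"{speed} tok/s -> {generation_seconds(800, speed):.2f} s")
```

At 500 tokens per second the 800-token response takes about 1.6 seconds to generate, and at 1,000 tokens per second about 0.8 seconds.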

04 Groq vs OpenAI: which is cheaper for API usage?

Groq is typically cheaper for open-source models: Llama 3.1 8B costs $0.05/$0.08 per million tokens (in/out) on Groq, versus paying OpenAI rates for GPT models. Groq doesn't offer OpenAI's proprietary GPT models (it does list the open-weight GPT OSS 20B), so for those models there's no direct comparison. Groq's main advantage is its speed alongside competitive pricing.

05 What models does Groq support?

Groq supports models including Llama 3.1 8B, Llama 4 Scout, Llama 3.3 70B, Qwen3 32B, and GPT OSS 20B, with pricing ranging from $0.05 to $0.59 per million input tokens. Enterprise plans support custom rate limits and SLA guarantees for high-volume deployments.
