Fireworks AI Pricing 2026
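Complete pricing guide with plans and cost analysis
Fireworks AI bills in two ways: per million tokens on Serverless and per hour on On-Demand GPU deployments, with list prices topping out at $9.00/hr.
As of April 2026, Fireworks AI has 5 plans available. New accounts start with $1 in free credits, Serverless rates begin at $0.10 per million tokens, and On-Demand GPUs run from $2.90 to $9.00 per hour. Your actual cost depends on the plan, the model size or GPU type, and any negotiated Enterprise discounts.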
- Free tier: No ongoing free tier; new accounts get $1 in free credits
Fireworks AI offers 5 pricing tiers: Serverless, On-Demand (A100), On-Demand (H100/H200), On-Demand (B200), and Enterprise. The On-Demand (A100) plan is designed for consistent inference workloads.
Compared to other LLM API providers, Fireworks AI is positioned at the budget-friendly end of the market.
Fireworks AI Pricing Overview
Fireworks AI has 5 pricing plans. The Serverless plan is billed per token, from $0.10 to $1.20 per million tokens depending on model size, and is designed for variable-volume API usage. The On-Demand (A100) plan costs $2.90 per hour and is designed for consistent inference workloads. The On-Demand (H100/H200) plan costs $6.00 per hour and is designed for large model hosting. The On-Demand (B200) plan costs $9.00 per hour and is designed for cutting-edge performance. The Enterprise plan requires contacting sales for a custom quote and is designed for large-scale enterprise deployments.
This pricing was last verified on April 1, 2026.
All Fireworks AI Plans & Pricing
| Plan | Price | Best For |
|---|---|---|
| Serverless | $0.10 to $1.20 per million tokens | Variable-volume API usage |
| On-Demand (A100) | $2.90/hr | Consistent inference workloads |
| On-Demand (H100/H200) | $6.00/hr | Large model hosting |
| On-Demand (B200) | $9.00/hr | Cutting-edge performance |
| Enterprise | Contact Sales | Large-scale enterprise deployments |
Serverless
- $1 free credits to start
- Models <4B at $0.10/M tokens
- Models 4B-16B at $0.20/M tokens
- Models >16B at $0.90/M tokens
- MoE models at $0.50-$1.20/M tokens
- Cached input tokens at 50% price
- Batch inference at 50% discount
- Embeddings from $0.008/M tokens
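The Serverless rates above can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the function name `serverless_cost` and the size-class keys are hypothetical, and it assumes that input and output tokens share one rate and that the batch and cached-input discounts stack. Neither assumption is confirmed by the price list.

```python
# Serverless rates in USD per million tokens, copied from the list above.
RATE_PER_M = {
    "under_4b": 0.10,
    "4b_to_16b": 0.20,
    "over_16b": 0.90,
}
CACHED_INPUT_FACTOR = 0.5  # cached input tokens are billed at 50% of the rate
BATCH_FACTOR = 0.5         # batch inference gets a 50% discount


def serverless_cost(size_class, input_tokens, output_tokens,
                    cached_input_tokens=0, batch=False):
    """Estimate a Serverless bill in USD.

    Assumptions (not from the price list): one rate covers both input and
    output tokens, and the batch discount applies on top of the cache discount.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    rate = RATE_PER_M[size_class]
    uncached_input = input_tokens - cached_input_tokens
    # Cached input tokens count at half weight; everything else at full weight.
    billable = (uncached_input
                + cached_input_tokens * CACHED_INPUT_FACTOR
                + output_tokens)
    cost = billable / 1_000_000 * rate
    if batch:
        cost *= BATCH_FACTOR
    return round(cost, 6)
```

For example, `serverless_cost("over_16b", 10_000_000, 2_000_000)` prices 12 million tokens at $0.90/M. Passing `cached_input_tokens=4_000_000` or `batch=True` shows how the two discounts change the total under the stated assumptions.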
On-Demand (A100)
- A100 80GB at $2.90/hr
- Dedicated model hosting
- Custom fine-tuned models
On-Demand (H100/H200)
- H100/H200 at $6.00/hr
- High-performance inference
- Dedicated resources
On-Demand (B200)
- B200 at $9.00/hr
- Latest generation hardware
- Maximum throughput
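Hourly On-Demand rates translate into a monthly figure with simple multiplication. The sketch below is illustrative only: `on_demand_cost` is a hypothetical helper, and it assumes billing is linear in hours and in the number of GPUs, with no minimums or rounding of partial hours.

```python
# On-Demand list prices in USD per hour, copied from the plan lists above.
HOURLY_RATE = {
    "A100": 2.90,
    "H100/H200": 6.00,
    "B200": 9.00,
}


def on_demand_cost(gpu, hours, gpu_count=1):
    """Estimate an On-Demand bill in USD.

    Assumption: the listed rate applies per GPU and scales linearly
    with both hours and GPU count.
    """
    return round(HOURLY_RATE[gpu] * hours * gpu_count, 2)


# A 30-day month of continuous use is 720 hours.
HOURS_PER_30_DAYS = 24 * 30
monthly = {gpu: on_demand_cost(gpu, HOURS_PER_30_DAYS) for gpu in HOURLY_RATE}
```

Under these assumptions, `monthly` maps each GPU type to the cost of running one instance around the clock for 30 days.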
Enterprise
- Volume discounts
- Dedicated support
- Custom SLAs
How Fireworks AI Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Fireworks AI | Free ($1 in credits) | $9.00/hr (B200 On-Demand) |
| Groq | Free | $0.79 per million tokens |
| Together AI | $0.10 per million tokens | $9.95 (per million tokens or per hour) |
Fireworks AI Pricing FAQ
01 How much does Fireworks AI cost?
Fireworks AI serverless pricing starts at $0.10 per million tokens for small models (<4B parameters) and goes up to $0.90/M for models over 16B. On-demand GPU deployments range from $2.90/hr (A100) to $9.00/hr (B200). New accounts get $1 in free credits.
02 Does Fireworks AI have a free tier?
Fireworks AI offers $1 in free credits for new accounts. After that, pricing is pay-as-you-go with no minimum commitment. Batch inference and cached input tokens each offer 50% discounts, reducing ongoing costs.
03 How does Fireworks AI fine-tuning work?
Fireworks AI supports fine-tuning with SFT and DPO methods. Pricing ranges from $0.50/M training tokens for models under 16B to $10–20/M tokens for models over 300B. Fine-tuned models can be deployed on Serverless or dedicated infrastructure.
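A fine-tuning budget can be estimated from the per-token training rates quoted above. This is a rough sketch: `fine_tune_cost_range` is a hypothetical helper, and it assumes that billed training tokens equal dataset tokens multiplied by the number of epochs, which the pricing text does not state.

```python
def fine_tune_cost_range(dataset_tokens, epochs, rate_low, rate_high=None):
    """Return a (low, high) cost estimate in USD for a fine-tuning run.

    rate_low and rate_high are USD per million training tokens. Pass only
    rate_low when the price is a single figure rather than a range.
    Assumption: billed training tokens = dataset_tokens * epochs.
    """
    if rate_high is None:
        rate_high = rate_low
    training_millions = dataset_tokens * epochs / 1_000_000
    return (round(training_millions * rate_low, 2),
            round(training_millions * rate_high, 2))


# 5M-token dataset, 3 epochs, model under 16B at $0.50/M training tokens.
small_model = fine_tune_cost_range(5_000_000, 3, 0.50)

# Same dataset on a model over 300B, quoted at $10 to $20/M training tokens.
large_model = fine_tune_cost_range(5_000_000, 3, 10.0, 20.0)
```

Returning a range keeps the $10 to $20 spread for very large models visible instead of collapsing it to a single guess.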
04 Fireworks AI vs Together AI: which should I choose?
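Both offer serverless inference starting at $0.10/M tokens. Fireworks AI provides $1 in free credits upfront and offers batch discounts of 50%. For dedicated hosting, Fireworks AI lists A100 On-Demand at $2.90/hr and H100/H200 at $6.00/hr, while Together AI's H100 dedicated is $3.99/hr. Because the A100 and H100 are different GPU classes, compare like for like: Fireworks AI has the lower entry price for a dedicated GPU, but Together AI's H100 rate is lower than Fireworks AI's H100/H200 rate.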
05 What is Fireworks AI On-Demand pricing?
Fireworks AI On-Demand GPU deployments are priced at $2.90/hr for A100 80GB, $6.00/hr for H100/H200, and $9.00/hr for B200. These are dedicated single-tenant deployments ideal for hosting custom fine-tuned models or maintaining consistent inference capacity.
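One way to decide between Serverless and On-Demand is a break-even calculation: the sustained token volume per hour at which a dedicated GPU costs the same as paying per token. The sketch below uses a hypothetical helper, `break_even_tokens_per_hour`, and compares list prices only. Whether a given GPU can actually sustain that throughput depends on the model and workload, which this pricing guide does not cover.

```python
def break_even_tokens_per_hour(gpu_hourly_usd, serverless_usd_per_m):
    """Tokens per hour at which an hourly GPU rate equals the per-token bill.

    Above this volume the dedicated deployment is cheaper on list price;
    below it, Serverless is cheaper.
    """
    return gpu_hourly_usd / serverless_usd_per_m * 1_000_000


# A100 at $2.90/hr versus Serverless models >16B at $0.90/M tokens.
a100_vs_large = break_even_tokens_per_hour(2.90, 0.90)

# B200 at $9.00/hr versus the same Serverless rate.
b200_vs_large = break_even_tokens_per_hour(9.00, 0.90)
```

With these inputs the A100 breaks even at roughly 3.2 million tokens per hour and the B200 at roughly 10 million, so On-Demand mainly pays off for steady, high-volume traffic.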