Together AI Pricing 2026
Complete pricing guide with plans and cost analysis
Together AI pricing ranges from $0.10 per million tokens (serverless) to $9.95 per hour (dedicated GPU).
As of April 2026, Together AI offers 4 plans, from serverless inference at $0.10 per million tokens up to a dedicated 1x B200 GPU at $9.95 per hour. Pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: No free tier available
Together AI offers 4 pricing tiers: Serverless, Dedicated (1x H100), Dedicated (1x B200), and Enterprise. The Dedicated (1x H100) plan is designed for consistent high-volume inference.
Compared to other LLM API providers, Together AI is positioned at the budget-friendly end of the market.
How much does Together AI cost?
Together AI Pricing Overview
Together AI has 4 pricing plans. The Serverless plan is pay-as-you-go, starting at $0.10 per million tokens, and is designed for variable-volume API usage. The Dedicated (1x H100) plan costs $3.99 per hour and is designed for consistent high-volume inference. The Dedicated (1x B200) plan costs $9.95 per hour and is designed for high-performance dedicated inference. The Enterprise plan requires contacting sales for a custom quote and is designed for large-scale enterprise deployments.
This pricing was last verified on April 1, 2026.
All Together AI Plans & Pricing
| Plan | Price | Best For |
|---|---|---|
| Serverless | Pay-as-you-go, from $0.10/M tokens | Variable-volume API usage |
| Dedicated (1x H100) | $3.99/hr | Consistent high-volume inference |
| Dedicated (1x B200) | $9.95/hr | High-performance dedicated inference |
| Enterprise | Contact Sales | Large-scale enterprise deployments |
Features by Plan
Serverless
- Pay-as-you-go per-token pricing
- Budget models from $0.10/M tokens
- Mid-range models from $0.50/M tokens
- Large models from $3.00/M tokens
- Batch API with 40-50% discount
- Image generation from $0.0006/image
- Embeddings at $0.02/M tokens
Dedicated (1x H100)
- Single-tenant GPU deployment
- 1x H100 at $3.99/hr
- Custom model hosting
- Dedicated resources
Dedicated (1x B200)
- Single-tenant GPU deployment
- 1x B200 at $9.95/hr
- Latest generation hardware
- Dedicated resources
Enterprise
- Volume discounts
- Dedicated support
- Custom SLAs
- Private deployments
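To put the hourly rates above into monthly terms, here is a rough sketch. It assumes the GPU runs around the clock for a 30-day month and that billing is strictly per hour with no minimums or discounts. None of those billing details are confirmed by the pricing data above, and the function name is purely illustrative.

```python
# Hourly rates quoted in the plan lists above (USD per GPU-hour).
HOURLY_RATES = {"H100": 3.99, "B200": 9.95}


def dedicated_monthly_cost(gpu: str, hours_per_day: float = 24.0, days: int = 30) -> float:
    """Estimate the cost of one dedicated GPU over a billing period.

    Assumption: simple per-hour billing, no minimum commitment or discount.
    """
    return HOURLY_RATES[gpu] * hours_per_day * days


for gpu in HOURLY_RATES:
    print(f"1x {gpu}, 24/7 for 30 days: ${dedicated_monthly_cost(gpu):,.2f}")
# 1x H100, 24/7 for 30 days: $2,872.80
# 1x B200, 24/7 for 30 days: $7,164.00
```

Lowering `hours_per_day` shows how much an endpoint that only runs during business hours would save under the same assumptions.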
How Together AI Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Together AI | $0.10 per million tokens | $9.95 per hour |
| Groq | Free | $0.79 per million tokens |
| Fireworks AI | Free | $9 per million tokens / hour |
Together AI Pricing FAQ
01 How much does Together AI cost?
Together AI offers serverless inference starting at $0.10 per million tokens for small models. Mid-range models cost $0.50–1.00/M tokens, and large models like DeepSeek-R1 cost $3.00/M tokens. Dedicated GPU deployments start at $3.99/hr (1x H100) or $9.95/hr (1x B200). Batch processing saves 40–50%.
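As a rough illustration of how these per-token rates turn into a bill, the sketch below multiplies a token volume by a per-million price and applies an optional batch discount. It assumes a single blended price for input and output tokens, which the figures above do not break down. The 250M-token monthly volume is a hypothetical workload, not a quoted figure.

```python
def serverless_cost(tokens: int, price_per_million: float, batch_discount: float = 0.0) -> float:
    """Estimate serverless spend in USD.

    Assumption: one blended per-token price; batch_discount is a fraction
    (0.40 to 0.50 per the batch savings quoted above).
    """
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in [0, 1)")
    return tokens / 1_000_000 * price_per_million * (1.0 - batch_discount)


monthly_tokens = 250_000_000  # hypothetical workload

print(f"Small model:        ${serverless_cost(monthly_tokens, 0.10):,.2f}")
print(f"Large model:        ${serverless_cost(monthly_tokens, 3.00):,.2f}")
print(f"Large model, batch: ${serverless_cost(monthly_tokens, 3.00, batch_discount=0.50):,.2f}")
```

With these inputs the large-model bill is 30 times the small-model bill, so model choice matters far more to total cost than the batch discount does.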
02 Does Together AI have a free tier?
Together AI does not advertise a permanent free tier or free credits on their pricing page. They offer pay-as-you-go Serverless pricing with no minimum commitment, so you only pay for what you use.
03 What models does Together AI support?
Together AI supports a wide range of open-source models including Llama, DeepSeek, Qwen, Mistral, and Kimi. They also offer image generation (FLUX, Stable Diffusion), video (Google Veo 2.0), audio transcription, text-to-speech, and embedding models.
04 Together AI vs Fireworks AI: which is cheaper?
Both offer similar serverless per-token pricing starting around $0.10/M tokens for small models. Fireworks AI gives new users $1 in free credits. For dedicated GPU hosting, Together AI's H100 is $3.99/hr versus Fireworks AI's A100 at $2.90/hr. Fireworks has the lower hourly rate, but the H100 and A100 are different GPU generations, so this is not a like-for-like comparison.
05 What is Together AI's Dedicated GPU pricing?
Together AI's Dedicated GPU hosting starts at $3.99/hr for a 1x H100 (single-tenant) and $9.95/hr for a 1x B200 (latest generation). Dedicated deployments are best for consistent high-volume inference where you need guaranteed resources and custom model hosting.
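One way to judge what "consistent high-volume" means in practice is to work out the hourly token volume at which a dedicated GPU costs the same as serverless. The sketch below is a simplification. It assumes a single blended serverless price and that one GPU can actually sustain the break-even throughput for your model, and neither is established by the pricing figures here.

```python
def break_even_tokens_per_hour(gpu_hourly_rate: float, serverless_price_per_million: float) -> float:
    """Tokens per hour at which dedicated and serverless cost the same.

    Assumption: the dedicated GPU can serve this volume; real throughput
    depends on the model and is not covered by the pricing data.
    """
    return gpu_hourly_rate / serverless_price_per_million * 1_000_000


# 1x H100 at $3.99/hr versus a mid-range model at $0.50/M tokens
tokens = break_even_tokens_per_hour(3.99, 0.50)
print(f"Break-even: {tokens / 1_000_000:.2f}M tokens per hour")
# Break-even: 7.98M tokens per hour
```

Below that sustained volume, serverless is the cheaper option under these assumptions. Above it, the dedicated deployment starts to pay for itself.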