Together AI Pricing 2026: $0.10 per million tokens to $9.95 per hour

Quick Answer

Last verified: April 1, 2026

Medium confidence

Together AI costs from $0.10 per million tokens (serverless inference) to $9.95 per hour (dedicated 1x B200) as of April 2026, with 4 plans available. Pricing depends on your chosen tier, contract length, and negotiated discounts.

Free tier: No free tier available

Together AI offers 4 pricing tiers: Serverless, Dedicated (1x H100), Dedicated (1x B200), and Enterprise. The Dedicated (1x H100) plan is designed for consistent high-volume inference.

Compared to other LLM API providers, Together AI is positioned at the budget-friendly price point.

How much does Together AI cost?

Together AI pricing starts at $0.10 per million tokens for serverless inference, and dedicated GPUs are billed by the hour at up to $9.95/hour. Enterprise pricing is available on request. The 4 plans are Serverless, Dedicated (1x H100), Dedicated (1x B200), and Enterprise, each of which is also listed with custom pricing through sales.

Together AI Pricing Overview

Together AI has 4 pricing plans, with published rates ranging from $0.10 per million tokens to $9.95 per hour. Every plan is listed as requiring a custom quote from sales. Serverless is designed for variable-volume API usage, Dedicated (1x H100) for consistent high-volume inference, Dedicated (1x B200) for high-performance dedicated inference, and Enterprise for large-scale enterprise deployments.

This pricing was last verified on April 1, 2026.

Together AI provides serverless LLM inference and dedicated GPU hosting, with serverless pricing starting at $0.10 per million tokens for small models and scaling to $3.00+ per million tokens for large models. Dedicated GPU deployments are available starting at $3.99/hour for a 1x H100 and $9.95/hour for a 1x B200. Batch API processing offers 40–50% discounts over standard serverless rates.
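As a rough illustration of the serverless rates and batch discount quoted above, the following sketch estimates a monthly bill. The function name, the 250M-token workload, and the choice of a $0.50/M model are assumptions made for the example; actual per-model rates may differ.

```python
def serverless_cost(tokens: int, price_per_million: float,
                    batch_discount: float = 0.0) -> float:
    """Estimate USD cost for a token volume at a per-million-token rate.

    batch_discount is a fraction between 0.0 and 1.0 (0.40 means 40% off).
    """
    if not 0.0 <= batch_discount <= 1.0:
        raise ValueError("batch_discount must be between 0.0 and 1.0")
    return tokens / 1_000_000 * price_per_million * (1.0 - batch_discount)


# Hypothetical workload: 250M tokens per month on a $0.50/M model.
tokens = 250_000_000
standard = serverless_cost(tokens, 0.50)
batch_40 = serverless_cost(tokens, 0.50, batch_discount=0.40)
batch_50 = serverless_cost(tokens, 0.50, batch_discount=0.50)

print(f"standard:       ${standard:.2f}")   # $125.00
print(f"batch (40% off): ${batch_40:.2f}")  # $75.00
print(f"batch (50% off): ${batch_50:.2f}")  # $62.50
```

The two batch figures bracket the 40–50% discount range, so the real batch cost for this workload would be expected to fall between them.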

All Together AI Plans & Pricing

Plan	Monthly	Annual	Best For
Serverless	Contact Sales	Contact Sales	Variable-volume API usage
Dedicated (1x H100)	Contact Sales	Contact Sales	Consistent high-volume inference
Dedicated (1x B200)	Contact Sales	Contact Sales	High-performance dedicated inference
Enterprise	Contact Sales	Contact Sales	Large-scale enterprise deployments

View all features by plan

Serverless

Pay-as-you-go per-token pricing
Budget models from $0.10/M tokens
Mid-range models from $0.50/M tokens
Large models from $3.00/M tokens
Batch API with 40-50% discount
Image generation from $0.0006/image
Embeddings at $0.02/M tokens
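The image and embedding rates in the list above can be turned into a quick estimate. This is a minimal sketch: the helper names and workload sizes are hypothetical, and because image pricing is quoted as "from $0.0006/image", the image figure is a lower bound rather than an exact price.

```python
IMAGE_PRICE_USD = 0.0006        # listed starting price per image
EMBEDDING_PRICE_PER_M = 0.02    # listed price per million embedding tokens


def image_cost(n_images: int) -> float:
    """Lower-bound USD cost for generating n_images at the starting rate."""
    return n_images * IMAGE_PRICE_USD


def embedding_cost(tokens: int) -> float:
    """USD cost for embedding the given number of tokens."""
    return tokens / 1_000_000 * EMBEDDING_PRICE_PER_M


# Hypothetical workload: 10,000 images and 500M embedding tokens.
print(f"images:     at least ${image_cost(10_000):.2f}")   # $6.00
print(f"embeddings: ${embedding_cost(500_000_000):.2f}")   # $10.00
```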

Dedicated (1x H100)

Single-tenant GPU deployment
1x H100 at $3.99/hr
Custom model hosting
Dedicated resources

Dedicated (1x B200)

Single-tenant GPU deployment
1x B200 at $9.95/hr
Latest generation hardware
Dedicated resources
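Because the dedicated plans are billed hourly, an always-on deployment adds up quickly. The sketch below converts the two listed hourly rates into a monthly figure, assuming a 30-day month (720 hours) and one GPU running continuously; both assumptions are for illustration only.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, GPU never paused

HOURLY_RATES_USD = {
    "Dedicated (1x H100)": 3.99,
    "Dedicated (1x B200)": 9.95,
}


def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """USD cost of running one dedicated GPU for the given number of hours."""
    return hourly_rate * hours


for plan, rate in HOURLY_RATES_USD.items():
    print(f"{plan}: ${monthly_cost(rate):,.2f} per month")
# Dedicated (1x H100): $2,872.80 per month
# Dedicated (1x B200): $7,164.00 per month
```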

Enterprise

Volume discounts
Dedicated support
Custom SLAs
Private deployments

How Together AI Pricing Compares

Software	Starting Price	Top Price
Together AI	$0.10 per million tokens	$9.95 per hour
Groq	Free	$0.79 per million tokens
Fireworks AI	Free	$9 per million tokens / hour

Together AI Pricing FAQ

01 How much does Together AI cost?

Together AI offers serverless inference starting at $0.10 per million tokens for small models. Mid-range models cost $0.50–$1.00/M tokens, and large models like DeepSeek-R1 cost $3.00/M tokens. Dedicated GPU deployments start at $3.99/hr (1x H100) or $9.95/hr (1x B200). Batch processing saves 40–50% over standard serverless rates.

02 Does Together AI have a free tier?

Together AI does not advertise a permanent free tier or free credits on its pricing page. It offers pay-as-you-go Serverless pricing with no minimum commitment, so you only pay for what you use.

03 What models does Together AI support?

Together AI supports a wide range of open-source models including Llama, DeepSeek, Qwen, Mistral, and Kimi. They also offer image generation (FLUX, Stable Diffusion), video (Google Veo 2.0), audio transcription, text-to-speech, and embedding models.

04 Together AI vs Fireworks AI: which is cheaper?

Both offer similar serverless per-token pricing starting around $0.10/M tokens for small models. Fireworks AI gives new users $1 in free credits. For dedicated GPU hosting, Together AI's H100 is $3.99/hr versus Fireworks AI's A100 at $2.90/hr. Fireworks has the lower hourly rate, but the A100 is an older GPU generation than the H100, so this is not a like-for-like comparison.

05 What is Together AI's Dedicated GPU pricing?

Together AI's Dedicated GPU hosting starts at $3.99/hr for a 1x H100 (single-tenant) and $9.95/hr for a 1x B200 (latest generation). Dedicated deployments are best for consistent high-volume inference where you need guaranteed resources and custom model hosting.
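One way to decide between serverless and dedicated is to find the hourly token volume at which the two cost the same. The sketch below computes that break-even point from the listed rates. It assumes the dedicated GPU can actually sustain the resulting throughput for your model, which this page does not state, so treat the result as a cost threshold rather than a capacity figure.

```python
def break_even_tokens_per_hour(hourly_rate: float,
                               price_per_million: float) -> float:
    """Tokens per hour at which serverless spend equals the dedicated rate."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    return hourly_rate / price_per_million * 1_000_000


# Listed 1x H100 rate compared against two listed serverless price points.
for label, price in [("mid-range ($0.50/M)", 0.50), ("large ($3.00/M)", 3.00)]:
    tokens = break_even_tokens_per_hour(3.99, price)
    print(f"{label}: about {tokens / 1_000_000:.2f}M tokens per hour")
# mid-range ($0.50/M): about 7.98M tokens per hour
# large ($3.00/M): about 1.33M tokens per hour
```

Below the break-even volume, pay-as-you-go serverless is the cheaper option on these numbers; above it, the flat hourly rate wins, provided the hardware keeps up.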
