Quick Answer

Together AI costs from $0.10 per million tokens (serverless) to $9.95 per hour (dedicated 1x B200 GPU) as of April 2026, with 4 plans available. Pricing depends on your chosen tier, contract length, and negotiated discounts.

  • Free tier: No free tier available

Together AI offers 4 pricing tiers: Serverless, Dedicated (1x H100), Dedicated (1x B200), and Enterprise. The Dedicated (1x H100) plan is designed for consistent high-volume inference.

Compared to other LLM API providers, Together AI is positioned at the budget-friendly end of the price range.

How much does Together AI cost?

Together AI pricing starts at $0.10 per million tokens across 4 plans, with enterprise pricing available on request. The plans are Serverless, Dedicated (1x H100), Dedicated (1x B200), and Enterprise; each is listed with custom pricing.

Together AI Pricing Overview

Together AI has 4 pricing plans, with listed rates ranging from $0.10 per million tokens (serverless) to $9.95 per hour (dedicated 1x B200). Every plan is listed as requiring a custom quote from sales. Serverless is designed for variable-volume API usage, Dedicated (1x H100) for consistent high-volume inference, Dedicated (1x B200) for high-performance dedicated inference, and Enterprise for large-scale enterprise deployments.

This pricing was last verified on April 1, 2026.

Together AI provides serverless LLM inference and dedicated GPU hosting, with serverless pricing starting at $0.10 per million tokens for small models and scaling to $3.00+ per million tokens for large models. Dedicated GPU deployments are available starting at $3.99/hour for a 1x H100 and $9.95/hour for a 1x B200. Batch API processing offers 40–50% discounts over standard serverless rates.
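The per-token arithmetic above can be sketched in a few lines of Python. This is only an illustration: the tier labels, the function name, and the `batch_discount` parameter are assumptions made for the example, and the rates are the "starting at" figures quoted on this page, not a live price list.

```python
# Starting rates quoted on this page, in USD per million tokens.
# The tier labels ("small", "large") are illustrative, not official plan names.
SERVERLESS_RATE_PER_M = {"small": 0.10, "large": 3.00}


def serverless_cost(tokens: int, tier: str, batch_discount: float = 0.0) -> float:
    """Estimate the USD cost of processing `tokens` tokens at a tier's starting rate.

    `batch_discount` is a fraction (0.40 for a 40% discount); the page quotes
    40-50% for the Batch API, so the exact value is an assumption per workload.
    """
    if not 0.0 <= batch_discount < 1.0:
        raise ValueError("batch_discount must be in the range [0, 1)")
    rate = SERVERLESS_RATE_PER_M[tier]
    return tokens / 1_000_000 * rate * (1.0 - batch_discount)


if __name__ == "__main__":
    # 50 million tokens on a large model, standard versus batch at a 40% discount.
    print(f"standard: ${serverless_cost(50_000_000, 'large'):.2f}")
    print(f"batch:    ${serverless_cost(50_000_000, 'large', 0.40):.2f}")
```

Actual bills depend on the specific model's rate and on whether input and output tokens are priced differently, which this sketch does not model.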

All Together AI Plans & Pricing

| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Serverless | Contact Sales | Contact Sales | Variable-volume API usage |
| Dedicated (1x H100) | Contact Sales | Contact Sales | Consistent high-volume inference |
| Dedicated (1x B200) | Contact Sales | Contact Sales | High-performance dedicated inference |
| Enterprise | Contact Sales | Contact Sales | Large-scale enterprise deployments |

Serverless

  • Pay-as-you-go per-token pricing
  • Budget models from $0.10/M tokens
  • Mid-range models from $0.50/M tokens
  • Large models from $3.00/M tokens
  • Batch API with 40-50% discount
  • Image generation from $0.0006/image
  • Embeddings at $0.02/M tokens

Dedicated (1x H100)

  • Single-tenant GPU deployment
  • 1x H100 at $3.99/hr
  • Custom model hosting
  • Dedicated resources

Dedicated (1x B200)

  • Single-tenant GPU deployment
  • 1x B200 at $9.95/hr
  • Latest generation hardware
  • Dedicated resources
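
As a rough guide, the hourly dedicated rates can be converted into a monthly figure. The sketch below assumes the GPU is billed for every hour it runs, with no minimum commitment or volume discount; those billing details are assumptions, not something stated on this page, and the names are illustrative.

```python
# Hourly rates quoted on this page, in USD.
HOURLY_RATE_USD = {"1x H100": 3.99, "1x B200": 9.95}

HOURS_IN_30_DAYS = 24 * 30  # 720 hours of always-on usage


def dedicated_cost(gpu: str, hours: float) -> float:
    """Estimated USD cost of running one dedicated GPU for `hours` hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return HOURLY_RATE_USD[gpu] * hours


if __name__ == "__main__":
    for gpu in HOURLY_RATE_USD:
        monthly = dedicated_cost(gpu, HOURS_IN_30_DAYS)
        print(f"{gpu}: ${monthly:,.2f} per 30 days, always on")
```

Running the deployment only during business hours would reduce `hours` accordingly, provided billing stops when the deployment is shut down.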

Enterprise

  • Volume discounts
  • Dedicated support
  • Custom SLAs
  • Private deployments

How Together AI Pricing Compares

| Software | Starting Price | Top Price |
|---|---|---|
| Together AI | $0.10 per million tokens | $9.95 per hour |
| Groq | Free | $0.79 per million tokens |
| Fireworks AI | Free | $9 per million tokens / hour |


Together AI Pricing FAQ

01 How much does Together AI cost?

Together AI offers serverless inference starting at $0.10 per million tokens for small models. Mid-range models cost $0.50–1.00/M tokens, and large models like DeepSeek-R1 cost $3.00/M tokens. Dedicated GPU deployments start at $3.99/hr (1x H100) or $9.95/hr (1x B200). Batch processing saves 40–50%.

02 Does Together AI have a free tier?

Together AI does not advertise a permanent free tier or free credits on their pricing page. They offer pay-as-you-go Serverless pricing with no minimum commitment, so you only pay for what you use.

03 What models does Together AI support?

Together AI supports a wide range of open-source models including Llama, DeepSeek, Qwen, Mistral, and Kimi. They also offer image generation (FLUX, Stable Diffusion), video (Google Veo 2.0), audio transcription, text-to-speech, and embedding models.

04 Together AI vs Fireworks AI: which is cheaper?

Both offer similar serverless per-token pricing starting around $0.10/M tokens for small models. Fireworks AI gives new users $1 in free credits. For dedicated GPU hosting, Together AI's H100 is $3.99/hr versus Fireworks AI's A100 at $2.90/hr. Fireworks has the lower hourly rate, but the H100 is a newer GPU generation than the A100, so this is not a like-for-like comparison.

05 What is Together AI's Dedicated GPU pricing?

Together AI's Dedicated GPU hosting starts at $3.99/hr for a 1x H100 (single-tenant) and $9.95/hr for a 1x B200 (latest generation). Dedicated deployments are best for consistent high-volume inference where you need guaranteed resources and custom model hosting.
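
One way to judge whether a workload is "high-volume" enough for dedicated hosting is to compute the break-even throughput: the number of tokens per hour at which the serverless bill equals the hourly GPU rate. The sketch below uses rates quoted on this page. It assumes the dedicated GPU can actually sustain that throughput for your model, which this page does not state, and the function name is illustrative.

```python
def break_even_tokens_per_hour(gpu_hourly_usd: float, serverless_usd_per_m: float) -> float:
    """Tokens per hour at which serverless spend equals the dedicated hourly rate.

    Above this volume, dedicated hosting is cheaper per token, provided the GPU
    can serve that many tokens per hour (an assumption to verify by benchmarking).
    """
    if gpu_hourly_usd <= 0 or serverless_usd_per_m <= 0:
        raise ValueError("rates must be positive")
    return gpu_hourly_usd / serverless_usd_per_m * 1_000_000


if __name__ == "__main__":
    h100_hourly = 3.99  # USD per hour, 1x H100, as quoted on this page
    for label, rate in [("mid-range ($0.50/M)", 0.50), ("large ($3.00/M)", 3.00)]:
        tokens = break_even_tokens_per_hour(h100_hourly, rate)
        print(f"{label}: break-even at about {tokens:,.0f} tokens/hour")
```

The cheaper the serverless rate for a model, the higher the sustained volume needed before a dedicated GPU pays for itself.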
