Anyscale Pricing 2026
Complete pricing guide with plans, hidden costs, and cost analysis
Anyscale pricing ranges from $0.15 to $5/per million tokens.
Anyscale costs $0.15 to $5 per per million tokens as of April 2026, with 2 plans available. Pricing depends on your chosen tier, contract length, and negotiated discounts.
Use the interactive pricing calculator to estimate your exact cost based on team size and requirements.
- Free tier: No free tier available
Anyscale offers 2 pricing tiers: Anyscale Endpoints, Managed Ray Clusters. The Managed Ray Clusters plan is ml teams with complex distributed training and custom ray workloads.
Compared to other llm api providers software, Anyscale is positioned at the budget-friendly price point.
- 2 documented hidden costs beyond list price
How much does Anyscale cost?
Anyscale Pricing Overview
Anyscale has 2 pricing plans ranging from $0.15 to $5/per million tokens. The Anyscale Endpoints plan requires contacting sales for a custom quote and is designed for teams needing scalable open-source llm inference with ray reliability. The Managed Ray Clusters plan requires contacting sales for a custom quote and is designed for ml teams with complex distributed training and custom ray workloads.
There are at least 2 documented hidden costs beyond Anyscale's list price, including implementation, training, and add-on fees.
This pricing was last verified in April 15, 2026 from 1 independent sources.
Anyscale is the commercial platform built on Ray, the distributed computing framework. It offers Anyscale Endpoints — a serverless LLM inference API with per-token pricing for open-source models — and managed Ray clusters for training and fine-tuning. Anyscale Endpoints are OpenAI-compatible, supporting Llama, Mistral, Mixtral, and other popular models. The platform is designed for teams that need both production-ready LLM APIs and the ability to run custom distributed workloads on Ray.
How Anyscale Pricing Compares
Compare Anyscale pricing against top alternatives in LLM API Providers.
All Anyscale Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Anyscale Endpoints freeCredit: $10 on sign-up | Custom | Custom | Teams needing scalable open-source LLM inference with Ray reliability |
| Managed Ray Clusters | Contact Sales | Contact Sales | ML teams with complex distributed training and custom Ray workloads |
View all features by plan
Anyscale Endpoints
- OpenAI-compatible API
- Llama 3.x models (8B, 70B, 405B)
- Mistral and Mixtral models
- Gemma models
- No setup fees
- $10 free credits on sign-up
- Pay-as-you-go per token
Managed Ray Clusters
- Managed Ray clusters on AWS, GCP, Azure
- Distributed training and fine-tuning
- Ray Tune for hyperparameter search
- Ray Serve for model serving
- Enterprise SLAs
- Custom GPU configurations
Usage-Based Rates
Per-unit pricing for Anyscale API usage.
Anyscale Endpoints
| Model | Unit | Rate |
|---|---|---|
| Llama 3.1 8B Instruct | 1M input tokens | $0.150 |
| Llama 3.1 8B Instruct | 1M output tokens | $0.150 |
| Llama 3.1 70B Instruct | 1M input tokens | $1.00 |
| Llama 3.1 70B Instruct | 1M output tokens | $1.00 |
| Llama 3.1 405B Instruct | 1M input tokens | $5.00 |
| Llama 3.1 405B Instruct | 1M output tokens | $5.00 |
| Mixtral 8x7B Instruct | 1M input tokens | $0.500 |
| Mixtral 8x7B Instruct | 1M output tokens | $0.500 |
| Mistral 7B Instruct | 1M input tokens | $0.150 |
| Mistral 7B Instruct | 1M output tokens | $0.150 |
| Gemma 7B Instruct | 1M input tokens | $0.150 |
| Gemma 7B Instruct | 1M output tokens | $0.150 |
- $10 free credits on sign-up
- Anyscale Endpoints may have been rebranded or updated — verify at anyscale.com/endpoints
- Rates reflect 2024-era published pricing; confirm current rates before relying on these figures
- Note: Anyscale pivoted more toward enterprise in 2025; public endpoint pricing may vary
Compare Anyscale vs Alternatives
Before committing to Anyscale, compare pricing with these 3 alternatives in the same category.
China-market apps and Chinese-first workloads
Full comparisonTesting Cerebras's unique speed advantage
Full comparisonLong-context (1M tokens) and Chinese-language apps
Full comparisonHow Anyscale Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Anyscale | $0.15/per million tokens | $5/per million tokens |
| Amazon Bedrock | $0.07/per million tokens | $75/per million tokens |
| Baidu ERNIE API | $0.1/per million tokens | $10/per million tokens |
| Cerebras Inference API | $0.1/per million tokens | $6/per million tokens |
| Claude API | $0.03/per million tokens | $75/per million tokens |
| Cloudflare Workers AI | $0.05/per million tokens | $5/per million tokens |
Detailed pricing comparisons:
Anyscale Pricing FAQ
01 How much do Anyscale Endpoints cost?
Anyscale Endpoints charge per token. Small models like Llama 3.1 8B and Mistral 7B cost $0.15 per million tokens (input and output same rate). Llama 3.1 70B costs $1.00/M tokens. Llama 3.1 405B costs $5.00/M tokens. New accounts get $10 in free credits.
02 What is Anyscale built on?
Anyscale is built on Ray, an open-source distributed computing framework developed at UC Berkeley. Anyscale provides the managed, enterprise version of Ray with production-grade SLAs, managed clusters, and hosted inference endpoints.
03 Does Anyscale have a free tier?
Anyscale gives new accounts $10 in free credits for Endpoints usage. There is no permanently free tier — after credits are used, standard per-token rates apply.
04 Anyscale vs Together AI: which should I use?
Together AI and Anyscale both offer OpenAI-compatible inference for open-source models. Together AI has broader model selection and slightly more competitive pricing for commodity models. Anyscale is better if you're already using Ray for training or need the Ray ecosystem integration.
Is this pricing incorrect? — we'll verify and update it.