Quick Answer
Last verified:
Medium confidence

Anyscale costs $0.15 to $5 per per million tokens as of April 2026, with 2 plans available. Pricing depends on your chosen tier, contract length, and negotiated discounts.

Use the interactive pricing calculator to estimate your exact cost based on team size and requirements.

  • Free tier: No free tier available

Anyscale offers 2 pricing tiers: Anyscale Endpoints, Managed Ray Clusters. The Managed Ray Clusters plan is ml teams with complex distributed training and custom ray workloads.

Compared to other llm api providers software, Anyscale is positioned at the budget-friendly price point.

  • 2 documented hidden costs beyond list price

How much does Anyscale cost?

Anyscale pricing starts at $0.15/per million tokens across 2 plans, with enterprise pricing available on request. Plans include Anyscale Endpoints (custom pricing), Managed Ray Clusters (custom pricing).

Anyscale Pricing Overview

Anyscale has 2 pricing plans ranging from $0.15 to $5/per million tokens. The Anyscale Endpoints plan requires contacting sales for a custom quote and is designed for teams needing scalable open-source llm inference with ray reliability. The Managed Ray Clusters plan requires contacting sales for a custom quote and is designed for ml teams with complex distributed training and custom ray workloads.

There are at least 2 documented hidden costs beyond Anyscale's list price, including implementation, training, and add-on fees.

This pricing was last verified in April 15, 2026 from 1 independent sources.

Anyscale is the commercial platform built on Ray, the distributed computing framework. It offers Anyscale Endpoints — a serverless LLM inference API with per-token pricing for open-source models — and managed Ray clusters for training and fine-tuning. Anyscale Endpoints are OpenAI-compatible, supporting Llama, Mistral, Mixtral, and other popular models. The platform is designed for teams that need both production-ready LLM APIs and the ability to run custom distributed workloads on Ray.

How Anyscale Pricing Compares

Compare Anyscale pricing against top alternatives in LLM API Providers.

All Anyscale Plans & Pricing

Plan Monthly Annual Best For
Anyscale Endpoints freeCredit: $10 on sign-up Custom Custom Teams needing scalable open-source LLM inference with Ray reliability
Managed Ray Clusters Contact Sales Contact Sales ML teams with complex distributed training and custom Ray workloads
View all features by plan

Anyscale Endpoints

  • OpenAI-compatible API
  • Llama 3.x models (8B, 70B, 405B)
  • Mistral and Mixtral models
  • Gemma models
  • No setup fees
  • $10 free credits on sign-up
  • Pay-as-you-go per token

Managed Ray Clusters

  • Managed Ray clusters on AWS, GCP, Azure
  • Distributed training and fine-tuning
  • Ray Tune for hyperparameter search
  • Ray Serve for model serving
  • Enterprise SLAs
  • Custom GPU configurations

Usage-Based Rates

Per-unit pricing for Anyscale API usage.

Anyscale Endpoints

Model Unit Rate
Llama 3.1 8B Instruct 1M input tokens $0.150
Llama 3.1 8B Instruct 1M output tokens $0.150
Llama 3.1 70B Instruct 1M input tokens $1.00
Llama 3.1 70B Instruct 1M output tokens $1.00
Llama 3.1 405B Instruct 1M input tokens $5.00
Llama 3.1 405B Instruct 1M output tokens $5.00
Mixtral 8x7B Instruct 1M input tokens $0.500
Mixtral 8x7B Instruct 1M output tokens $0.500
Mistral 7B Instruct 1M input tokens $0.150
Mistral 7B Instruct 1M output tokens $0.150
Gemma 7B Instruct 1M input tokens $0.150
Gemma 7B Instruct 1M output tokens $0.150
  • $10 free credits on sign-up
  • Anyscale Endpoints may have been rebranded or updated — verify at anyscale.com/endpoints
  • Rates reflect 2024-era published pricing; confirm current rates before relying on these figures
  • Note: Anyscale pivoted more toward enterprise in 2025; public endpoint pricing may vary

Compare Anyscale vs Alternatives

Before committing to Anyscale, compare pricing with these 3 alternatives in the same category.

All Anyscale alternatives & migration guides

How Anyscale Pricing Compares

Software Starting Price Top Price
Anyscale $0.15/per million tokens $5/per million tokens
Amazon Bedrock $0.07/per million tokens $75/per million tokens
Baidu ERNIE API $0.1/per million tokens $10/per million tokens
Cerebras Inference API $0.1/per million tokens $6/per million tokens
Claude API $0.03/per million tokens $75/per million tokens
Cloudflare Workers AI $0.05/per million tokens $5/per million tokens

2 Anyscale Hidden Costs Beyond the List Price

Beyond the listed price, Anyscale has at least 2 documented hidden costs that can significantly increase total cost of ownership.

Watch for 2 hidden costs
  • Ray cluster compute costs are in addition to Anyscale platform fees
  • Egress and storage costs apply for cluster-based workloads
Tip

Ask your Anyscale sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Full hidden costs breakdown →

Anyscale Pricing FAQ

01 How much do Anyscale Endpoints cost?

Anyscale Endpoints charge per token. Small models like Llama 3.1 8B and Mistral 7B cost $0.15 per million tokens (input and output same rate). Llama 3.1 70B costs $1.00/M tokens. Llama 3.1 405B costs $5.00/M tokens. New accounts get $10 in free credits.

02 What is Anyscale built on?

Anyscale is built on Ray, an open-source distributed computing framework developed at UC Berkeley. Anyscale provides the managed, enterprise version of Ray with production-grade SLAs, managed clusters, and hosted inference endpoints.

03 Does Anyscale have a free tier?

Anyscale gives new accounts $10 in free credits for Endpoints usage. There is no permanently free tier — after credits are used, standard per-token rates apply.

04 Anyscale vs Together AI: which should I use?

Together AI and Anyscale both offer OpenAI-compatible inference for open-source models. Together AI has broader model selection and slightly more competitive pricing for commodity models. Anyscale is better if you're already using Ray for training or need the Ray ecosystem integration.

Is this pricing incorrect? — we'll verify and update it.