Baseten Pricing 2026: GPU Inference from $0.63/hour


All Baseten Plans & Pricing

Plan	Monthly	Annual	Billing	Deployment options	Best for
Basic	Free	Free	Pay as you go	Baseten	Teams deploying custom, fine-tuned, and open-source models on pay-as-you-go infrastructure
Pro	Contact Sales	Contact Sales	Volume discounts available	Baseten	Teams needing unlimited autoscaling, priority compute access, and higher API rate limits
Enterprise	Contact Sales	Contact Sales	Volume discounts available	Baseten, Your VPC, Hybrid	Enterprises requiring full control in Baseten cloud, their own VPC, or hybrid deployments

Verified pricing · last checked July 2026 · 3 sources

All features by plan

Basic

Dedicated deployments
Model APIs
Training
Fast cold starts
SOC 2 Type II and HIPAA compliant
Email and in-app chat support
Pay-as-you-go compute

Pro

Everything in Basic
Priority access to high-demand GPUs
Dedicated compute
Higher Model API rate limits
Hands-on engineering expertise
Dedicated support on Slack and Zoom
Volume discounts available

Enterprise

Everything in Pro
Custom SLAs
Self-host deployments
On-demand flex compute
Use existing cloud commitments
Full control over data residency
Advanced security and compliance
Custom global regions
Advanced RBAC with Teams
Volume discounts available


Quick Answer

Last verified: July 27, 2026

High confidence

As of July 2026, Baseten offers 3 plans. Basic has no monthly fee and bills compute pay as you go; Pro and Enterprise are custom-priced, so contact Baseten directly for a personalized quote. The final price depends on your chosen tier, contract length, and negotiated discounts.


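Free tier: Yes (the Basic plan has no monthly fee and new accounts receive starter credits; usage beyond the credits is billed pay as you go)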

Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is for teams needing unlimited autoscaling, priority compute access, and higher API rate limits.

Compared to other AI model hosting and inference software, Baseten sits at the budget-friendly end of the price range.

1 documented hidden cost beyond list price

How much does Baseten cost?

Baseten uses custom pricing across 3 plans. Contact Baseten directly for a personalized quote. Plans include Basic (free), Pro (custom pricing), and Enterprise (custom pricing).

Baseten Pricing Overview

Baseten uses custom pricing for its paid tiers; contact the sales team for a quote. The Basic plan is free and is best for teams deploying custom, fine-tuned, and open-source models on pay-as-you-go infrastructure. The Pro plan requires a custom quote from sales and is designed for teams needing unlimited autoscaling, priority compute access, and higher API rate limits. The Enterprise plan also requires a custom quote and is designed for enterprises requiring full control in Baseten cloud, their own VPC, or hybrid deployments.

There is at least 1 documented hidden cost beyond Baseten's list price: GPU infrastructure for large-scale model deployments.

This pricing was last verified on July 27, 2026 from 3 independent sources.


Baseten is a model inference platform offering a free Basic plan with starter credits, plus custom-priced Pro and Enterprise tiers for production workloads. Token-based API pricing varies by model — across 6 models tracked by Artificial Analysis, median rates are $0.60 per million input tokens and $2.20 per million output tokens as of April 2026. Large frontier model deployments requiring dedicated GPU infrastructure are Enterprise-only and require a custom sales quote.

How Baseten Pricing Compares

Compare Baseten pricing against top alternatives in AI Model Hosting & Inference.

Software	Price range
BentoML	$0-$5000/month
Cerebrium	$0-$100/month
Banana.dev	Custom pricing

Usage-Based Rates

Per-unit pricing for Baseten API usage.

Basic

Model	Input	Output	Cached	Per
glm-5-2-fast	$2.10	$6.60	$0.210	1M tokens
inkling	$1.00	$4.05	$0.170	1M tokens
glm-5-2	$1.40	$4.40	$0.140	1M tokens
glm-4-7	$0.600	$2.20	$0.120	1M tokens
kimi-k2-7-code	$0.950	$4.00	$0.160	1M tokens
kimi-k2-6	$0.950	$4.00	$0.160	1M tokens
nvidia-nemotron-3-ultra	$0.600	$2.40	$0.120	1M tokens
deepseek-v4	$1.74	$3.48	$0.145	1M tokens
gpt-oss-120b	$0.100	$0.500	—	1M tokens
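
To make the per-token rates concrete, here is a minimal Python sketch that estimates the cost of a workload from the table above. The `token_cost` helper and the `RATES` dictionary are illustrative names, not part of any Baseten SDK, and the treatment of cached tokens (a subset of input tokens billed at the cached rate) is an assumption rather than a documented billing rule.

```python
# Model API list rates in USD per 1M tokens, copied from the table above.
# A cached rate of None means the table lists no cached price for that model.
RATES = {
    "glm-4-7": {"input": 0.60, "output": 2.20, "cached": 0.12},
    "deepseek-v4": {"input": 1.74, "output": 3.48, "cached": 0.145},
    "gpt-oss-120b": {"input": 0.10, "output": 0.50, "cached": None},
}


def token_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of a workload at list rates.

    Assumption: cached_tokens is the part of input_tokens billed at the
    cached rate. If no cached rate is listed, those tokens fall back to
    the regular input rate.
    """
    rate = RATES[model]
    cached_rate = rate["cached"] if rate["cached"] is not None else rate["input"]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * rate["input"]
        + cached_tokens * cached_rate
        + output_tokens * rate["output"]
    )
    return total / 1_000_000


# 50M input tokens (10M of them cached) and 10M output tokens on glm-4-7
print(f"${token_cost('glm-4-7', 50_000_000, 10_000_000, 10_000_000):.2f}")
```

Under these assumptions the glm-4-7 workload comes to roughly $47, split almost evenly between input and output tokens, which shows how quickly output-heavy workloads dominate the bill.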

Model / SKU	Unit	Price
t4-16gb	minute	$0.011
l4-24gb	minute	$0.014
a10g-24gb	minute	$0.020
a100-80gb	minute	$0.067
h100-mig-40gb	minute	$0.063
h100-80gb	minute	$0.108
b200-180gb	minute	$0.166
cpu-1x2	minute	$0.00058
cpu-1x4	minute	$0.00086
cpu-2x8	minute	$0.00173
cpu-4x16	minute	$0.00346
cpu-8x32	minute	$0.00691
cpu-16x64	minute	$0.014

Dedicated deployments and training are priced per minute.
Volume discounts are available.
Talk to sales for compute in other countries and regions.
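
Because dedicated compute is priced per minute, it can help to convert the rates above into hourly and monthly figures. The following Python sketch does that for a few GPU SKUs; the function names are illustrative. Note that the per-minute table appears to be rounded to three decimals, so the derived hourly numbers will not exactly match the hourly rates quoted in the FAQ (for example, $0.011 × 60 gives $0.66 for a T4, against the quoted $0.63/hour). Treat the results as approximations.

```python
# Per-minute list rates in USD, copied from the table above (rounded as shown there).
PER_MINUTE = {
    "t4-16gb": 0.011,
    "a10g-24gb": 0.020,
    "a100-80gb": 0.067,
    "h100-80gb": 0.108,
    "b200-180gb": 0.166,
}


def hourly(sku):
    """Hourly cost implied by the rounded per-minute rate."""
    return PER_MINUTE[sku] * 60


def always_on_month(sku, replicas=1, days=30):
    """Cost of keeping `replicas` instances running continuously for `days` days."""
    return PER_MINUTE[sku] * 60 * 24 * days * replicas


for sku in PER_MINUTE:
    print(f"{sku}: ${hourly(sku):.2f}/hour, ${always_on_month(sku):,.0f} per 30-day month")
```

A single always-on H100 works out to a little under $4,700 per 30-day month at the rounded list rate, before any volume discount.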

Compare Baseten vs Alternatives

Before committing to Baseten, compare pricing with these 3 alternatives in the same category.

Baseten vs. BentoML
Starting price: Free
Best for: Individual developers and small teams building AI-powered APIs

Baseten vs. Cerebrium
Starting price: Free
Best for: Individual developers and hobbyists experimenting with serverless ML inference

Baseten vs. Banana.dev
Starting price: Custom
Note: Historical reference only; the service is no longer available

What Companies Actually Pay for Baseten

Median per-1M-token pricing across 6 models

Input $0.600/1M

Output $2.20/1M

Flagship models in this provider's catalog

Model	Input /1M	Output /1M	Blended /1M
baseten_glm-5	$0.950	$3.15	$1.50
baseten_glm-5-non-reasoning	$0.950	$3.15	$1.50
baseten_glm-4-7	$0.600	$2.20	$1.00
baseten_glm-4-7-non-reasoning	$0.600	$2.20	$1.00
baseten_deepseek-v3-1_fp8	$0.500	$1.50	$0.750
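
The Blended column in this table is consistent with a weighted average that counts input tokens three times as heavily as output tokens. That 3:1 weighting is inferred from the numbers, not stated in the source, so the Python sketch below should be read as a reconstruction rather than the official formula.

```python
def blended(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average of input and output prices per 1M tokens.

    The 3:1 default is inferred from the table, not a documented formula.
    """
    total_weight = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total_weight


# (input, output, listed blended) in USD per 1M tokens, from the table above
rows = {
    "baseten_glm-5": (0.95, 3.15, 1.50),
    "baseten_glm-4-7": (0.60, 2.20, 1.00),
    "baseten_deepseek-v3-1_fp8": (0.50, 1.50, 0.75),
}

for name, (inp, out, listed) in rows.items():
    print(f"{name}: computed {blended(inp, out):.2f}, listed {listed:.2f}")
```

If your own traffic has a different input-to-output ratio, pass different weights to see what the effective per-token price would be for your workload.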


Top pricing complaints

Large model pricing requires contacting sales; no transparent rates are published
No fixed monthly pricing: all compute costs are metered and variable

Source: Artificial Analysis — medians aggregated from 6 models in this provider's catalog. Per-1M-token pricing reflects list rates.

Baseten Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Enterprise Frontier Model Hosting (H200-scale)

Year 1 total: Estimated $100,000–$500,000+/year (community estimate; custom quote required)

Hosting a large frontier model such as DeepSeek R1 that requires multiple H200 GPUs via Baseten's Enterprise plan. Requires a custom sales quote; no public pricing available.

Reddit community estimate (r/Clojurescript, 2025-01-22)
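
As a rough sanity check on that range, the Python sketch below computes the yearly cost of an always-on GPU cluster at a flat hourly rate. The inputs are hypothetical: Baseten publishes no H200 rate, so the example substitutes the published H100 rate of $6.50/hour, and the count of 8 GPUs is an assumption, not a figure from the source.

```python
HOURS_PER_YEAR = 24 * 365


def annual_gpu_cost(gpu_count, hourly_rate, utilization=1.0):
    """Yearly cost of a dedicated GPU cluster at a flat hourly rate.

    utilization is the fraction of the year the GPUs are running (1.0 = always on).
    """
    return gpu_count * hourly_rate * HOURS_PER_YEAR * utilization


# Hypothetical inputs: 8 GPUs at the published H100 rate of $6.50/hour.
# No H200 rate is published, so the real figure could be quite different.
print(f"${annual_gpu_cost(8, 6.50):,.0f} per year")
```

Under these assumptions the total lands inside the community estimate, near its upper end. A negotiated Enterprise quote, a different GPU count, or lower utilization would all move the number.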

How Baseten Pricing Compares

Software	Starting Price	Top Price
Baseten	Custom	Custom
Banana.dev	Custom	Custom
BentoML	Free	$5000/month
Cerebrium	Free	$100/month
Banana.dev (rebranded)	$1200/mo + at-cost compute	$1200/mo + at-cost compute
Inference.net	Free	$250/forever


1 Baseten Hidden Cost Beyond the List Price

Beyond the listed price, Baseten has at least 1 documented hidden cost that can significantly increase total cost of ownership.

Watch for 1 hidden cost

GPU Infrastructure Costs for Large-Scale Model Deployments: $100,000-$500,000
Severity: critical · 1 source

Reddit "since it requires many H200's, I'm guessing the cost is in the multiple hundreds of thousands per year"
Reddit "the pricing on their website (https://www.baseten.co/library/deepseek-r1) just has a "call sales" button, which is never a good sign"

Tip

Ask your Baseten sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.


Intelligence sourced from 1 independent source

Reddit User discussions

Key claims include inline source attribution. Data verified against multiple independent sources. 4 source citations total.

How to Negotiate Baseten Pricing

Baseten contracts are negotiable. This 1 tactic is sourced from real buyer experiences and procurement specialists.

Negotiation Playbook: 1 tactic

Engage Sales Early for Large Model Deployments (medium success)

For frontier-scale models requiring dedicated GPU infrastructure (e.g. DeepSeek R1, large parameter models needing multiple H200s), Baseten publishes no pricing — only a 'contact sales' CTA. Reach out early in your evaluation to get a custom quote and negotiate volume commitments in exchange for cost guarantees or reserved capacity.

Reddit (r/Clojurescript, 2025-01-22)


Baseten Pricing FAQ

01 How much does Baseten cost?

Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.

02 Does Baseten have a free tier?

New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.

03 How does Baseten billing work?

Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.
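
The effect of per-minute billing with scale-to-zero can be sketched in a few lines of Python. The example uses the h100-80gb rate from the usage-based rates table; the `dedicated_cost` helper is an illustrative name, and the rule of rounding each day's active time up to whole minutes is an assumption, since the source says only that billing is per minute.

```python
import math

H100_PER_MINUTE = 0.108  # USD, h100-80gb rate from the usage-based rates table


def dedicated_cost(active_seconds_per_day, per_minute_rate, days=30):
    """Monthly cost when a deployment scales to zero outside active periods.

    Assumption: each day's active time is rounded up to whole minutes.
    """
    billed_minutes = math.ceil(active_seconds_per_day / 60)
    return billed_minutes * per_minute_rate * days


busy_8h = dedicated_cost(8 * 3600, H100_PER_MINUTE)
always_on = dedicated_cost(24 * 3600, H100_PER_MINUTE)
print(f"8 h/day: ${busy_8h:,.2f}   always on: ${always_on:,.2f}")
```

With these inputs, a deployment that is active 8 hours a day costs one third of an always-on deployment, which is the practical benefit of having no idle charges.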

04 What GPUs does Baseten support?

Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. GPU availability varies by plan tier, with H100 and B200 accessible on all plans at published rates.

05 Does Baseten offer a fixed monthly pricing plan?

No. Baseten operates on a pay-per-use model — there is no fixed monthly cap. The Basic plan provides starter credits at no initial cost, while Pro and Enterprise are custom-priced based on usage and infrastructure requirements. All compute is metered.

06 How much does it cost to host large models like DeepSeek R1 on Baseten?

Large frontier models requiring multiple H200 GPUs are priced via custom Enterprise agreements only — no public rates are listed. Community estimates suggest such deployments can cost hundreds of thousands of dollars per year. Contact Baseten sales for a formal quote.

07 What is Baseten's typical per-token pricing?

Based on Artificial Analysis data (April 2026), Baseten's median pricing across 6 tracked models is $0.60 per million input tokens and $2.20 per million output tokens. Individual model prices range from $0.10/1M input (gpt-oss-120b-low) to $0.95/1M input (GLM-5 at $3.15/1M output).
