Baseten Pricing 2026
Complete pricing guide with plans and cost analysis
Baseten pricing ranges from $0 to $6,500/month.
As of April 2026, Baseten costs between $0 and $6,500 per month, with 3 plans available, including a free tier. The Basic plan is free. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes
Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is for teams with predictable high-volume inference needing reserved capacity.
Compared to other AI model hosting & inference software, Baseten is positioned at the premium price point.
How much does Baseten cost?
Baseten Pricing Overview
Baseten has 3 pricing plans, including a free tier. Plans range from $0 to $6,500/month. The Basic plan has no monthly fee, bills usage pay-as-you-go, and is best for teams getting started with model serving or running variable workloads. The Pro plan requires contacting sales for a custom quote and is designed for teams with predictable high-volume inference needing reserved capacity. The Enterprise plan requires contacting sales for a custom quote and is designed for enterprises requiring data residency, custom SLAs, or on-prem deployments.
This pricing was last verified on April 15, 2026 from 1 independent source.
Baseten is a model serving platform for teams deploying custom and open-source ML models in production. It provides dedicated GPU instances billed per minute, a serverless Model API for popular open-source models billed per token, and autoscaling infrastructure. Baseten targets ML engineers who need low-latency inference without managing Kubernetes or GPU clusters.
All Baseten Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Basic (GPU access: all standard GPU types; billing: per minute, pay-as-you-go) | Free | Custom | Teams getting started with model serving or running variable workloads |
| Pro (billing: volume-based custom rates) | Contact Sales | Contact Sales | Teams with predictable high-volume inference needing reserved capacity |
| Enterprise (minimum commitment: ~$5,000/month reported) | Contact Sales | Contact Sales | Enterprises requiring data residency, custom SLAs, or on-prem deployments |
Features by Plan
Basic
- Pay-as-you-go GPU compute
- T4 GPU from $0.63/hour
- A10G GPU from $1.21/hour
- A100 80GB from $4.00/hour
- H100 80GB from $6.50/hour
- B200 180GB from $9.98/hour
- Model API (per-million-token billing)
- SOC 2 Type II and HIPAA compliant
- Email and in-app chat support
- Fast cold starts
- Autoscaling to zero
Pro
- Everything in Basic
- Volume discounts on compute
- Priority GPU access
- Dedicated compute reservations
- Higher Model API rate limits
- Hands-on engineering expertise
- Dedicated Slack and Zoom support
Enterprise
- Everything in Pro
- Custom SLAs
- Self-host (VPC/on-prem) deployments
- On-demand flex compute
- Use existing cloud commitments (AWS/GCP credits)
- Full data residency control
- Advanced security and compliance
- Custom global regions
- Advanced RBAC with Teams
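To put the reported Enterprise minimum commitment (~$5,000/month) in context, it can be compared with the Basic list rates above. The following is a rough sketch, not Baseten's own method: the function names are hypothetical, the 730-hour month is an assumption, and the commitment figure is only a reported value, so actual contract terms may differ.

```python
# Assumption: an average month has about 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730

def break_even_hours(monthly_commitment: float, hourly_rate: float) -> float:
    """GPU-hours at list price that add up to the monthly commitment."""
    return monthly_commitment / hourly_rate

def equivalent_always_on_gpus(monthly_commitment: float, hourly_rate: float) -> float:
    """Number of GPUs running all month at list price that match the commitment."""
    return break_even_hours(monthly_commitment, hourly_rate) / HOURS_PER_MONTH

# Reported Enterprise minimum vs. the H100 80GB list rate of $6.50/hour.
hours = break_even_hours(5000, 6.50)
gpus = equivalent_always_on_gpus(5000, 6.50)
print(f"{hours:.1f} H100-hours per month, about {gpus:.2f} always-on H100s")
```

Under these assumptions, pay-as-you-go usage well below that number of GPU-hours per month would cost less than the reported commitment at list rates, before any negotiated discounts.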
Usage-Based Rates
Per-unit pricing for Baseten GPU compute usage.
Basic
| GPU | Rate per second | Rate per hour | Notes |
|---|---|---|---|
| T4 GPU (16GB) | $0.000175 | $0.63 | Entry GPU for small models |
| L4 GPU (24GB) | $0.000281 | $1.01 | Efficient inference GPU |
| A10G GPU (24GB) | $0.000336 | $1.21 | Strong inference GPU |
| A100 80GB | $0.001111 | $4.00 | Large model training and inference |
| H100 MIG (40GB) | $0.001014 | $3.65 | MIG slice |
| H100 80GB | $0.001806 | $6.50 | Top-tier inference |
| B200 180GB | $0.002772 | $9.98 | Latest-generation Blackwell GPU |
- Billed per minute, not per second: fractions of a minute are rounded up to the next minute
- Model API (per-token) rates also available for supported open-source models
- No idle charges when deployment scales to zero
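The per-minute rounding rule can be sketched as follows. This is a minimal illustration, assuming that rounding is applied once to each period of active time; the source does not say whether Baseten rounds per deployment, per replica, or per invoice. The dictionary keys and function names are hypothetical labels, and the rates are the hourly list prices from the table above.

```python
import math

# Hourly list rates in USD, copied from the rate table (keys are illustrative labels).
HOURLY_RATES = {
    "T4": 0.63,
    "L4": 1.01,
    "A10G": 1.21,
    "A100-80GB": 4.00,
    "H100-MIG-40GB": 3.65,
    "H100-80GB": 6.50,
    "B200-180GB": 9.98,
}

def billed_minutes(active_seconds: float) -> int:
    """Round active time up to the next whole minute; no active time bills nothing."""
    if active_seconds <= 0:
        return 0
    return math.ceil(active_seconds / 60)

def deployment_cost(gpu: str, active_seconds: float) -> float:
    """Estimated charge in USD for one period of active time on one GPU."""
    per_minute_rate = HOURLY_RATES[gpu] / 60
    return billed_minutes(active_seconds) * per_minute_rate

# 61 seconds of activity is billed as 2 full minutes.
print(billed_minutes(61))
print(round(deployment_cost("T4", 61), 4))
```

For short, bursty workloads the rounding matters: a 61-second run costs the same as a 120-second run under this assumption.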
How Baseten Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Baseten | Free | $6,500/month |
| BentoML | Free | $5,000/month |
| Cerebrium | Free | $100/month |
| Banana.dev | Custom | Custom |
Baseten Pricing FAQ
01 How much does Baseten cost?
Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.
02 Does Baseten have a free tier?
New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.
03 How does Baseten billing work?
Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.
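A rough monthly estimate that combines both billing modes might look like the sketch below. It assumes the deployment scales to zero outside its active hours (so idle time costs nothing) and ignores per-minute rounding. The function name and parameters are hypothetical, and the per-million-token rate is a placeholder, since Model API rates are not listed here.

```python
def monthly_estimate(
    gpu_hourly_rate: float,
    active_hours_per_day: float,
    days: int = 30,
    tokens: int = 0,
    usd_per_million_tokens: float = 0.0,  # placeholder: supply the real Model API rate
) -> float:
    """Rough monthly cost in USD: active GPU time plus Model API token usage."""
    gpu_cost = gpu_hourly_rate * active_hours_per_day * days
    token_cost = (tokens / 1_000_000) * usd_per_million_tokens
    return gpu_cost + token_cost

# One H100 80GB ($6.50/hour) active 8 hours a day, scaled to zero otherwise.
print(monthly_estimate(6.50, 8))

# The same deployment plus 50 million Model API tokens at a made-up $1.00 per million.
print(monthly_estimate(6.50, 8, tokens=50_000_000, usd_per_million_tokens=1.00))
```

Because only active hours are counted, the estimate drops proportionally when traffic is concentrated into fewer hours of the day.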
04 What GPUs does Baseten support?
Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. GPU availability varies by plan tier, with H100 and B200 accessible on all plans at published rates.