Quick Answer

Baseten costs from $0 to $6,500 per month as of April 2026, with 3 plans available, including a free tier (Basic). Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.

  • Free tier: Yes

Baseten offers 3 pricing tiers: Basic, Pro, and Enterprise. The Pro plan is for teams with predictable high-volume inference that need reserved capacity.

Compared to other AI model hosting and inference software, Baseten is positioned at the premium price point.

How much does Baseten cost?

Baseten offers 3 pricing plans, starting with a free tier and scaling to custom enterprise pricing. Plans include Basic (free), Pro (custom pricing), and Enterprise (custom pricing).

Baseten Pricing Overview

Baseten has 3 pricing plans, including a free tier. Prices range from $0 to $6,500/month. The Basic plan is free and is best for teams getting started with model serving or running variable workloads. The Pro plan requires contacting sales for a custom quote and is designed for teams with predictable high-volume inference that need reserved capacity. The Enterprise plan also requires contacting sales for a custom quote and is designed for enterprises requiring data residency, custom SLAs, or on-prem deployments.

This pricing was last verified on April 15, 2026, from 1 independent source.

Baseten is a model serving platform for teams deploying custom and open-source ML models in production. It provides dedicated GPU instances billed per minute, a serverless Model API for popular open-source models billed per token, and autoscaling infrastructure. Baseten targets ML engineers who need low-latency inference without managing Kubernetes or GPU clusters.

All Baseten Plans & Pricing

Plan | Details | Monthly | Annual | Best For
Basic | GPU access: all standard GPU types; billing: per minute, pay-as-you-go | Free | Custom | Teams getting started with model serving or running variable workloads
Pro | Billing: volume-based custom rates | Contact Sales | Contact Sales | Teams with predictable high-volume inference needing reserved capacity
Enterprise | Minimum commitment: ~$5,000/month (reported) | Contact Sales | Contact Sales | Enterprises requiring data residency, custom SLAs, or on-prem deployments

Basic

  • Pay-as-you-go GPU compute
  • T4 GPU from $0.63/hour
  • A10G GPU from $1.21/hour
  • A100 80GB from $4.00/hour
  • H100 80GB from $6.50/hour
  • B200 180GB from $9.98/hour
  • Model API (per-million-token billing)
  • SOC 2 Type II and HIPAA compliant
  • Email and in-app chat support
  • Fast cold starts
  • Autoscaling to zero

Pro

  • Everything in Basic
  • Volume discounts on compute
  • Priority GPU access
  • Dedicated compute reservations
  • Higher Model API rate limits
  • Hands-on engineering expertise
  • Dedicated Slack and Zoom support

Enterprise

  • Everything in Pro
  • Custom SLAs
  • Self-host (VPC/on-prem) deployments
  • On-demand flex compute
  • Use existing cloud commitments (AWS/GCP credits)
  • Full data residency control
  • Advanced security and compliance
  • Custom global regions
  • Advanced RBAC with Teams

Usage-Based Rates

Per-unit pricing for Baseten dedicated GPU usage.

Basic

Model | Unit | Rate | Hourly | Notes
T4 GPU (16GB) | second | $0.000175 | $0.63/hr | Entry GPU for small models
L4 GPU (24GB) | second | $0.000281 | $1.01/hr | Efficient inference GPU
A10G GPU (24GB) | second | $0.000336 | $1.21/hr | Strong inference GPU
A100 80GB | second | $0.001111 | $4.00/hr | Large model training and inference
H100 MIG (40GB) | second | $0.001014 | $3.65/hr | MIG slice
H100 80GB | second | $0.001806 | $6.50/hr | Top-tier inference
B200 180GB | second | $0.002772 | $9.98/hr | Latest-generation Blackwell GPU
  • Billed per minute, not per second: partial minutes are rounded up to the next full minute
  • Model API (per-token) rates also available for supported open-source models
  • No idle charges when deployment scales to zero
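The round-up rule above can be sketched in a few lines of Python. This is a minimal illustration, not Baseten's billing code: the hourly rates are copied from the table above, while the function names and the assumption that rounding is applied once per continuous run are hypothetical.

```python
import math

# Hourly on-demand rates in USD, copied from the Basic plan table above.
HOURLY_RATES_USD = {
    "T4": 0.63,
    "L4": 1.01,
    "A10G": 1.21,
    "A100-80GB": 4.00,
    "H100-MIG-40GB": 3.65,
    "H100-80GB": 6.50,
    "B200-180GB": 9.98,
}


def billed_minutes(runtime_seconds: float) -> int:
    """Round a runtime up to the next whole minute (assumed: once per run)."""
    if runtime_seconds <= 0:
        return 0  # scaled to zero: no idle charge
    return math.ceil(runtime_seconds / 60)


def estimate_cost_usd(gpu: str, runtime_seconds: float) -> float:
    """Estimate the charge for one continuous run on a given GPU type."""
    per_minute_rate = HOURLY_RATES_USD[gpu] / 60
    return round(billed_minutes(runtime_seconds) * per_minute_rate, 4)


if __name__ == "__main__":
    # A 90-second run on a T4 is billed as 2 full minutes.
    print(billed_minutes(90), estimate_cost_usd("T4", 90))
```

Under these assumptions, many short runs cost more than the per-second rate suggests, because each run is rounded up to a full minute.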

How Baseten Pricing Compares

Software | Starting Price | Top Price
Baseten | Free | $6,500/month
BentoML | Free | $5,000/month
Cerebrium | Free | $100/month
Banana.dev | Custom | Custom

Baseten Pricing FAQ

01 How much does Baseten cost?

Baseten uses pay-as-you-go GPU pricing billed per minute. T4 GPUs start at $0.63/hour, A10G at $1.21/hour, A100 (80GB) at $4.00/hour, H100 at $6.50/hour, and B200 at $9.98/hour. The Basic plan has no monthly minimum. Pro and Enterprise offer volume discounts.
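To turn the hourly rates above into a monthly figure, multiply the rate by the hours the deployment is actually running. The sketch below assumes a fixed number of active hours per day and ignores volume discounts and per-minute rounding; the function name and parameters are illustrative only.

```python
def monthly_gpu_cost_usd(hourly_rate_usd: float,
                         active_hours_per_day: float,
                         days: int = 30) -> float:
    """Project a monthly GPU bill from an hourly rate and daily active hours."""
    if not 0 <= active_hours_per_day <= 24:
        raise ValueError("active_hours_per_day must be between 0 and 24")
    return round(hourly_rate_usd * active_hours_per_day * days, 2)


if __name__ == "__main__":
    # One always-on H100 80GB at the published $6.50/hour rate.
    print(monthly_gpu_cost_usd(6.50, 24))
    # One A10G active 8 hours a day, scaled to zero the rest of the time.
    print(monthly_gpu_cost_usd(1.21, 8))
```

Because there are no idle charges when a deployment scales to zero, the active-hours figure, not the calendar month, drives the estimate.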

02 Does Baseten have a free tier?

New Baseten accounts receive starter credits to explore deployments at no initial cost. There is no permanently free tier — ongoing usage is pay-as-you-go or under a Pro/Enterprise contract.

03 How does Baseten billing work?

Baseten bills per minute for dedicated GPU deployments, meaning you only pay when your model is running. Model API usage (for supported open-source models) is billed per million tokens processed. There are no idle charges when deployments are scaled to zero.
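The two billing models can be compared with simple arithmetic. The sketch below is illustrative only: this page lists no per-token rates, so `price_per_million_usd` is a hypothetical input that you would replace with the published rate for the model in question, and the function names are not part of any Baseten API.

```python
def model_api_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of processing `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd


def break_even_tokens_per_hour(gpu_hourly_rate_usd: float,
                               price_per_million_usd: float) -> float:
    """Hourly token volume at which per-token billing costs the same as
    one dedicated GPU running for the full hour."""
    return gpu_hourly_rate_usd / price_per_million_usd * 1_000_000


if __name__ == "__main__":
    # Hypothetical price of $0.50 per million tokens (not a published rate).
    print(model_api_cost_usd(2_500_000, 0.50))
    # Compared with a dedicated H100 80GB at $6.50/hour.
    print(break_even_tokens_per_hour(6.50, 0.50))
```

Below the break-even volume, per-token billing is the cheaper option under these assumptions; above it, a dedicated GPU may cost less, provided a single GPU can actually sustain that throughput.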

04 What GPUs does Baseten support?

Baseten supports T4, L4, A10G, A100 (80GB), H100 MIG (40GB), H100 (80GB), and B200 (180GB) GPUs. GPU availability varies by plan tier, with H100 and B200 accessible on all plans at published rates.
