Replicate Pricing 2026: Plans & Costs Compared


All Replicate Plans & Pricing

Plan	Monthly	Annual	Best For
Free	Free	Free	Trying out AI models and small experiments
Pay-as-you-go	Free	Free	Developers and teams running AI predictions at any scale
Enterprise	Contact Sales	Contact Sales	Organizations with complex requirements or high-volume usage

Features by Plan

Free

Free credits to get started
Access to thousands of public models
Pay-per-use after free credits

Pay-as-you-go

No monthly subscription fee
Billed per prediction (per token, per image, or per second)
Public models: from $0.003/image to $0.25/sec video
Private model hardware: $0.09/hr (CPU Small) to $43.92/hr (8x H100 GPU)
GPU options: T4 ($0.81/hr), L40S ($3.51/hr), A100 ($5.04/hr), H100 ($5.49/hr)
Auto-scaling for private models
Deploy custom models via Cog

Enterprise

Dedicated account manager
Priority support
Higher GPU limits
Performance SLAs
Help with onboarding, custom models, and optimizations
Volume discounts for large spend


Quick Answer

Last verified: May 6, 2026

High confidence

Replicate uses usage-based pricing with 3 plans available, as of the last verification on May 6, 2026. The Free and Pay-as-you-go plans have no subscription fee; usage is billed per prediction. Enterprise pricing is available on request and depends on volume, contract length, and negotiated discounts. Contact Replicate directly for a personalized Enterprise quote.


Free tier: Yes

Replicate offers 3 pricing tiers: Free, Pay-as-you-go, and Enterprise. The Pay-as-you-go plan is best for developers and teams running AI predictions at any scale.

Compared to other AI productivity software, Replicate is positioned at the budget-friendly price point.

4 documented hidden costs beyond list price

How much does Replicate cost?

Replicate uses usage-based pricing across 3 plans: Free (free), Pay-as-you-go (no subscription fee, billed per prediction), and Enterprise (custom pricing). Contact Replicate directly for a personalized Enterprise quote.

Replicate Pricing Overview

Replicate bills by usage; only the Enterprise plan requires a quote from the sales team. The Free plan is free and is best for trying out AI models and small experiments. The Pay-as-you-go plan has no subscription fee and is best for developers and teams running AI predictions at any scale. The Enterprise plan requires contacting sales for a custom quote and is designed for organizations with complex requirements or high-volume usage.

Pay-as-you-go has no minimum commitment.

There are at least 4 documented hidden costs beyond Replicate's list price: a serverless pricing premium, a GPU rental markup, a managed-service premium over raw GPU compute, and unpredictable cost growth at scale.

This pricing was last verified on May 6, 2026 from 2 independent sources.


Replicate is a cloud platform for running AI models via API, with the Pay-as-you-go plan charging per second of compute time and no monthly subscription fee. LLM text models start at $0.03/1M input tokens, while GPU-intensive workloads such as A100-backed inference run approximately $5/hour. Enterprise pricing is custom-quoted for teams with high or predictable compute volumes who need dedicated capacity and custom rate agreements.

How Replicate Pricing Compares

Compare Replicate pricing against top alternatives in AI Productivity.

Clockwise: $7.75/user/month
Grammarly Business: $12-$30/user/month
Motion: $19-$49/user/month

Usage-Based Rates

Per-unit pricing for Replicate API usage.

Pay-as-you-go

Model	Unit	Rate
Claude 3.7 Sonnet	1M input tokens	$3.00
Claude 3.7 Sonnet	1M output tokens	$15.00
DeepSeek R1	1M input tokens	$3.75
DeepSeek R1	1M output tokens	$10.00
FLUX 1.1 Pro (image)	image	$0.040
FLUX.1 [schnell] (image)	image	$0.003
FLUX.1 [dev] (image)	image	$0.025
Ideogram v3 Quality (image)	image	$0.090
Recraft V3 (image)	image	$0.040
Wan 2.1 (480p video)	second	$0.090
Wan 2.1 (720p video)	second	$0.250

Public models billed per prediction (token, image, or second)
Custom/private models billed per second of hardware time
A100 GPU: $0.00140/sec; H100: $0.001525/sec
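
The per-second figures above follow from the hourly hardware rates listed for the Pay-as-you-go plan. As a minimal sketch (the function names and the 90-second run are illustrative assumptions, not part of Replicate's API), the conversion and a single-run estimate look like this:

```python
# Hourly GPU rates (USD) as listed for private models on the Pay-as-you-go plan.
HOURLY_RATES_USD = {"T4": 0.81, "L40S": 3.51, "A100": 5.04, "H100": 5.49}

SECONDS_PER_HOUR = 3600


def per_second_rate(hourly_usd: float) -> float:
    """Convert an hourly hardware rate to the per-second rate used for billing."""
    return hourly_usd / SECONDS_PER_HOUR


def private_run_cost(gpu: str, seconds: float) -> float:
    """Estimate the cost of one private-model run (hypothetical helper)."""
    return per_second_rate(HOURLY_RATES_USD[gpu]) * seconds


if __name__ == "__main__":
    for gpu, hourly in HOURLY_RATES_USD.items():
        print(f"{gpu}: ${per_second_rate(hourly):.6f}/sec")
    # Assumed example: a run that holds an A100 for 90 seconds.
    print(f"90 s on an A100: ${private_run_cost('A100', 90):.4f}")
```

The estimate covers only the seconds the hardware is billed for; any setup or idle time that Replicate bills for private models would need to be added to `seconds`.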

Compare Replicate vs Alternatives

Before committing to Replicate, compare pricing with these 3 alternatives in the same category.

vs. Clockwise

From $7.75/user/month

Best for: Individuals and small teams getting started with AI calendar management and basic scheduling

vs. Grammarly Business

From $12/user/month

Best for: Individual users testing basic writing assistance features

vs. Motion

From $19/user/month

Best for: Individual professionals and small teams needing AI-powered scheduling and task management

What Companies Actually Pay for Replicate

Median per-1M-token pricing across 3 models

Input $0.060/1M

Output $0.250/1M

Flagship models in this provider's catalog

Model	Input /1M	Output /1M	Blended /1M
replicate_deepseek-v3-0324	$1.45	$1.45	$1.45
replicate_granite-4-0-h-small	$0.060	$0.250	$0.107
replicate_granite-3-3-8b-instruct	$0.030	$0.250	$0.085

Top pricing complaints

GPU compute pricing significantly higher than raw providers like Runpod for equivalent hardware
Costs scale unexpectedly at high usage volumes with per-second billing

Source: Artificial Analysis — medians aggregated from 3 models in this provider's catalog. Per-1M-token pricing reflects list rates.
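
The medians and blended figures in the table can be reproduced from the listed input and output rates. The sketch below assumes a 3:1 input-to-output weighting for the blended price; that ratio is inferred from the table values rather than stated in the source, and the helper name is illustrative.

```python
from statistics import median

# (input, output) list prices in USD per 1M tokens, taken from the table above.
PRICES = {
    "replicate_deepseek-v3-0324": (1.45, 1.45),
    "replicate_granite-4-0-h-small": (0.060, 0.250),
    "replicate_granite-3-3-8b-instruct": (0.030, 0.250),
}


def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average of input and output prices (3:1 weighting is an assumption)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


median_input = median([inp for inp, _ in PRICES.values()])
median_output = median([out for _, out in PRICES.values()])

if __name__ == "__main__":
    print(f"Median input:  ${median_input:.3f}/1M")
    print(f"Median output: ${median_output:.3f}/1M")
    for name, (inp, out) in PRICES.items():
        print(f"{name}: blended ${blended_price(inp, out):.4f}/1M")
```

With only 3 models tracked, the median is simply the middle value, so one additional model could shift it noticeably.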

Replicate Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Image Generation Finetuning Cost Comparison

Cost: $1/minute ($60/hr)

For a task that requires 1 H100, Replicate charges $1/minute ($60/hr), while 8xH100s on Runpod cost $2.88/hr, making Replicate over 20x more expensive.

Running image generator finetuning on Replicate's serverless offering versus alternatives shows the cost difference. Replicate charges $1/minute for workloads that could run on a single H100.

Audio Transcription at Scale (400 hours via Whisper)

Cost: ~$70 for 400 hours of audio (~$0.0029/run per 1-minute chunk)

Transcribing 400 hours of audio using Whisper Large v2 via Replicate's Pay-as-you-go inference API, processing approximately 1-minute audio chunks as individual runs.
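
The ~$70 figure is the per-run cost multiplied by the number of 1-minute chunks. A small sketch of that arithmetic follows; the function and parameter names are illustrative assumptions, and the per-run cost is the approximate figure quoted above.

```python
import math


def transcription_cost(audio_hours: float,
                       chunk_minutes: float = 1.0,
                       cost_per_run_usd: float = 0.0029) -> tuple[int, float]:
    """Return (number of runs, estimated total cost in USD).

    Assumes each chunk is billed as one run at a flat approximate cost.
    """
    runs = math.ceil(audio_hours * 60 / chunk_minutes)
    return runs, runs * cost_per_run_usd


if __name__ == "__main__":
    runs, total = transcription_cost(400)
    print(f"{runs} runs, about ${total:.2f}")
```

Because billing is per second of compute, real totals will vary with audio content and hardware; the flat per-run cost is an averaging assumption.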

A100 GPU Compute Per Hour

Cost: ~$5/hr per A100

Running a single A100 80GB GPU instance on Replicate for model inference or fine-tuning via the Pay-as-you-go plan.

HN discussion on finetuning costs

How Replicate Pricing Compares

Software	Starting Price	Top Price
Replicate	Free (usage-based)	Custom (Enterprise)
Clockwise	Free	$7.75/user/month
Grammarly Business	Free	$30/user/month
Hugging Face	Free	$50/user/month
Motion	$29/user/month	$446/user/month
Notion AI	Free	$20/user/month


4 Replicate Hidden Costs Beyond the List Price

Beyond the listed price, Replicate has at least 4 documented hidden costs that can significantly increase total cost of ownership.

Serverless Pricing Premium: $1/minute
(high, 1 source)

Hacker News: "the pricing becomes even more astronomical; as you note, $1/minute is unreasonably expensive: that's over 20x the cost of renting 8xH100s on Runpod"

GPU Rental Markup: 200-300% markup over alternatives
(critical, 1 source)

Hacker News: "Similar deal with Replicate: an A100 there is over $5/hr, whereas on Runpod it's $1.64/hr"

Managed Service Premium Over Raw GPU Compute: $3-$4/hr
(high, 2 sources)

Hacker News: "renting raw compute via Runpod and friends will generally be much cheaper than renting a higher level service that uses that compute e.g. fal.ai or Replicate. For example, an A6000 on fal."
Hacker News: "on Replicate today a one can get an A100 for ~$5/hr which is ... about a month."

Unpredictable Cost Growth at Scale: 10-30% of license costs
(medium, 1 source)

Hacker News: "I'm a replicate user. I have experimented with LLAMA2 on the replicate and I have similar experience But you are totally correct about the pricing part it can get expensive I'm running this photo service..."

Tip

Ask your Replicate sales rep about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Intelligence sourced from 1 independent source

Hacker News Tech community

Key claims include inline source attribution. Data verified against multiple independent sources. 10 source citations total.

Replicate Contract Terms

Replicate contracts do not auto-renew. Changes require advance notice. These terms are sourced from verified buyer experiences.

Contract Terms

Auto-Renewal: No

Minimum Commitment: None for Pay-as-you-go

Mid-Term Downgrade: Allowed

Payment Terms: Usage-based billing per second of compute time; no monthly subscription fee for standard tiers

Price Escalation: No published price escalation schedule; costs track with usage volume

Note

Pay-as-you-go has no minimum commitment; billing stops when you stop running models.

How to Negotiate Replicate Pricing

Replicate contracts are negotiable. These 4 tactics are sourced from real buyer experiences and procurement specialists.

Negotiation Playbook (4 tactics)

Consider Raw Compute Alternatives (high success)

Instead of using Replicate's serverless offering, rent raw compute via Runpod or similar providers. An A100 on Runpod is $1.64/hr in Secure Cloud or $0.49/hr in Community Cloud, versus over $5/hr on Replicate.

HN discussion comparing GPU providers

Benchmark Against Raw GPU Providers Before Committing (high success)

Before scaling production workloads on Replicate, run the same inference workload on Runpod or Lambda Cloud to quantify the managed-service premium. Replicate's A100 pricing (~$5/hr) is approximately 3x Runpod's managed rate ($1.64/hr). Use this delta to build a business case for either negotiating an Enterprise contract or justifying the migration cost of self-hosting.

HN community comparison (2024-10-04)
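
To quantify the managed-service premium described above, the hourly rates can be compared directly. This is a rough sketch using the A100 rates quoted in this article ($5.04/hr on Replicate, $1.64/hr on Runpod Secure Cloud); the function names and the 100 GPU-hour workload are assumptions for illustration.

```python
def managed_premium(managed_hourly: float, raw_hourly: float) -> tuple[float, float]:
    """Return (price ratio, markup in percent) of a managed rate over a raw rate."""
    ratio = managed_hourly / raw_hourly
    markup_pct = (ratio - 1) * 100
    return ratio, markup_pct


def extra_cost(managed_hourly: float, raw_hourly: float, gpu_hours: float) -> float:
    """Additional spend in USD from using the managed rate for a given workload."""
    return (managed_hourly - raw_hourly) * gpu_hours


if __name__ == "__main__":
    replicate_a100 = 5.04  # USD/hr, from the plan table in this article
    runpod_a100 = 1.64     # USD/hr, Secure Cloud rate quoted in this article
    ratio, markup = managed_premium(replicate_a100, runpod_a100)
    print(f"Ratio: {ratio:.2f}x, markup: {markup:.0f}%")
    # Assumed workload: 100 GPU-hours per month.
    print(f"Extra cost: ${extra_cost(replicate_a100, runpod_a100, 100):.2f}")
```

The comparison ignores the engineering time needed to run raw GPU infrastructure, which is the cost the premium is meant to offset.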

Use Community Cloud for Lower Risk Workloads (medium success)

If you're willing to take some risk of boxes disappearing and don't need much security, Runpod's Community Cloud offers significantly cheaper rates than Replicate's managed service.

HN user comparing pricing models

Request Enterprise Pricing for Predictable High-Volume Workloads (medium success)

If your usage is high and predictable, contact Replicate's Enterprise team for custom pricing. Enterprise contracts on managed inference platforms typically include volume-based rate reductions that can partially close the gap with raw GPU providers while retaining the convenience of managed infrastructure.

Enterprise tier per current tier data


Replicate Pricing FAQ

01 How does Replicate's pricing compare to alternatives like Runpod?

Replicate is significantly more expensive than raw compute providers. An A100 GPU costs over $5/hr on Replicate versus $1.64/hr on Runpod's Secure Cloud (about 3x more). Replicate's serverless pricing can reach $1/minute, which is over 20x the cost of equivalent compute on Runpod. The premium pays for convenience and managed infrastructure, but costs add up quickly for sustained workloads.

02 Is Replicate's serverless pricing worth the cost?

Replicate's serverless model charges a significant premium over renting GPUs directly. At $1/minute for some workloads, users report this is 'unreasonably expensive' and over 20x the cost of running equivalent compute on platforms like Runpod. The convenience may be worth it for occasional use or prototyping, but actual users confirm 'the pricing part it can get expensive' for regular production workloads.

03 Is Replicate more expensive than other GPU cloud providers?

Yes, Replicate charges a managed-service premium over raw GPU providers. An A100 on Replicate costs over $5/hr, while the same GPU on Runpod's Secure Cloud runs $1.64/hr — roughly a 3x premium. The markup covers Replicate's serverless model deployment, managed infrastructure, and API abstraction, which eliminates container management overhead. If your team can manage GPU infrastructure directly, raw compute providers will be significantly cheaper at scale.

04 How does Replicate's Pay-as-you-go pricing work?

Replicate's Pay-as-you-go plan charges per second of compute time with no monthly subscription fee. The Free plan provides initial credits to get started. Once credits are exhausted, usage is billed against a credit card at per-second rates that vary by model and hardware tier. Enterprise pricing is available for teams with high or predictable compute volumes who want custom rates.

05 What are the cheapest LLM models available on Replicate?

Based on Artificial Analysis data from April 2026, the cheapest LLM on Replicate is priced at $0.03/1M input tokens and $0.25/1M output tokens. The provider median across all tracked models is $0.06 input / $0.25 output per 1M tokens. The most expensive tracked model (DeepSeek V3-0324) runs $1.45/1M for both input and output.
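
Per-1M-token rates translate into a per-request cost by scaling each side by its token count. The sketch below uses the cheapest tracked rates quoted above; the function name and the token counts are assumptions for illustration.

```python
TOKENS_PER_MILLION = 1_000_000


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_million: float, output_per_million: float) -> float:
    """Estimate the cost in USD of one LLM request from per-1M-token list prices."""
    input_cost = input_tokens / TOKENS_PER_MILLION * input_per_million
    output_cost = output_tokens / TOKENS_PER_MILLION * output_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Cheapest tracked rates from this article: $0.03 input / $0.25 output per 1M tokens.
    # Assumed request size: 2,000 input tokens and 500 output tokens.
    cost = request_cost(2_000, 500, 0.03, 0.25)
    print(f"Cost per request: ${cost:.6f}")
    print(f"Cost per 1M such requests: ${cost * 1_000_000:.2f}")
```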
