Replicate Pricing 2026
Complete pricing guide with plans, hidden costs, and cost analysis
Replicate uses custom pricing — contact their sales team for a quote.
As of May 2026, Replicate has 3 plans: Free, Pay-as-you-go (no subscription fee), and Enterprise (custom pricing on request). Contact Replicate directly for a personalized quote. Pricing depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes
Replicate offers 3 pricing tiers: Free, Pay-as-you-go, and Enterprise. The Pay-as-you-go plan is best for developers and teams running AI predictions at any scale.
Compared to other AI productivity software, Replicate is positioned at the budget-friendly price point.
- 4 documented hidden costs beyond list price
How much does Replicate cost?
Replicate Pricing Overview
The Free plan costs nothing and is best for trying out AI models and small experiments. The Pay-as-you-go plan has no subscription fee and is best for developers and teams running AI predictions at any scale. The Enterprise plan requires contacting sales for a custom quote and is designed for organizations with complex requirements or high-volume usage.
Replicate's Pay-as-you-go plan has no minimum commitment.
There are at least 4 documented hidden costs beyond Replicate's list price, including implementation, training, and add-on fees.
This pricing was last verified on May 6, 2026 from 2 independent sources.
Replicate is a cloud platform for running AI models via API, with the Pay-as-you-go plan charging per second of compute time and no monthly subscription fee. LLM text models start at $0.03/1M input tokens, while GPU-intensive workloads such as A100-backed inference run approximately $5/hour. Enterprise pricing is custom-quoted for teams with high or predictable compute volumes who need dedicated capacity and custom rate agreements.
All Replicate Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Free | Free | Custom | Trying out AI models and small experiments |
| Pay-as-you-go | Free | Custom | Developers and teams running AI predictions at any scale |
| Enterprise | Contact Sales | Contact Sales | Organizations with complex requirements or high-volume usage |
Features by Plan
Free
- Free credits to get started
- Access to thousands of public models
- Pay-per-use after free credits
Pay-as-you-go
- No monthly subscription fee
- Billed per prediction (per token, per image, or per second)
- Public models: from $0.003/image to $0.25/sec video
- Private model hardware: $0.09/hr (CPU Small) to $43.92/hr (8x H100 GPU)
- GPU options: T4 ($0.81/hr), L40S ($3.51/hr), A100 ($5.04/hr), H100 ($5.49/hr)
- Auto-scaling for private models
- Deploy custom models via Cog
Enterprise
- Dedicated account manager
- Priority support
- Higher GPU limits
- Performance SLAs
- Help with onboarding, custom models, and optimizations
- Volume discounts for large spend
Usage-Based Rates
Per-unit pricing for Replicate API usage.
Pay-as-you-go
| Model | Unit | Rate |
|---|---|---|
| Claude 3.7 Sonnet | 1M input tokens | $3.00 |
| Claude 3.7 Sonnet | 1M output tokens | $15.00 |
| DeepSeek R1 | 1M input tokens | $3.75 |
| DeepSeek R1 | 1M output tokens | $10.00 |
| FLUX 1.1 Pro (image) | image | $0.040 |
| FLUX.1 [schnell] (image) | image | $0.003 |
| FLUX.1 [dev] (image) | image | $0.025 |
| Ideogram v3 Quality (image) | image | $0.090 |
| Recraft V3 (image) | image | $0.040 |
| Wan 2.1 (480p video) | second | $0.090 |
| Wan 2.1 (720p video) | second | $0.250 |
- Public models billed per prediction (token, image, or second)
- Custom/private models billed per second of hardware time
- A100 GPU: $0.00140/sec; H100: $0.001525/sec
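The per-second hardware rates above can be cross-checked against the hourly figures in the plan features. The following is a minimal sketch, not an official calculator: the function names are hypothetical, and it assumes multi-GPU configurations are billed as a simple multiple of the single-GPU rate, which matches the quoted 8x H100 price.

```python
# Per-second hardware rates quoted in this guide (USD per second).
RATE_PER_SECOND = {
    "A100": 0.00140,
    "H100": 0.001525,
}

SECONDS_PER_HOUR = 3600


def hourly_rate(gpu: str, gpu_count: int = 1) -> float:
    """Convert a per-second rate into an hourly rate for gpu_count GPUs.

    Assumption: multi-GPU hardware is billed as a linear multiple of the
    single-GPU rate.
    """
    return RATE_PER_SECOND[gpu] * SECONDS_PER_HOUR * gpu_count


def run_cost(gpu: str, seconds: float, gpu_count: int = 1) -> float:
    """Cost in USD of holding gpu_count GPUs for the given number of seconds."""
    return RATE_PER_SECOND[gpu] * seconds * gpu_count


if __name__ == "__main__":
    print(f"A100 hourly:    ${hourly_rate('A100'):.2f}")
    print(f"H100 hourly:    ${hourly_rate('H100'):.2f}")
    print(f"8x H100 hourly: ${hourly_rate('H100', gpu_count=8):.2f}")
    # Hypothetical job: 90 seconds of private-model time on one A100.
    print(f"90 s on A100:   ${run_cost('A100', 90):.4f}")
```

Because billing is per second, a short job on a private model costs a fraction of the hourly figure, so the hourly rate matters mainly for sustained workloads.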
What Companies Actually Pay for Replicate
| Model | Input /1M | Output /1M | Blended /1M |
|---|---|---|---|
| replicate_deepseek-v3-0324 | $1.45 | $1.45 | $1.45 |
| replicate_granite-4-0-h-small | $0.060 | $0.250 | $0.107 |
| replicate_granite-3-3-8b-instruct | $0.030 | $0.250 | $0.085 |
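The table does not say how the blended column is derived. The listed values are consistent with a 3:1 weighting of input to output tokens, so the sketch below assumes that ratio. The weighting and the `blended_rate` helper are assumptions made for illustration, not a documented formula.

```python
def blended_rate(input_rate: float, output_rate: float,
                 input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: a 3:1 input-to-output token mix, chosen because it
    reproduces the blended column in the table.
    """
    total_weight = input_weight + output_weight
    return (input_rate * input_weight + output_rate * output_weight) / total_weight


# (input, output, listed blended) in USD per 1M tokens, copied from the table.
TRACKED_MODELS = {
    "replicate_deepseek-v3-0324": (1.45, 1.45, 1.45),
    "replicate_granite-4-0-h-small": (0.060, 0.250, 0.107),
    "replicate_granite-3-3-8b-instruct": (0.030, 0.250, 0.085),
}

if __name__ == "__main__":
    for name, (inp, out, listed) in TRACKED_MODELS.items():
        computed = blended_rate(inp, out)
        print(f"{name}: computed {computed:.4f}, listed {listed:.3f}")
```

If your workload is output-heavy, for example long generations from short prompts, pass different weights; the blended figure moves toward the output rate.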
Replicate Year 1 Total Cost by Company Size
Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.
Finetuning an image generator on Replicate's serverless offering versus alternatives shows the cost difference: Replicate charges $1/minute for workloads that could run on a single H100.
Transcribing 400 hours of audio using Whisper Large v2 via Replicate's Pay-as-you-go inference API, processing approximately 1-minute audio chunks as individual runs.
Running a single A100 80GB GPU instance on Replicate for model inference or fine-tuning via the Pay-as-you-go plan.
HN discussion on finetuning costs
How Replicate Pricing Compares
| Software | Starting Price | Top Price |
|---|---|---|
| Replicate | Custom | Custom |
| Clockwise | Free | $7.75/user/month |
| Grammarly Business | Free | $30/user/month |
| Motion | $29/user/month | $446/user/month |
| Notion AI | Free | $18/user/month |
| OpenAI | Free | $200/month |
Replicate Contract Terms
Replicate contracts do not auto-renew. Changes require advance notice. These terms are sourced from verified buyer experiences.
Pay-as-you-go has no minimum commitment; charges stop when you stop running models.
How to Negotiate Replicate Pricing
Replicate contracts are negotiable. These 4 tactics are sourced from real buyer experiences and procurement specialists.
Instead of using Replicate's serverless offering, rent raw compute via Runpod or similar providers. An A100 on Runpod is $1.64/hr in Secure Cloud or $0.49/hr in Community Cloud, versus over $5/hr on Replicate.
HN discussion comparing GPU providers
Before scaling production workloads on Replicate, run the same inference workload on Runpod or Lambda Cloud to quantify the managed-service premium. Replicate's A100 pricing (~$5/hr) is approximately 3x Runpod's managed rate ($1.64/hr). Use this delta to build a business case for either negotiating an Enterprise contract or justifying the migration cost of self-hosting.
HN community comparison (2024-10-04)
If you're willing to take some risk of boxes disappearing and don't need much security, Runpod's Community Cloud offers significantly cheaper rates than Replicate's managed service.
HN user comparing pricing models
If your usage is high and predictable, contact Replicate's Enterprise team for custom pricing. Enterprise contracts on managed inference platforms typically include volume-based rate reductions that can partially close the gap with raw GPU providers while retaining the convenience of managed infrastructure.
Enterprise tier per current tier data
Replicate Pricing FAQ
01 How does Replicate's pricing compare to alternatives like Runpod?
Replicate is significantly more expensive than raw compute providers. An A100 GPU costs over $5/hr on Replicate versus $1.64/hr on Runpod's Secure Cloud (about 3x more). Replicate's serverless pricing can reach $1/minute, which is over 20x the cost of equivalent compute on Runpod. The premium pays for convenience and managed infrastructure, but costs add up quickly for sustained workloads.
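The ratios in this answer can be reproduced from the figures quoted in this guide. The sketch below is illustrative only. It uses the $5.04/hr A100 rate from the plan features, and it compares the $1/minute serverless figure against Runpod's A100 rate because no Runpod H100 price appears here. The 200-hour monthly usage is a hypothetical input.

```python
# Hourly rates in USD, as quoted in this guide.
REPLICATE_A100_HOURLY = 5.04
RUNPOD_SECURE_A100_HOURLY = 1.64
RUNPOD_COMMUNITY_A100_HOURLY = 0.49
REPLICATE_SERVERLESS_PER_MINUTE = 1.00


def premium(managed_hourly: float, raw_hourly: float) -> float:
    """How many times more the managed service costs per hour."""
    return managed_hourly / raw_hourly


def monthly_delta(managed_hourly: float, raw_hourly: float, hours: float) -> float:
    """Extra monthly spend in USD for the managed service at a given usage."""
    return (managed_hourly - raw_hourly) * hours


if __name__ == "__main__":
    serverless_hourly = REPLICATE_SERVERLESS_PER_MINUTE * 60
    print(f"A100 vs Secure Cloud:    {premium(REPLICATE_A100_HOURLY, RUNPOD_SECURE_A100_HOURLY):.2f}x")
    print(f"A100 vs Community Cloud: {premium(REPLICATE_A100_HOURLY, RUNPOD_COMMUNITY_A100_HOURLY):.2f}x")
    print(f"$1/min vs Secure Cloud:  {premium(serverless_hourly, RUNPOD_SECURE_A100_HOURLY):.1f}x")
    # Hypothetical usage: 200 GPU-hours per month.
    print(f"Monthly delta at 200 h:  ${monthly_delta(REPLICATE_A100_HOURLY, RUNPOD_SECURE_A100_HOURLY, 200):.2f}")
```

The delta scales linearly with GPU-hours, so the premium that is negligible for prototyping becomes the dominant line item under sustained load.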
02 Is Replicate's serverless pricing worth the cost?
Replicate's serverless model charges a significant premium over renting GPUs directly. At $1/minute for some workloads, users report this is 'unreasonably expensive' and over 20x the cost of running equivalent compute on platforms like Runpod. The convenience may be worth it for occasional use or prototyping, but actual users confirm 'the pricing part it can get expensive' for regular production workloads.
03 Is Replicate more expensive than other GPU cloud providers?
Yes, Replicate charges a managed-service premium over raw GPU providers. An A100 on Replicate costs over $5/hr, while the same GPU on Runpod's Secure Cloud runs $1.64/hr — roughly a 3x premium. The markup covers Replicate's serverless model deployment, managed infrastructure, and API abstraction, which eliminates container management overhead. If your team can manage GPU infrastructure directly, raw compute providers will be significantly cheaper at scale.
04 How does Replicate's Pay-as-you-go pricing work?
Replicate's Pay-as-you-go plan has no monthly subscription fee. Public models are billed per prediction (per token, per image, or per second of output), while custom and private models are billed per second of hardware time. The Free plan provides initial credits to get started. Once credits are exhausted, usage is billed against a credit card at rates that vary by model and hardware tier. Enterprise pricing is available for teams with high or predictable compute volumes who want custom rates.
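As a rough illustration of per-prediction billing, the sketch below estimates a bill from the public-model rates listed in this guide. The workload sizes and function names are hypothetical, and actual invoices may differ.

```python
# Public-model rates from the usage table in this guide (USD).
CLAUDE_INPUT_PER_1M = 3.00      # Claude 3.7 Sonnet, per 1M input tokens
CLAUDE_OUTPUT_PER_1M = 15.00    # listed as $0.015 per 1K output tokens
FLUX_SCHNELL_PER_IMAGE = 0.003  # FLUX.1 [schnell], per image
WAN_480P_PER_SECOND = 0.09      # Wan 2.1 480p, per second of video


def llm_cost(input_tokens: int, output_tokens: int) -> float:
    """Token-billed cost in USD for Claude 3.7 Sonnet at the listed rates."""
    return (input_tokens / 1_000_000 * CLAUDE_INPUT_PER_1M
            + output_tokens / 1_000_000 * CLAUDE_OUTPUT_PER_1M)


def image_cost(images: int) -> float:
    """Image-billed cost in USD for FLUX.1 [schnell]."""
    return images * FLUX_SCHNELL_PER_IMAGE


def video_cost(seconds: float) -> float:
    """Per-second cost in USD for Wan 2.1 at 480p."""
    return seconds * WAN_480P_PER_SECOND


if __name__ == "__main__":
    # Hypothetical monthly workload.
    total = llm_cost(2_000_000, 500_000) + image_cost(1_000) + video_cost(60)
    print(f"Estimated bill: ${total:.2f}")
```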
05 What are the cheapest LLM models available on Replicate?
Based on Artificial Analysis data from April 2026, the cheapest LLM on Replicate is priced at $0.03/1M input tokens and $0.25/1M output tokens. The provider median across all tracked models is $0.06 input / $0.25 output per 1M tokens. The most expensive tracked model (DeepSeek V3-0324) runs $1.45/1M for both input and output.
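The median and cheapest-model figures can be recomputed from the three tracked models listed in this guide. This sketch assumes those three rows are the complete tracked set, which the guide does not state explicitly.

```python
from statistics import median

# (input, output) in USD per 1M tokens, from the tracked-model table in this guide.
TRACKED_RATES = {
    "replicate_deepseek-v3-0324": (1.45, 1.45),
    "replicate_granite-4-0-h-small": (0.060, 0.250),
    "replicate_granite-3-3-8b-instruct": (0.030, 0.250),
}

input_median = median([inp for inp, _ in TRACKED_RATES.values()])
output_median = median([out for _, out in TRACKED_RATES.values()])
# Cheapest model by input price.
cheapest = min(TRACKED_RATES, key=lambda name: TRACKED_RATES[name][0])

if __name__ == "__main__":
    print(f"Median input:  ${input_median:.2f} per 1M tokens")
    print(f"Median output: ${output_median:.2f} per 1M tokens")
    print(f"Cheapest by input price: {cheapest}")
```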