NVIDIA NIM Pricing 2026
A complete pricing guide with plans and cost analysis
NVIDIA NIM pricing ranges from $0.10 to $10 per million tokens.
As of April 2026, NVIDIA NIM costs $0.10 to $10 per million tokens across three plans, including a free Developer tier. Enterprise pricing is available on request. Your actual cost depends on your chosen tier, contract length, and negotiated discounts.
- Free tier: Yes
NVIDIA NIM offers 3 pricing tiers: Developer (Free credits), Pay-as-you-go (hosted NIM endpoints), and Enterprise (AI Enterprise license + DGX Cloud). The Pay-as-you-go tier is designed for production inference on open models.
Compared with other LLM API providers, NVIDIA NIM is positioned at the budget-friendly end of the market.
How much does NVIDIA NIM cost?
NVIDIA NIM Pricing Overview
NVIDIA NIM has 3 pricing plans, including a free tier. Paid plans range from $0.10 to $10 per million tokens. The Developer (Free credits) plan is free and is best for prototyping and evaluation. The Pay-as-you-go (hosted NIM endpoints) plan requires contacting sales for a custom quote and is designed for production inference on open models. The Enterprise (AI Enterprise license + DGX Cloud) plan also requires contacting sales for a custom quote and is designed for regulated enterprise inference workloads.
This pricing was last verified on April 23, 2026.
NVIDIA NIM pricing starts at $0/month on the Developer (Free credits) plan, giving developers API access to hosted NIM microservice endpoints. The Pay-as-you-go (hosted NIM endpoints) tier charges per token consumed, with hosted model rates on OpenRouter ranging from $0.04–$1.20 per million input tokens depending on model size. Organizations requiring enterprise deployment on DGX Cloud or on-premises infrastructure are served by the Enterprise (AI Enterprise license + DGX Cloud) plan, which is custom-quoted.
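Because the Pay-as-you-go tier bills per token, monthly spend is simply token volume times the per-million rate. The sketch below illustrates the arithmetic; the function name and the example rates ($0.10 input / $0.40 output, taken from the hosted-model rates listed on this page) are illustrative, not an official API.

```python
def monthly_cost(input_m: float, output_m: float,
                 input_rate: float, output_rate: float) -> float:
    """Estimate monthly spend in USD.

    input_m / output_m: token volumes in millions of tokens.
    input_rate / output_rate: price in USD per 1M tokens.
    """
    return input_m * input_rate + output_m * output_rate

# Example: 10M input + 10M output tokens at $0.10/$0.40 per 1M
print(monthly_cost(10, 10, 0.10, 0.40))  # 5.0 -> $5.00/month
```

Swap in the rates for whichever model you plan to use; heavier output-to-input ratios (e.g. long generations from short prompts) shift cost toward the output rate.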
How NVIDIA NIM Pricing Compares
Compare NVIDIA NIM pricing against top alternatives in LLM API Providers.
All NVIDIA NIM Plans & Pricing
| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Developer (Free credits) | Free | Free | Prototyping and evaluation |
| Pay-as-you-go (hosted NIM endpoints) | Custom | Custom | Production inference on open models |
| Enterprise (AI Enterprise license + DGX Cloud) | Contact Sales | Contact Sales | Regulated enterprise inference workloads |
Developer (Free credits)
- Free API credits on build.nvidia.com
- Access to 50+ hosted models
- NVIDIA, Meta, Mistral, Microsoft open models
Pay-as-you-go (hosted NIM endpoints)
- Llama 3.3 70B: $0.90/1M tokens blended
- Mixtral 8x22B: $1.20/1M
- Nemotron 70B: $0.90/1M
- Hosted behind NVIDIA inference infra
Enterprise (AI Enterprise license + DGX Cloud)
- AI Enterprise subscription
- DGX Cloud dedicated capacity
- On-prem deployment support
Usage-Based Rates
Per-unit pricing for NVIDIA NIM API usage.
Pay-as-you-go (hosted NIM endpoints)
| Model | Context window | Unit | Rate |
|---|---|---|---|
| llama-3-3-70b-nim | 131,072 | 1M tokens | $0.90 |
| mixtral-8x22b-nim | 65,536 | 1M tokens | $1.20 |
| nemotron-70b | 131,072 | 1M tokens | $0.90 |
- NIM = NVIDIA Inference Microservice. Pricing is the same for cloud-hosted and on-prem DGX deployments.
Compare NVIDIA NIM vs Alternatives
Before committing to NVIDIA NIM, compare pricing with these 3 alternatives in the same category.
What Companies Actually Pay for NVIDIA NIM
| Model | Input /1M | Output /1M | Blended /1M |
|---|---|---|---|
| nvidia/llama-3.1-nemotron-70b-instruct | $1.20 | $1.20 | $1.20 |
| nvidia/nemotron-nano-12b-v2-vl | $0.20 | $0.60 | $0.40 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | $0.10 | $0.40 | $0.25 |
| nvidia/nemotron-3-super-120b-a12b | $0.09 | $0.45 | $0.27 |
| nvidia/nemotron-3-nano-30b-a3b | $0.05 | $0.20 | $0.125 |
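The blended column in the table above is consistent with a simple average of the input and output rates, i.e. a 1:1 input/output token mix (for example, Nemotron Nano 12B: ($0.20 + $0.60) / 2 = $0.40). A small sketch, assuming that 1:1 convention; the `output_share` parameter is an illustrative extension for other mixes:

```python
def blended_rate(input_rate: float, output_rate: float,
                 output_share: float = 0.5) -> float:
    """Blended per-1M-token rate for a given output token share.

    The default output_share of 0.5 matches the table's
    assumption of equal input and output token volumes.
    """
    return input_rate * (1 - output_share) + output_rate * output_share

# Reproduces the blended column, e.g. nemotron-nano-12b-v2-vl:
print(blended_rate(0.20, 0.60))  # 0.4
```

If your workload is output-heavy (say 75% output tokens), the effective rate rises accordingly: `blended_rate(0.20, 0.60, 0.75)` gives $0.50 per 1M tokens.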
NVIDIA NIM Year 1 Total Cost by Company Size
Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.
- Small: Individual developer exploring NVIDIA NIM hosted endpoints on the free credits tier for prototyping and evaluation.
- Mid-size: Team consuming 10M input and 10M output tokens per month via hosted NIM endpoints at provider median pricing ($0.095/1M input, $0.425/1M output).
- Large: Production application processing 50M input and 50M output tokens per month on the highest-priced NVIDIA NIM model at $1.20/1M input and $1.20/1M output.
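The three scenarios above can be sketched as a year-1 calculation, assuming 12 months at the stated rates with no licenses or implementation fees on the hosted tiers (a simplification; the enterprise figures in this section also fold in those costs):

```python
def yearly_cost(input_m: float, output_m: float,
                input_rate: float, output_rate: float,
                months: int = 12) -> float:
    """Year-1 token spend in USD: monthly token cost times months."""
    return (input_m * input_rate + output_m * output_rate) * months

free_tier = 0.0                               # prototyping on free credits
team = yearly_cost(10, 10, 0.095, 0.425)      # median-priced hosted model
production = yearly_cost(50, 50, 1.20, 1.20)  # highest-priced model
print(team, production)  # 62.4 1440.0
```

Even the worst-case hosted scenario here lands at $1,440/year in pure token spend, which is why the custom-quoted Enterprise tier (dedicated DGX Cloud capacity, support, SLAs) rather than token rates dominates large deployments.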
How NVIDIA NIM Pricing Compares
| Software | Starting Price (per 1M tokens) | Top Price (per 1M tokens) |
|---|---|---|
| NVIDIA NIM | $0.10 | $10 |
| Amazon Bedrock | $0.07 | $75 |
| Anyscale | $0.15 | $5 |
| Baidu ERNIE API | $0.10 | $10 |
| Cerebras Inference API | $0.10 | $6 |
| Claude API | $0.03 | $75 |
NVIDIA NIM Pricing FAQ
01 Does NVIDIA NIM have a free plan?
Yes. NVIDIA NIM includes a Developer (Free credits) tier at $0/month, giving developers access to hosted NIM endpoints for prototyping. When free credits are exhausted, the Pay-as-you-go (hosted NIM endpoints) tier applies based on token consumption.
02 How much does NVIDIA NIM cost per token on hosted endpoints?
Via OpenRouter, NVIDIA NIM-powered models range from $0.04 to $1.20 per million input tokens and $0.16 to $1.20 per million output tokens depending on the model. The provider median across 6 models is $0.095/1M input and $0.425/1M output. The largest model, Llama 3.1 Nemotron 70B Instruct, is $1.20/1M for both input and output tokens. The smallest model, Nemotron Nano 9B V2, starts at $0.04/1M input. Note: OpenRouter pricing reflects router-level rates and may include a modest markup over direct NVIDIA API pricing.
03 What does the NVIDIA NIM Enterprise plan include?
The Enterprise plan (AI Enterprise license + DGX Cloud) is custom-priced and intended for organizations that need to deploy NIM microservices at scale, either on NVIDIA DGX Cloud or on-premises infrastructure with enterprise support and SLAs. Contact NVIDIA sales for pricing.
04 Which models are available via NVIDIA NIM hosted endpoints?
Models available via hosted NIM endpoints (as listed on OpenRouter) include: Llama 3.1 Nemotron 70B Instruct ($1.20/1M input, $1.20/1M output), Nemotron Nano 12B V2 VL ($0.20/1M input, $0.60/1M output), Llama 3.3 Nemotron Super 49B V1.5 ($0.10/1M input, $0.40/1M output), Nemotron 3 Super 120B ($0.09/1M input, $0.45/1M output), Nemotron 3 Nano 30B ($0.05/1M input, $0.20/1M output), and Nemotron Nano 9B V2 ($0.04/1M input, $0.16/1M output).
Is this pricing incorrect? Report it and we'll verify and update it.