Quick Answer
Last verified: April 23, 2026
Medium confidence

NVIDIA NIM costs $0.10 to $10 per million tokens as of April 2026, with 3 plans available, including a free Developer (Free credits) tier. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.



NVIDIA NIM offers 3 pricing tiers: Developer (Free credits), Pay-as-you-go (hosted NIM endpoints), and Enterprise (AI Enterprise license + DGX Cloud). The Pay-as-you-go tier is designed for production inference on open models.

Compared to other LLM API providers, NVIDIA NIM sits at the budget-friendly end of the price range.


How much does NVIDIA NIM cost?

NVIDIA NIM offers 3 pricing plans, starting with a free tier and scaling to custom enterprise pricing. Plans include Developer (Free credits) at $0, plus Pay-as-you-go (hosted NIM endpoints) and Enterprise (AI Enterprise license + DGX Cloud), both custom-priced.

NVIDIA NIM Pricing Overview

NVIDIA NIM has 3 pricing plans, including a free tier. Paid usage ranges from $0.10 to $10 per million tokens. The Developer (Free credits) plan is free and is best for prototyping and evaluation. The Pay-as-you-go (hosted NIM endpoints) plan is priced per token consumed and is designed for production inference on open models. The Enterprise (AI Enterprise license + DGX Cloud) plan requires contacting sales for a custom quote and is designed for regulated enterprise inference workloads.

This pricing was last verified on April 23, 2026.

NVIDIA NIM pricing starts at $0/month on the Developer (Free credits) plan, giving developers API access to hosted NIM microservice endpoints. The Pay-as-you-go (hosted NIM endpoints) tier charges per token consumed, with hosted model rates on OpenRouter ranging from $0.04–$1.20 per million input tokens depending on model size. Organizations requiring enterprise deployment on DGX Cloud or on-premises infrastructure are served by the Enterprise (AI Enterprise license + DGX Cloud) plan, which is custom-quoted.


All NVIDIA NIM Plans & Pricing

| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Developer (Free credits) | Free | Free | Prototyping and evaluation |
| Pay-as-you-go (hosted NIM endpoints) | Custom | Custom | Production inference on open models |
| Enterprise (AI Enterprise license + DGX Cloud) | Contact sales | Contact sales | Regulated enterprise inference workloads |

Developer (Free credits)

  • Free API credits on build.nvidia.com
  • Access to 50+ hosted models
  • NVIDIA, Meta, Mistral, Microsoft open models

Pay-as-you-go (hosted NIM endpoints)

  • Llama 3.3 70B: $0.90/1M tokens blended
  • Mixtral 8x22B: $1.20/1M
  • Nemotron 70B: $0.90/1M
  • Hosted behind NVIDIA inference infra

Enterprise (AI Enterprise license + DGX Cloud)

  • AI Enterprise subscription
  • DGX Cloud dedicated capacity
  • On-prem deployment support

Usage-Based Rates

Per-unit pricing for NVIDIA NIM API usage.

Pay-as-you-go (hosted NIM endpoints)

| Model | Context window | Unit | Rate |
|---|---|---|---|
| llama-3-3-70b-nim | 131,072 | 1M tokens | $0.90 |
| mixtral-8x22b-nim | 65,536 | 1M tokens | $1.20 |
| nemotron-70b | 131,072 | 1M tokens | $0.90 |
  • NIM = NVIDIA Inference Microservice. Same pricing on cloud-hosted + on-prem DGX deployments.
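The per-token arithmetic behind these rates is straightforward. The sketch below hard-codes the rates from the table above as illustrative values; it is not a live price feed, and the dictionary keys are just the model slugs quoted on this page.

```python
# Estimate the cost of a request against a hosted NIM endpoint,
# using the per-1M-token rates quoted in the table above.
# Rates are illustrative values copied from this page, not live data.

RATES_PER_1M = {
    "llama-3-3-70b-nim": 0.90,
    "mixtral-8x22b-nim": 1.20,
    "nemotron-70b": 0.90,
}

def request_cost(model: str, tokens: int) -> float:
    """USD cost for `tokens` tokens at the model's per-1M-token rate."""
    return tokens / 1_000_000 * RATES_PER_1M[model]

# A 2,000-token request on Mixtral 8x22B:
print(f"${request_cost('mixtral-8x22b-nim', 2_000):.6f}")  # → $0.002400
```

At these rates, a single request costs fractions of a cent, which is why the deployment scenarios later on this page come out so low.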

Compare NVIDIA NIM vs Alternatives

Before committing to NVIDIA NIM, compare pricing with alternatives in the same category.


What Companies Actually Pay for NVIDIA NIM

Median per-1M-token pricing across 6 models: $0.095/1M input, $0.425/1M output.

Flagship models in this provider's catalog:

| Model | Input /1M | Output /1M | Blended /1M |
|---|---|---|---|
| nvidia/llama-3.1-nemotron-70b-instruct | $1.20 | $1.20 | $1.20 |
| nvidia/nemotron-nano-12b-v2-vl | $0.20 | $0.60 | $0.40 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | $0.10 | $0.40 | $0.25 |
| nvidia/nemotron-3-super-120b-a12b | $0.09 | $0.45 | $0.27 |
| nvidia/nemotron-3-nano-30b-a3b | $0.05 | $0.20 | $0.125 |
Source: OpenRouter API — medians aggregated from 6 models routed. Reflects router-surface pricing (may include modest markup vs direct provider rates).
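The "Blended /1M" figures above are the simple average of the input and output rates. A quick sketch (with rates hard-coded from this page) reproduces those values:

```python
# Blended per-1M-token rate = average of the input and output rates,
# matching the "Blended /1M" column in the catalog table above.

def blended(input_rate: float, output_rate: float) -> float:
    return (input_rate + output_rate) / 2

# (input, output) rates per 1M tokens, copied from this page:
catalog = {
    "nvidia/nemotron-nano-12b-v2-vl": (0.20, 0.60),
    "nvidia/llama-3.3-nemotron-super-49b-v1.5": (0.10, 0.40),
    "nvidia/nemotron-3-nano-30b-a3b": (0.05, 0.20),
}
for model, (inp, out) in catalog.items():
    print(f"{model}: ${blended(inp, out):.3f}/1M blended")
```

Note that a simple average assumes a 1:1 input-to-output token ratio; a workload that is prompt-heavy or completion-heavy will see a different effective blended rate.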

NVIDIA NIM Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Developer evaluation: $0 Year 1 total

Individual developer exploring NVIDIA NIM hosted endpoints, using the free-credits tier for prototyping and evaluation.

Small team at median token pricing: $5.20/month, $62.40 Year 1 total

Team consuming 10M input tokens and 10M output tokens per month via hosted NIM endpoints at provider median pricing ($0.095/1M input, $0.425/1M output).

Production app on Llama 3.1 Nemotron 70B: $120/month, $1,440 Year 1 total

Production application processing 50M input tokens and 50M output tokens per month via the highest-priced NVIDIA NIM model at $1.20/1M input and $1.20/1M output.
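The two paid scenarios can be reproduced directly from the per-1M-token rates quoted on this page. This is a sketch with hard-coded rates, assuming flat per-token billing with no volume discounts:

```python
# Monthly and Year-1 cost from monthly token volumes (in millions of
# tokens) and per-1M-token rates. Assumes flat per-token billing.

def monthly_cost(input_m: float, output_m: float,
                 input_rate: float, output_rate: float) -> float:
    """USD per month for input_m / output_m million tokens at the given rates."""
    return input_m * input_rate + output_m * output_rate

# Small team: 10M in + 10M out at the provider medians.
small = monthly_cost(10, 10, 0.095, 0.425)
print(f"small team: ${small:.2f}/month, ${small * 12:.2f} Year 1")
# → small team: $5.20/month, $62.40 Year 1

# Production app: 50M in + 50M out on Llama 3.1 Nemotron 70B.
prod = monthly_cost(50, 50, 1.20, 1.20)
print(f"production: ${prod:.2f}/month, ${prod * 12:.2f} Year 1")
# → production: $120.00/month, $1440.00 Year 1
```

Swapping in your own monthly token volumes and the rates for your chosen model gives a first-order budget estimate before any negotiated discounts.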


How NVIDIA NIM Pricing Compares

| Software | Starting price (/1M tokens) | Top price (/1M tokens) |
|---|---|---|
| NVIDIA NIM | $0.10 | $10 |
| Amazon Bedrock | $0.07 | $75 |
| Anyscale | $0.15 | $5 |
| Baidu ERNIE API | $0.10 | $10 |
| Cerebras Inference API | $0.10 | $6 |
| Claude API | $0.03 | $75 |

NVIDIA NIM Pricing FAQ

01 Does NVIDIA NIM have a free plan?

Yes. NVIDIA NIM includes a Developer (Free credits) tier at $0/month, giving developers access to hosted NIM endpoints for prototyping. When free credits are exhausted, the Pay-as-you-go (hosted NIM endpoints) tier applies based on token consumption.

02 How much does NVIDIA NIM cost per token on hosted endpoints?

Via OpenRouter, NVIDIA NIM-powered models range from $0.04 to $1.20 per million input tokens and $0.16 to $1.20 per million output tokens depending on the model. The provider median across 6 models is $0.095/1M input and $0.425/1M output. The largest model, Llama 3.1 Nemotron 70B Instruct, is $1.20/1M for both input and output tokens. The smallest model, Nemotron Nano 9B V2, starts at $0.04/1M input. Note: OpenRouter pricing reflects router-level rates and may include a modest markup over direct NVIDIA API pricing.

03 What does the NVIDIA NIM Enterprise plan include?

The Enterprise plan (AI Enterprise license + DGX Cloud) is custom-priced and intended for organizations that need to deploy NIM microservices at scale, either on NVIDIA DGX Cloud or on-premises infrastructure with enterprise support and SLAs. Contact NVIDIA sales for pricing.

04 Which models are available via NVIDIA NIM hosted endpoints?

Models available via hosted NIM endpoints (as listed on OpenRouter) include: Llama 3.1 Nemotron 70B Instruct ($1.20/1M input, $1.20/1M output), Nemotron Nano 12B V2 VL ($0.20/1M input, $0.60/1M output), Llama 3.3 Nemotron Super 49B V1.5 ($0.10/1M input, $0.40/1M output), Nemotron 3 Super 120B ($0.09/1M input, $0.45/1M output), Nemotron 3 Nano 30B ($0.05/1M input, $0.20/1M output), and Nemotron Nano 9B V2 ($0.04/1M input, $0.16/1M output).
