Quick Answer
Last verified: April 23, 2026
Medium confidence

NVIDIA NIM costs $0.10 to $10 per million tokens as of April 2026, with 3 plans available, including a free Developer (Free credits) tier. Enterprise pricing is available on request. Pricing depends on your chosen tier, contract length, and negotiated discounts.



NVIDIA NIM offers 3 pricing tiers: Developer (Free credits), Pay-as-you-go (hosted NIM endpoints), and Enterprise (AI Enterprise license + DGX Cloud). The Pay-as-you-go tier is designed for production inference on open models.

Compared to other LLM API providers, NVIDIA NIM sits at the budget-friendly end of the price range.


How much does NVIDIA NIM cost?

NVIDIA NIM offers 3 pricing plans, starting with a free tier and scaling to custom enterprise pricing. Plans include Developer (Free credits) at $0, plus Pay-as-you-go (hosted NIM endpoints) and Enterprise (AI Enterprise license + DGX Cloud), both custom-priced.

NVIDIA NIM Pricing Overview

NVIDIA NIM has 3 pricing plans, including a free tier. Paid usage ranges from $0.10 to $10 per million tokens. The Developer (Free credits) plan is free and is best for prototyping and evaluation. The Pay-as-you-go (hosted NIM endpoints) plan is priced per token consumed and is designed for production inference on open models. The Enterprise (AI Enterprise license + DGX Cloud) plan requires contacting sales for a custom quote and is designed for regulated enterprise inference workloads.

This pricing was last verified on April 23, 2026.

NVIDIA NIM pricing starts at $0/month on the Developer (Free credits) plan, giving developers API access to hosted NIM microservice endpoints. The Pay-as-you-go (hosted NIM endpoints) tier charges per token consumed, with hosted model rates on OpenRouter ranging from $0.04–$1.20 per million input tokens depending on model size. Organizations requiring enterprise deployment on DGX Cloud or on-premises infrastructure are served by the Enterprise (AI Enterprise license + DGX Cloud) plan, which is custom-quoted.


All NVIDIA NIM Plans & Pricing

| Plan | Monthly | Annual | Best For |
|---|---|---|---|
| Developer (Free credits) | Free | Free | Prototyping and evaluation |
| Pay-as-you-go (hosted NIM endpoints) | Custom | Custom | Production inference on open models |
| Enterprise (AI Enterprise license + DGX Cloud) | Contact sales | Contact sales | Regulated enterprise inference workloads |

Developer (Free credits)

  • Free API credits on build.nvidia.com
  • Access to 50+ hosted models
  • NVIDIA, Meta, Mistral, Microsoft open models

Pay-as-you-go (hosted NIM endpoints)

  • Llama 3.3 70B: $0.90/1M tokens blended
  • Mixtral 8x22B: $1.20/1M
  • Nemotron 70B: $0.90/1M
  • Hosted behind NVIDIA inference infra

Enterprise (AI Enterprise license + DGX Cloud)

  • AI Enterprise subscription
  • DGX Cloud dedicated capacity
  • On-prem deployment support

Usage-Based Rates

Per-unit pricing for NVIDIA NIM API usage.

Pay-as-you-go (hosted NIM endpoints)

| Model | Context window | Unit | Rate |
|---|---|---|---|
| llama-3-3-70b-nim | 131,072 | 1M tokens | $0.90 |
| mixtral-8x22b-nim | 65,536 | 1M tokens | $1.20 |
| nemotron-70b | 131,072 | 1M tokens | $0.90 |
  • NIM = NVIDIA Inference Microservice. Same pricing on cloud-hosted + on-prem DGX deployments.
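The per-token arithmetic behind these rates is straightforward. The sketch below hard-codes the rates from the table above as illustrative values; it is not a live price feed, and the dictionary keys are just the model slugs quoted on this page.

```python
# Estimate the cost of a request against a hosted NIM endpoint,
# using the per-1M-token rates quoted in the table above.
# Rates are illustrative values copied from this page, not live data.

RATES_PER_1M = {
    "llama-3-3-70b-nim": 0.90,
    "mixtral-8x22b-nim": 1.20,
    "nemotron-70b": 0.90,
}

def request_cost(model: str, tokens: int) -> float:
    """USD cost for `tokens` tokens at the model's per-1M-token rate."""
    return tokens / 1_000_000 * RATES_PER_1M[model]

# A 2,000-token request on Mixtral 8x22B:
print(f"${request_cost('mixtral-8x22b-nim', 2_000):.6f}")  # → $0.002400
```

At these rates, a single request costs fractions of a cent, which is why the deployment scenarios later on this page come out so low.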

Compare NVIDIA NIM vs Alternatives

Before committing to NVIDIA NIM, compare pricing with alternatives in the same category.


What Companies Actually Pay for NVIDIA NIM

Median per-1M-token pricing across 6 models: $0.095/1M input, $0.425/1M output.

Flagship models in this provider's catalog:

| Model | Input /1M | Output /1M | Blended /1M |
|---|---|---|---|
| nvidia/llama-3.1-nemotron-70b-instruct | $1.20 | $1.20 | $1.20 |
| nvidia/nemotron-nano-12b-v2-vl | $0.20 | $0.60 | $0.40 |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | $0.10 | $0.40 | $0.25 |
| nvidia/nemotron-3-super-120b-a12b | $0.09 | $0.45 | $0.27 |
| nvidia/nemotron-3-nano-30b-a3b | $0.05 | $0.20 | $0.125 |
Source: OpenRouter API — medians aggregated from 6 models routed. Reflects router-surface pricing (may include modest markup vs direct provider rates).
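The "Blended /1M" figures above are the simple average of the input and output rates. A quick sketch (with rates hard-coded from this page) reproduces those values:

```python
# Blended per-1M-token rate = average of the input and output rates,
# matching the "Blended /1M" column in the catalog table above.

def blended(input_rate: float, output_rate: float) -> float:
    return (input_rate + output_rate) / 2

# (input, output) rates per 1M tokens, copied from this page:
catalog = {
    "nvidia/nemotron-nano-12b-v2-vl": (0.20, 0.60),
    "nvidia/llama-3.3-nemotron-super-49b-v1.5": (0.10, 0.40),
    "nvidia/nemotron-3-nano-30b-a3b": (0.05, 0.20),
}
for model, (inp, out) in catalog.items():
    print(f"{model}: ${blended(inp, out):.3f}/1M blended")
```

Note that a simple average assumes a 1:1 input-to-output token ratio; a workload that is prompt-heavy or completion-heavy will see a different effective blended rate.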

NVIDIA NIM Year 1 Total Cost by Company Size

Real deployment costs including licenses, implementation, training, and admin — not just the sticker price.

Developer evaluation: $0 Year 1 total

Individual developer exploring NVIDIA NIM hosted endpoints, using the free-credits tier for prototyping and evaluation.

Small team at median token pricing: $5.20/month, $62.40 Year 1 total

Team consuming 10M input tokens and 10M output tokens per month via hosted NIM endpoints at provider median pricing ($0.095/1M input, $0.425/1M output).

Production app on Llama 3.1 Nemotron 70B: $120/month, $1,440 Year 1 total

Production application processing 50M input tokens and 50M output tokens per month via the highest-priced NVIDIA NIM model at $1.20/1M input and $1.20/1M output.
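The two paid scenarios can be reproduced directly from the per-1M-token rates quoted on this page. This is a sketch with hard-coded rates, assuming flat per-token billing with no volume discounts:

```python
# Monthly and Year-1 cost from monthly token volumes (in millions of
# tokens) and per-1M-token rates. Assumes flat per-token billing.

def monthly_cost(input_m: float, output_m: float,
                 input_rate: float, output_rate: float) -> float:
    """USD per month for input_m / output_m million tokens at the given rates."""
    return input_m * input_rate + output_m * output_rate

# Small team: 10M in + 10M out at the provider medians.
small = monthly_cost(10, 10, 0.095, 0.425)
print(f"small team: ${small:.2f}/month, ${small * 12:.2f} Year 1")
# → small team: $5.20/month, $62.40 Year 1

# Production app: 50M in + 50M out on Llama 3.1 Nemotron 70B.
prod = monthly_cost(50, 50, 1.20, 1.20)
print(f"production: ${prod:.2f}/month, ${prod * 12:.2f} Year 1")
# → production: $120.00/month, $1440.00 Year 1
```

Swapping in your own monthly token volumes and the rates for your chosen model gives a first-order budget estimate before any negotiated discounts.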


How NVIDIA NIM Pricing Compares

| Software | Starting price (/1M tokens) | Top price (/1M tokens) |
|---|---|---|
| NVIDIA NIM | $0.10 | $10 |
| Amazon Bedrock | $0.07 | $75 |
| Anyscale | $0.15 | $5 |
| Baidu ERNIE API | $0.10 | $10 |
| Cerebras Inference API | $0.10 | $6 |
| Claude API | $0.03 | $75 |

NVIDIA NIM Pricing FAQ

01 Does NVIDIA NIM have a free plan?

Yes. NVIDIA NIM includes a Developer (Free credits) tier at $0/month, giving developers access to hosted NIM endpoints for prototyping. When free credits are exhausted, the Pay-as-you-go (hosted NIM endpoints) tier applies based on token consumption.

02 How much does NVIDIA NIM cost per token on hosted endpoints?

Via OpenRouter, NVIDIA NIM-powered models range from $0.04 to $1.20 per million input tokens and $0.16 to $1.20 per million output tokens depending on the model. The provider median across 6 models is $0.095/1M input and $0.425/1M output. The largest model, Llama 3.1 Nemotron 70B Instruct, is $1.20/1M for both input and output tokens. The smallest model, Nemotron Nano 9B V2, starts at $0.04/1M input. Note: OpenRouter pricing reflects router-level rates and may include a modest markup over direct NVIDIA API pricing.

03 What does the NVIDIA NIM Enterprise plan include?

The Enterprise plan (AI Enterprise license + DGX Cloud) is custom-priced and intended for organizations that need to deploy NIM microservices at scale, either on NVIDIA DGX Cloud or on-premises infrastructure with enterprise support and SLAs. Contact NVIDIA sales for pricing.

04 Which models are available via NVIDIA NIM hosted endpoints?

Models available via hosted NIM endpoints (as listed on OpenRouter) include: Llama 3.1 Nemotron 70B Instruct ($1.20/1M input, $1.20/1M output), Nemotron Nano 12B V2 VL ($0.20/1M input, $0.60/1M output), Llama 3.3 Nemotron Super 49B V1.5 ($0.10/1M input, $0.40/1M output), Nemotron 3 Super 120B ($0.09/1M input, $0.45/1M output), Nemotron 3 Nano 30B ($0.05/1M input, $0.20/1M output), and Nemotron Nano 9B V2 ($0.04/1M input, $0.16/1M output).
