OpenAI Whisper Pricing 2026: $0.003-$0.006/Min API Costs


Quick Answer

Last verified: January 28, 2026

High confidence

Whisper (OpenAI) costs $0.003 to $0.006 per minute as of March 2026, with 3 plans available. Plans: GPT-4o Mini Transcribe ($0.003/min), Whisper / GPT-4o Transcribe ($0.006/min), and Enterprise (via ChatGPT Enterprise / API) (custom pricing, available on request). New accounts receive $5 in free credits. Enterprise pricing depends on volume, contract length, and negotiated discounts.


Free tier: $5 in free credits for new accounts (no recurring free allowance)

Whisper (OpenAI) offers 3 pricing tiers: GPT-4o Mini Transcribe, Whisper / GPT-4o Transcribe, and Enterprise (via ChatGPT Enterprise / API). The Whisper / GPT-4o Transcribe plan is best for developers needing accurate, affordable transcription with the simplest possible integration and no add-on fees.

Compared to other AI transcription APIs, Whisper (OpenAI) is positioned at the budget-friendly end of the price range.

6 documented hidden costs beyond list price

How much does Whisper (OpenAI) cost?

Whisper (OpenAI) offers 3 pricing plans, starting at $0.003/minute and scaling to custom enterprise pricing. Plans include GPT-4o Mini Transcribe ($0.003/min), Whisper / GPT-4o Transcribe ($0.006/min), and Enterprise (via ChatGPT Enterprise / API) (custom pricing).

Whisper (OpenAI) Pricing Overview

Whisper (OpenAI) has 3 pricing plans, and new accounts receive $5 in free credits. Self-serve plans range from $0.003 to $0.006/minute. The GPT-4o Mini Transcribe plan costs $0.003/min and is best for cost-sensitive applications needing basic transcription at the lowest per-minute rate in OpenAI's lineup. The Whisper / GPT-4o Transcribe plan costs $0.006/min and is best for developers needing accurate, affordable transcription with the simplest possible integration and no add-on fees. The Enterprise (via ChatGPT Enterprise / API) plan is custom-priced and is best for large organizations processing high volumes needing custom rate limits, enterprise security, and volume discounts.

There are at least 6 documented hidden costs beyond Whisper (OpenAI)'s list price, including billing overruns, diarization post-processing, file chunking, compliance gaps, self-hosting infrastructure, and rate limits.

This pricing was last verified on January 28, 2026 from 5 independent sources.


OpenAI Whisper API pricing is $0.003 to $0.006 per minute as of March 2026. GPT-4o Mini Transcribe costs $0.003/min ($0.18/hour), while Whisper and GPT-4o Transcribe cost $0.006/min ($0.36/hour). New accounts receive $5 in free credits covering approximately 833 minutes of transcription. Billing is per second with no minimum charge. Verified from 5 pricing sources by Costbench, the software pricing database tracking 1,000+ products.
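The per-second billing and free-credit figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the list rates quoted in this guide; the dictionary keys are plain labels chosen for illustration, not API model identifiers.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute list prices quoted in this guide (USD). Keys are illustrative labels.
RATES_PER_MIN = {
    "gpt-4o-mini-transcribe": Decimal("0.003"),
    "whisper": Decimal("0.006"),
    "gpt-4o-transcribe": Decimal("0.006"),
}


def transcription_cost(seconds: int, plan: str) -> Decimal:
    """Cost in USD for `seconds` of audio, billed per second with no minimum."""
    per_second = RATES_PER_MIN[plan] / Decimal(60)
    return (per_second * Decimal(seconds)).quantize(
        Decimal("0.0001"), rounding=ROUND_HALF_UP
    )


def minutes_covered(credit_usd: str, plan: str) -> int:
    """Whole minutes of audio that a credit balance pays for."""
    return int(Decimal(credit_usd) / RATES_PER_MIN[plan])


print(transcription_cost(3600, "whisper"))                 # one hour of audio
print(transcription_cost(3600, "gpt-4o-mini-transcribe"))  # one hour, cheaper tier
print(minutes_covered("5", "whisper"))                     # minutes from $5 credit
```

Using `Decimal` rather than floats keeps sub-cent amounts exact, which matters when summing many short clips.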

OpenAI Whisper is a general-purpose speech recognition system available both as an open-source model for self-hosting and as a hosted API. Trained on 680,000 hours of multilingual audio, Whisper supports 99+ languages and delivers high-accuracy transcription at one of the lowest per-minute rates among commercial transcription APIs. OpenAI also offers GPT-4o Transcribe and GPT-4o Mini Transcribe as newer alternatives with improved accuracy and built-in speaker diarization.

The key trade-off with Whisper is simplicity vs features. At $0.006/min, Whisper is 75% cheaper than AWS Transcribe ($0.024/min) and 62% cheaper than Google Cloud Speech-to-Text ($0.016/min) at base rates. However, Whisper lacks built-in audio intelligence features like entity detection, topic classification, and sentiment analysis that competitors like AssemblyAI and Deepgram offer. For teams already in the OpenAI ecosystem using GPT-4 and embeddings, Whisper provides the simplest possible integration with a single API call and flat-rate pricing.

In this 2026 pricing guide, we break down Whisper's per-minute costs across all model variants, calculate real-world costs for common transcription workloads, expose hidden costs around file size limits, rate throttling, and compliance gaps, and compare Whisper to alternatives like Deepgram, AssemblyAI, AWS Transcribe, and Google Cloud Speech-to-Text.

How Whisper (OpenAI) Pricing Compares

Whisper (OpenAI) starts at $0.003/minute. Compare: AssemblyAI ($0.15–$0.37/minute), Deepgram ($0.00–$0.02/minute), Rev AI ($0.00–$0.02/minute).


All Whisper (OpenAI) Plans & Pricing

Plan	Price	Limits	Best For
GPT-4o Mini Transcribe	$0.003/min	Max file size: 25 MB per request; rate limits: tier-based (50 RPM default)	Cost-sensitive applications needing basic transcription at the lowest per-minute rate in OpenAI's lineup
Whisper / GPT-4o Transcribe	$0.006/min	Max file size: 25 MB per request; rate limits: tier-based (scales with usage)	Developers needing accurate, affordable transcription with the simplest possible integration and no add-on fees
Enterprise (via ChatGPT Enterprise / API)	Contact Sales	Minimum commitment: custom (contact sales); rate limits: custom (significantly higher)	Large organizations processing high volumes needing custom rate limits, enterprise security, and volume discounts

View all features by plan

GPT-4o Mini Transcribe

Speech-to-text at $0.003/min ($0.18/hour)
99+ language support
Multiple audio format support (mp3, mp4, wav, webm, etc.)
Billed per second with no minimum charge
Near real-time processing (5-10x faster than real-time)
Also available as token-based pricing at $1.25/1M input tokens

Whisper / GPT-4o Transcribe

Speech-to-text at $0.006/min ($0.36/hour)
Whisper (legacy) and GPT-4o Transcribe at same rate
GPT-4o Transcribe with speaker diarization at $0.006/min
99+ language support with improved accuracy
Billed per second with no minimum charge
$5 free credits for new accounts (~833 minutes)
Also available as token-based pricing at $2.50/1M input tokens

Enterprise (via ChatGPT Enterprise / API)

All Whisper and GPT-4o Transcribe models
Higher rate limits and concurrency
Dedicated account management
Custom usage-based volume discounts
Admin controls and SSO
Data processing addendum (DPA) available
Priority access to new models and features


Compare Whisper (OpenAI) vs Alternatives

Before committing to Whisper (OpenAI), compare pricing with these 3 alternatives in the same category.

VS AssemblyAI

From $0.15

Teams needing rich audio intelligence features (entity detection, topic classification, summarization) beyond basic transcription

VS Deepgram

From $0.0043

Real-time streaming applications needing ultra-low latency and per-second billing granularity

VS Rev AI

From $0.003

High-volume batch transcription with human transcription fallback for mission-critical accuracy


Whisper (OpenAI) Year 1 Total Cost by Company Size

Estimated API usage costs at the $0.006/min GPT-4o Transcribe rate. Implementation, infrastructure, and other hidden costs are not included in these figures.

Startup Podcast Transcription (100 hours/month): $432 Year 1 total

$36/month ($432/year)

A content startup transcribing 100 hours of podcast audio monthly using GPT-4o Transcribe for high accuracy with speaker diarization. Processing 6,000 minutes per month.

SaaS Meeting Recorder (500 hours/month): $2,160 Year 1 total

$180/month ($2,160/year)

A meeting productivity SaaS transcribing 500 hours of meetings monthly using GPT-4o Transcribe with diarization for speaker identification across 30,000 minutes per month.

Enterprise Call Center (5,000 hours/month): $21,600 Year 1 total

$1,800/month ($21,600/year)

A large enterprise transcribing 5,000 hours of customer calls monthly using GPT-4o Transcribe, requiring high accuracy and speaker diarization across 300,000 minutes per month.
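The three scenarios above follow from the same formula: hours per month times 60 minutes times the per-minute rate, then times 12 for the year. A short sketch, assuming the $0.006/min rate quoted in this guide and no volume discounts:

```python
RATE_PER_MIN = 0.006  # USD, GPT-4o Transcribe list rate quoted in this guide


def usage_cost(hours_per_month: int) -> tuple[int, float, float]:
    """Return (minutes per month, monthly cost, yearly cost) at the list rate."""
    minutes = hours_per_month * 60
    monthly = round(minutes * RATE_PER_MIN, 2)
    return minutes, monthly, round(monthly * 12, 2)


scenarios = [
    ("Startup podcast", 100),
    ("SaaS meeting recorder", 500),
    ("Enterprise call center", 5000),
]
for label, hours in scenarios:
    minutes, monthly, yearly = usage_cost(hours)
    print(f"{label}: {minutes:,} min/month, ${monthly:,.2f}/month, ${yearly:,.2f}/year")
```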

How Whisper (OpenAI) Pricing Compares

Software	Starting Price	Top Price
Whisper (OpenAI)	$0.003/minute	$0.006/minute
AssemblyAI	Free	$75/hour
AWS Transcribe	Free	$6.75/minute
Deepgram	Custom	Custom
Google Cloud Speech-to-Text	Custom	Custom
Rev AI	$0.00167/minute	$0.033/minute


6 Whisper (OpenAI) Hidden Costs Beyond the List Price

Beyond the listed price, Whisper (OpenAI) has at least 6 documented hidden costs that can significantly increase total cost of ownership.


Actual costs may exceed the $0.006/min headline rate: Developer reports indicate real-world costs averaging $0.010/min due to billing rounding, retries on failed requests, and processing overhead -- across 648 hours one developer reported spending $397 vs an estimated $233 (70% over budget)
No built-in speaker diarization on legacy Whisper model: While GPT-4o Transcribe now includes diarization at $0.006/min, the legacy Whisper model requires a separate post-processing step using GPT-4o or a third-party service, adding $0.002-$0.01/min in additional costs
25 MB file size limit forces chunking overhead: Audio files over 25 MB must be split into smaller segments before upload, requiring engineering effort for chunk management, overlap handling, and transcript reassembly -- budget $500-$1,500 for initial chunking pipeline development
No HIPAA BAA available: OpenAI does not offer a Business Associate Agreement, making the Whisper API unusable for Protected Health Information (PHI) -- organizations with healthcare data must self-host Whisper on HIPAA-compliant infrastructure at $1,400+/month
Self-hosting break-even is well above 500 hours/month: At $0.006/min, 500 hours costs $180/month via API vs ~$276/month for self-hosted GPU infrastructure, so the API is still cheaper at that volume -- the two costs meet at roughly 767 hours/month ($276 / $0.36 per hour), and self-hosting beyond that point also requires DevOps expertise and GPU management overhead
Rate limits throttle high-volume processing: Default tier allows only 50 requests per minute -- processing 10,000+ files requires careful queue management, retry logic, and potentially upgrading to higher API tiers which require spending history with OpenAI
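The queue management and retry logic mentioned in the last item can be sketched as a generic wrapper. This is an illustration only: `RateLimited` and `with_backoff` are hypothetical names, and the wrapped callable stands in for whatever function actually submits a file. The 50 requests-per-minute figure is the default quoted above.

```python
import math
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error the caller raises when a request is rate limited."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` with exponential backoff plus jitter on RateLimited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay * 0.1))  # up to 10% jitter
    raise RuntimeError("unreachable")


def min_minutes(files: int, requests_per_minute: int = 50) -> int:
    """Lower bound on wall-clock minutes to submit `files` requests."""
    return math.ceil(files / requests_per_minute)


print(min_minutes(10_000))  # minutes needed just to submit 10,000 files at 50 RPM
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays. Whether a retried request is billed again depends on how it failed, so it is worth logging retries alongside usage.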

Tip

If you are negotiating an Enterprise contract, ask your OpenAI sales contact about these costs upfront. Getting them in writing before signing can save you from surprise charges later.

Whisper (OpenAI) Pricing FAQ

01 How much does OpenAI Whisper API cost?

OpenAI Whisper API costs $0.006 per minute ($0.36/hour) for both the legacy Whisper model and the newer GPT-4o Transcribe model. GPT-4o Mini Transcribe is available at $0.003/min ($0.18/hour) for cost-sensitive workloads. GPT-4o Transcribe with speaker diarization is also $0.006/min. New accounts receive $5 in free credits covering approximately 833 minutes of Whisper transcription.

02 Is OpenAI Whisper free?

OpenAI Whisper is available as both a free open-source model and a paid API. The open-source model can be self-hosted for free (but requires GPU infrastructure costing $276+/month). The API gives new accounts $5 in free credits covering approximately 833 minutes of transcription. After credits are exhausted, you pay $0.006/min for Whisper or $0.003/min for GPT-4o Mini Transcribe with no free monthly refresh.

03 What is OpenAI Whisper?

OpenAI Whisper is a general-purpose speech recognition model trained on 680,000 hours of multilingual audio data. It is available both as an open-source model (for self-hosting) and as a hosted API through OpenAI's platform. Whisper supports 99+ languages for transcription and translation, handles various audio formats, and processes audio at 5-10x real-time speed. OpenAI has also released GPT-4o Transcribe and GPT-4o Mini Transcribe as successor models offering improved accuracy and features like built-in speaker diarization.

04 OpenAI Whisper vs Deepgram: which is better?

OpenAI Whisper costs $0.006/min vs Deepgram Nova-3 at $0.0043-$0.0077/min depending on plan. Deepgram is cheaper at scale with its Growth plan ($0.0043/min) and offers real-time streaming with sub-300ms latency, while Whisper processes at 5-10x real-time but is not designed for live streaming. Deepgram also provides $200 in free credits vs Whisper's $5. Choose Whisper for simple integration within the OpenAI ecosystem and batch transcription; choose Deepgram for real-time applications, lower per-minute costs at volume, and richer audio intelligence features.

05 OpenAI Whisper vs AWS Transcribe: which is cheaper?

OpenAI Whisper at $0.006/min is 75% cheaper than AWS Transcribe's base rate of $0.024/min. However, AWS Transcribe offers volume-based tiered discounts dropping to $0.0078/min at 5M+ minutes, narrowing the gap for very high-volume users. AWS also includes built-in features like speaker diarization, custom vocabularies, and PII redaction that Whisper lacks natively. Choose Whisper for straightforward, affordable transcription; choose AWS Transcribe if you need deep AWS ecosystem integration, custom vocabularies, or HIPAA-compliant medical transcription.

06 Can I self-host OpenAI Whisper for free?

Yes, Whisper is open-source under the MIT license and can be self-hosted on your own infrastructure at no software cost. However, self-hosting requires GPU infrastructure costing approximately $276/month minimum for a dedicated GPU instance, plus DevOps overhead of $50-$200/month. The break-even point vs the API is roughly 767 hours of transcription per month on GPU cost alone ($276 / $0.36 per hour), rising to roughly 900-1,300 hours once DevOps overhead is included. Self-hosting makes sense for organizations needing data sovereignty, HIPAA compliance, or processing extremely high volumes where the $0.006/min API rate exceeds fixed infrastructure costs.
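The break-even volume is the fixed monthly self-hosting cost divided by the API's hourly rate. A small sketch, assuming the $276/month GPU figure and the $50-$200/month DevOps range quoted in this guide (actual infrastructure costs vary by provider):

```python
API_RATE_PER_HOUR = 0.36    # $0.006/min * 60
GPU_COST_PER_MONTH = 276.0  # minimum dedicated GPU figure quoted in this guide


def break_even_hours(devops_per_month: float = 0.0) -> float:
    """Monthly audio hours at which API spend equals self-hosting spend."""
    return (GPU_COST_PER_MONTH + devops_per_month) / API_RATE_PER_HOUR


for devops in (0, 50, 200):
    hours = break_even_hours(devops)
    print(f"DevOps ${devops}/month: break-even at about {hours:.0f} hours/month")
```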

07 What audio formats does OpenAI Whisper support?

The Whisper API supports mp3, mp4, mpeg, mpga, m4a, wav, and webm audio formats with a maximum file size of 25 MB per request. Files larger than 25 MB must be split into smaller chunks before upload. The API processes audio at 5-10x real-time speed, meaning a 60-minute file typically completes in 6-12 minutes. Billing is calculated per second of audio processed with no minimum charge per request.
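For files over the limit, chunk boundaries can be planned from the audio bitrate before any splitting is done. The sketch below assumes constant-bitrate audio and reads "25 MB" conservatively as 25,000,000 bytes. `plan_chunks` is a hypothetical helper, and the 5% headroom for container overhead is an arbitrary safety margin.

```python
import math

MAX_BYTES = 25_000_000  # assumption: conservative reading of the 25 MB limit


def plan_chunks(
    duration_s: float,
    bitrate_kbps: int,
    overlap_s: float = 0.0,
    headroom: float = 0.95,
) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds for chunks under the size limit."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    chunk_s = math.floor(MAX_BYTES * headroom / bytes_per_second)
    if chunk_s <= overlap_s:
        raise ValueError("overlap must be shorter than the chunk length")
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break
        start = end - overlap_s  # step back so words at the boundary are not cut
    return chunks


# A 3-hour recording encoded at 128 kbps
for start, end in plan_chunks(3 * 3600, 128):
    print(f"{start:>8.0f}s to {end:>8.0f}s")
```

Any overlap between chunks is audio submitted twice, so it adds slightly to the billed duration and needs de-duplicating when the transcripts are merged.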

08 Does OpenAI Whisper charge for silence in audio?

Yes, OpenAI Whisper charges for the full duration of submitted audio including silence, music, and non-speech segments. A 60-minute file with 30 minutes of silence costs the same $0.36 as a 60-minute file of continuous speech. To reduce costs, preprocess audio with tools like FFmpeg to strip silence or use voice activity detection (VAD) before sending to the API. This can reduce costs by 20-40% for recordings with significant dead air, such as meeting recordings or surveillance audio.
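To get a rough idea of how much of a recording is speech before paying for it, a naive energy-threshold detector is enough. This is a simplified sketch, not a production VAD: it assumes mono samples normalized to the range -1.0 to 1.0, and the 30 ms frame size and 0.01 RMS threshold are illustrative defaults that would need tuning for real audio.

```python
import math
from typing import Sequence

RATE_PER_MIN = 0.006  # USD, list rate quoted in this guide


def speech_seconds(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 30,
    rms_threshold: float = 0.01,
) -> float:
    """Estimate seconds of non-silent audio using per-frame RMS energy."""
    frame_len = sample_rate * frame_ms // 1000
    voiced_frames = 0
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        if rms >= rms_threshold:
            voiced_frames += 1
    return voiced_frames * frame_ms / 1000


def cost_usd(seconds: float) -> float:
    """API cost for `seconds` of submitted audio at the list rate."""
    return seconds / 60 * RATE_PER_MIN


# Synthetic example: 1 second of a 440 Hz tone followed by 1 second of silence
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
audio = tone + [0.0] * sr
voiced = speech_seconds(audio, sr)
print(f"voiced: {voiced:.2f}s of {len(audio) / sr:.2f}s")
print(f"cost untrimmed: ${cost_usd(len(audio) / sr):.6f}, trimmed: ${cost_usd(voiced):.6f}")
```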
