Whisper is OpenAI's open-source speech recognition model, the engine behind a generation of dictation apps. Free to self-host; $0.006/min via the API.
✓ Best for
- Developers building custom dictation or transcription applications
- Self-hosters wanting maximum privacy and zero API cost
- Researchers needing a reliable multilingual ASR baseline
- Teams evaluating a foundation model before choosing a managed service
✕ Not the best fit for
- Non-technical users needing a ready-to-use consumer app
- Users needing real-time streaming (Whisper is not optimized for sub-300 ms latency)
- Projects requiring SLAs, support, or enterprise compliance from a vendor
Features
- ✓ Open source under MIT License — free to use, modify, and deploy
- ✓ Trained on 680,000 hours of multilingual web audio
- ✓ Supports 99 languages with strong multilingual accuracy
- ✓ Robust performance on accented, noisy, and domain-specific audio
- ✓ Multiple model sizes: Tiny, Base, Small, Medium, Large (whisper-large-v3)
- ✓ Available via OpenAI API or self-hosted with whisper.cpp / WhisperKit
- ✓ Foundation for dozens of third-party dictation apps
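For the self-hosted Python route, a minimal sketch might look like the following. It assumes the `openai-whisper` package and `ffmpeg` are installed; the helper name `transcribe_locally` and the `MODEL_SIZES` tuple are illustrative, not part of the library.

```python
# Illustrative list of checkpoint names matching the sizes listed above.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large-v3")


def transcribe_locally(audio_path: str, size: str = "base") -> str:
    """Transcribe one audio file with a locally loaded Whisper model.

    Hypothetical helper: assumes `pip install openai-whisper` and ffmpeg on PATH.
    """
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size {size!r}; expected one of {MODEL_SIZES}")

    # Imported lazily so the size check above works without the package installed.
    import whisper

    model = whisper.load_model(size)       # downloads the weights on first use
    result = model.transcribe(audio_path)  # dict with "text", "segments", "language"
    return result["text"]
```

Calling `transcribe_locally("meeting.mp3", size="small")` would return the transcript as a single string. Smaller sizes run faster at the cost of accuracy, so the right choice depends on the hardware available.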
Pricing
| Plan | Price | Billing | Notes |
|---|---|---|---|
| Self-hosted (open source) | $0 | forever | MIT License; runs locally via whisper.cpp, WhisperKit, or Python; compute cost only |
| OpenAI API (Whisper-1 / GPT-4o Transcribe) | $0.006 | per minute | Via OpenAI API; $0.36/hr; supports mp3, mp4, m4a, wav, webm; 25MB max file size |
| OpenAI API (GPT-4o-mini Transcribe) | $0.003 | per minute | $0.18/hr; lower cost option; $5 free credit on new OpenAI account |
Prices change often — confirm on the vendor's site before buying.
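To budget for the API, the per-minute rates above can be turned into a rough estimate. This is a sketch only: the rates are copied from the table and may be outdated, and the function name and dictionary keys are illustrative.

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the pricing table above (may be outdated).
PRICE_PER_MINUTE = {
    "whisper-1": Decimal("0.006"),
    "gpt-4o-transcribe": Decimal("0.006"),
    "gpt-4o-mini-transcribe": Decimal("0.003"),
}


def estimate_cost(minutes: float, model: str = "whisper-1") -> Decimal:
    """Return the estimated API cost in USD for `minutes` of audio."""
    # Decimal avoids binary floating-point rounding noise in currency math.
    return PRICE_PER_MINUTE[model] * Decimal(str(minutes))


# One hour of audio at each rate reproduces the hourly figures in the table.
print(estimate_cost(60))                            # 0.360
print(estimate_cost(60, "gpt-4o-mini-transcribe"))  # 0.180
```

At these rates, 100 hours of audio per month would come to about $36 on the $0.006 tier, which is the figure to compare against the compute cost of self-hosting.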