Authentic, lifelike TTS voice models built for scale
Pros
- Low per-minute cost (~$0.018-$0.024 per minute)
- Built for voice-agent scale
- Flexible billing models
- Natural conversational voices
Cons
- Smaller voice catalog than ElevenLabs
- Developer-focused, no consumer UI
- Voice cloning gated to higher tiers
✓ Where it shines / best for
- Developers building real-time conversational voice agents
- Contact center and telephony applications needing low latency
- Enterprises requiring on-prem voice synthesis
✕ Not the best fit for
- Non-technical users wanting a GUI reader app
- Long-form audiobook/voiceover where latency is irrelevant
- Offline mobile on-device use without infrastructure
Features
- ✓ API access
- ✓ Free tier
- ✓ Text-to-speech
- ✓ Real-time
- ✓ On-device / offline
- ✓ Voice agents
- ✓ Low latency
- ✓ Streaming
- ✓ Conversational
- ✓ On-premises
Pricing
| Plan | Price | Billing | Notes |
|---|---|---|---|
| Free | $0 | free | Free credits/characters to test the TTS API |
| Pay-as-you-go | From ~$0.0006 | per generated second | Usage-based, low-latency TTS; billed per second of audio or per character |
| Pro / Growth | From ~$100 | month | Higher rate limits, volume pricing, more concurrency |
| Enterprise / On-prem | Custom | custom | Self-hosted/on-prem deployment, SLAs, dedicated support; contact sales |
Pricing verified from the official source. Prices change often — confirm on the vendor's site before buying.
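To make the pay-as-you-go figure easier to compare against other vendors, here is a minimal sketch of the arithmetic. It assumes the approximate "from ~$0.0006 per generated second" rate in the table above; the function names and the 10,000-minute workload are hypothetical and chosen only for illustration. Actual rates may differ by plan and volume.

```python
# Rough cost estimate for usage-based TTS billing.
# ASSUMPTION: 0.0006 USD per generated second is the approximate "from" rate
# listed in the pricing table, not a quoted price.
ASSUMED_RATE_PER_SECOND_USD = 0.0006


def estimate_cost_usd(audio_seconds: float,
                      rate_per_second: float = ASSUMED_RATE_PER_SECOND_USD) -> float:
    """Return the estimated charge for a given amount of generated audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds * rate_per_second


def per_minute_rate_usd(rate_per_second: float = ASSUMED_RATE_PER_SECOND_USD) -> float:
    """Convert a per-second rate into the equivalent per-minute rate."""
    return rate_per_second * 60


if __name__ == "__main__":
    # Hypothetical workload: 10,000 minutes of generated audio in a month.
    monthly_seconds = 10_000 * 60
    print(f"Per-minute equivalent: ${per_minute_rate_usd():.4f}")
    print(f"Estimated monthly cost: ${estimate_cost_usd(monthly_seconds):.2f}")
```

At the assumed rate, one minute of audio comes to roughly $0.036 and 10,000 minutes to roughly $360. That per-minute figure is higher than the ~$0.018-$0.024 range quoted under Pros, so treat both numbers as approximate and plug in the rate from the vendor's current price sheet.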
Specifications
| Spec | Value |
|---|---|
| Models | Mist v3, Arcana v3 |
| Billing | Per character or per minute |