OpenAI Realtime API (gpt-realtime)

by OpenAI

Low-latency speech-to-speech for voice agents

Pros

  • Industry-leading conversational latency
  • Tight tool-use integration
  • Caching cuts input cost sharply: cached input is $0.40 per 1M tokens, versus $32.00 for uncached audio input

Cons

  • Token-based pricing is hard to forecast
  • Uncached cost can exceed $0.30/min
  • Requires engineering to deploy

✓ Where it shines / best for

  • Developers building real-time voice agents and assistants
  • Voice-enabled customer support and phone automation
  • Low-latency conversational apps requiring tool use

✕ Not the best fit for

  • No-code users wanting a ready-made app
  • Cost-sensitive batch TTS where flat per-character pricing is cheaper
  • Offline/on-device speech needs

Features

  • ✓ Speech-to-speech over a single WebSocket/WebRTC connection (low latency)
  • ✓ Native audio in/out without separate STT and TTS pipelines
  • ✓ Function/tool calling during live voice conversations
  • ✓ Built-in voices with expressive, natural intonation
  • ✓ Interruption handling and turn detection (VAD)
  • ✓ Supports SIP for phone-call integration
  • ✓ Image input alongside audio in realtime sessions
  • ✓ Multilingual speech understanding and generation
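As a rough illustration of how tool calling and turn detection from the list above might be configured, the sketch below builds a `session.update` event as a plain dictionary and serializes it to JSON for sending over the WebSocket connection. The field names (`turn_detection`, `server_vad`, `tools`, `tool_choice`) are assumptions based on the event shape published for earlier versions of the API and may be nested differently in current releases; `get_weather` and `build_session_update` are hypothetical names invented for this example. No network call is made.

```python
import json


def build_session_update(tools, use_server_vad=True):
    """Build a session.update event dict (assumed shape, not verified
    against the current API reference)."""
    session = {"tools": tools, "tool_choice": "auto"}
    if use_server_vad:
        # Assumed field: let the server detect when the user stops speaking.
        session["turn_detection"] = {"type": "server_vad"}
    return {"type": "session.update", "session": session}


# Hypothetical tool definition, described with a JSON Schema for its arguments.
weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

event = build_session_update([weather_tool])
payload = json.dumps(event)  # the string a client would send over the socket
print(payload)
```

Check the vendor's API reference for the exact event schema before relying on any of these keys.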

Pricing

| Plan | Price | Billing | Notes |
| --- | --- | --- | --- |
| Text input | $4.00 | per 1M tokens | gpt-realtime text input tokens |
| Text output | $16.00 | per 1M tokens | gpt-realtime text output tokens |
| Cached text input | $0.40 | per 1M tokens | Cached input discount |
| Audio input | $32.00 | per 1M tokens | gpt-realtime audio input tokens |
| Cached audio input | $0.40 | per 1M tokens | Cached audio input |
| Audio output | $64.00 | per 1M tokens | gpt-realtime audio output tokens |

Pricing verified from the official source. Prices change often — confirm on the vendor's site before buying.
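Because billing is per token rather than per minute, a small calculator can make the table easier to reason about. The sketch below multiplies token counts by the per-1M-token prices listed above. The token counts in the usage example are hypothetical and chosen only to show the effect of caching; real counts depend on how much audio is exchanged and how much conversation history is resent each turn. `estimate_cost` is a name introduced for this example.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "text_input": 4.00,
    "cached_text_input": 0.40,
    "text_output": 16.00,
    "audio_input": 32.00,
    "cached_audio_input": 0.40,
    "audio_output": 64.00,
}


def estimate_cost(tokens: dict[str, int]) -> float:
    """Return the estimated USD cost for a mapping of category -> token count."""
    total = 0.0
    for category, count in tokens.items():
        if category not in PRICE_PER_MILLION:
            raise KeyError(f"unknown pricing category: {category}")
        if count < 0:
            raise ValueError(f"token count must be non-negative: {category}")
        total += count * PRICE_PER_MILLION[category] / 1_000_000
    return total


# Hypothetical session: 10,000 audio input tokens and 5,000 audio output tokens.
uncached = estimate_cost({"audio_input": 10_000, "audio_output": 5_000})

# Same session, assuming 8,000 of the input tokens are served from cache.
cached = estimate_cost(
    {"audio_input": 2_000, "cached_audio_input": 8_000, "audio_output": 5_000}
)

print(f"uncached: ${uncached:.4f}")  # uncached: $0.6400
print(f"cached:   ${cached:.4f}")    # cached:   $0.3872
```

In this made-up session, caching only reduces the input side; the audio output charge is unchanged, which is why output-heavy conversations benefit less from it.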

Specifications

  • Model: gpt-realtime
  • Modality: speech-to-speech