OpenAI Launches GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper
OpenAI
OpenAI released three new realtime voice models on May 7. GPT-Realtime-2 is OpenAI's first voice model with GPT-5-class reasoning, and it offers a 128k-token context window and adjustable reasoning intensity levels. GPT-Realtime-Translate provides live speech translation from 70+ input languages into 13 output languages. GPT-Realtime-Whisper streams speech-to-text transcription in real time. All three are available via the OpenAI API and developer playground.
Why it matters
GPT-Realtime-2 is the first OpenAI voice model to bring GPT-5-class reasoning into the real-time audio pathway. Combined with live translation, it enables complex multi-turn voice agents and puts OpenAI in direct competition with ElevenLabs, Cartesia, and Deepgram on developer voice infrastructure.
Importance: 3/5
OpenAI is a frontier lab; GPT-5-class reasoning in real-time voice is a new capability tier for voice agent developers, and the release adds live speech translation from 70+ input languages.