OpenAI Launches GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper

OpenAI

Audio | official + media | 3 sources | ~1 min

OpenAI released three new realtime voice models on May 7. GPT-Realtime-2 is OpenAI's first voice model with GPT-5-class reasoning, a 128k-token context window, and adjustable reasoning intensity levels. GPT-Realtime-Translate provides live speech translation from 70+ input languages into 13 output languages. GPT-Realtime-Whisper streams speech-to-text transcription in real time. All three are available via the OpenAI API and developer playground.

Why it matters

GPT-Realtime-2 is the first OpenAI voice model to bring GPT-5-class reasoning into the real-time audio pathway. Together with live translation at scale, this enables complex multi-turn voice agents and puts OpenAI in direct competition with ElevenLabs, Cartesia, and Deepgram on developer voice infrastructure.

Importance: 3/5

OpenAI is a frontier lab, and GPT-5-class reasoning in real-time voice is a new capability tier for voice agent developers. The release also adds live speech translation from 70+ input languages.

Sources