OpenAI has 3 new AI voice models that the ChatGPT maker says will ‘unlock a new class of voice apps for developers’
- OpenAI has launched three new artificial intelligence (AI) models
- They’re for real-time voice tasks: reasoning, translation, and transcription
- Each one is designed to be integrated into developers’ AI apps
If you’re a regular ChatGPT user, you might be aware that you don’t have to interact with the artificial intelligence (AI) chatbot purely through text — it can speak to you and take your voice requests, too. Now, ChatGPT maker OpenAI has announced three new voice models that it believes will “unlock a new class of voice apps for developers.”
Each AI voice model is designed for a different purpose, including in-depth reasoning, translation, and transcription. If you’re looking for a voice model along those lines, they could be worth a shot.
According to OpenAI, the new models include the following:
- “GPT‑Realtime‑2, our first voice model with GPT‑5‑class reasoning that can handle harder requests and carry the conversation forward naturally.
- “GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
- “GPT‑Realtime‑Whisper, a new streaming speech-to-text that transcribes speech live as the speaker talks.”
OpenAI’s news post explains that the company has seen developers use AI voice models in three distinct ways: by asking the AI to carry out a task; by having the AI explain a situation (such as a travel delay) to the user; and by having conversations in the user’s local language.
It’s those use cases that OpenAI is trying to address with its new voice models. Each is designed for developers to use in their own apps, and all three are available as part of OpenAI’s Realtime API. GPT-Realtime-2 will cost $32 per one million input tokens and $64 per one million output tokens. GPT-Realtime-Translate is priced at $0.034 per minute, while GPT-Realtime-Whisper costs $0.017 per minute.
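Those prices make it straightforward to estimate what a session would cost. Here’s a minimal back-of-the-envelope sketch based only on the figures quoted above (the helper functions are illustrative, not part of OpenAI’s SDK):

```python
# Cost estimates from the quoted prices:
#   GPT-Realtime-2: $32 per 1M input tokens, $64 per 1M output tokens
#   GPT-Realtime-Translate: $0.034/minute; GPT-Realtime-Whisper: $0.017/minute

def realtime2_cost(input_tokens: int, output_tokens: int) -> float:
    """Token-based cost for a GPT-Realtime-2 session, in dollars."""
    return input_tokens / 1_000_000 * 32 + output_tokens / 1_000_000 * 64

def per_minute_cost(minutes: float, rate_per_min: float) -> float:
    """Cost for the per-minute models, in dollars."""
    return minutes * rate_per_min

# One hour of live translation at $0.034/min comes to about $2.04,
# and an hour of streaming transcription at $0.017/min about $1.02.
translate_hour = per_minute_cost(60, 0.034)
whisper_hour = per_minute_cost(60, 0.017)

# A reasoning session using 50k input and 10k output tokens:
session = realtime2_cost(50_000, 10_000)  # $1.60 + $0.64 = $2.24
```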

If you’re after an AI model that can reason deeply and adapt to conversation flows, OpenAI says the new GPT-Realtime-2 option is for you. Developers can use it to check multiple sources at once, adjust its tone to the user’s input, tap into more advanced reasoning levels, and parse specialized terms (such as proper nouns and expressions used in healthcare and production).
Translation apps, on the other hand, can put GPT-Realtime-Translate to use converting speech in real time. Users will be able to speak their own language and have it translated and transcribed without delay. This model works with over 70 input languages and 13 output languages.
And if you want audio to be transcribed quickly and accurately, there’s GPT-Realtime-Whisper. This model is useful for creating captions, meeting notes, and summaries as conversations are ongoing, OpenAI says, which means “live products can feel faster, more responsive, and more natural.”
If you want to try out any of the new models, they’re available on OpenAI’s Playground site. And if you’re using Codex, OpenAI has created a prompt that will add GPT-Realtime-2 directly to the agentic coding platform.