OpenAI introduces new audio tools in its API

OpenAI said Thursday that its API will now include a number of voice-intelligence tools designed to help developers build apps that can talk with users and transcribe and translate their conversations.

The company's new GPT-Realtime-2 is a voice model built to generate simulated speech for conversing with users. Unlike its predecessor, GPT-Realtime-1.5, it is built with GPT-5-class reasoning, which OpenAI says is designed to handle more complex requests from users.

The company is also launching GPT-Realtime-Translate, which, as the name suggests, is designed to provide real-time translation that keeps pace with the user during a conversation. The model supports more than 70 input languages (that is, languages it can understand) and 13 output languages (languages it can produce for the listener).

Finally, the company has also introduced a new transcription model, GPT-Realtime-Whisper, which provides live speech-to-text transcription as an interaction takes place.

Together, the company said, the new models move real-time audio from simple call-and-response toward voice interfaces that can do real work: listening, reasoning, translating, transcribing, and taking action as a conversation unfolds.

Who stands to benefit from these changes? Companies looking to improve customer experience are an obvious target. However, OpenAI also claims that the new models will support a variety of sectors, including education, media, events, and creative platforms, among others.

While these tools seem useful for businesses, it also seems clear that they could be abused. The company said it has built in safeguards to prevent its models from being used for spam, fraud, or other online abuse. Certain triggers have been built into the system so that “conversations can be suspended if they are found to be violating our privacy guidelines,” OpenAI said.

All of the new voice models are available through OpenAI’s Realtime API. Translate and Whisper are billed by the minute, while GPT-Realtime-2 is billed by token usage.
