OpenAI Creates a Voice Cloning AI Tool, Not Yet Available to the Public

IBL News | New York

OpenAI shared a preview of a model called Voice Engine, which allows users to upload a 15-second voice sample to generate a synthetic copy.

There is no date for public availability yet, as OpenAI says “it is taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse.”

“Any broad deployment of synthetic voice technology should be accompanied by voice authentication experiences that verify that the original speaker is knowingly adding their voice to the service and a no-go voice list that detects and prevents the creation of voices that are too similar to prominent figures,” stated the company.

Under development for about two years, the tool works as an extension of the company’s existing text-to-speech API.

OpenAI has been testing the tool with a small group of partners to explore how it can be used for good across various industries.

The San Francisco-based research lab shared a few early examples, including providing real-time, personalized responses and offering reading assistance to non-readers and children through natural-sounding voices.

Another use is helping patients who suffer from sudden or degenerative speech conditions to recover their voice. The Norman Prince Neurosciences Institute at Lifespan has been exploring the use of AI in clinical contexts.

One early adopter is HeyGen, which uses Voice Engine to translate a video speaker’s voice into multiple languages while preserving the native accent of the original speaker.