
2024 Speech Industry Award Winner: ElevenLabs Is Dubbed a Leader in Automatic Speech Translation


Piotr Dabkowski and Mateusz Staniszewski were raised in Poland and established ElevenLabs in 2022 out of frustration with the lousy dubbing of the American films they watched.

It’s not surprising, then, that ElevenLabs’ first product was AI Dubbing, which it released in October 2023. AI Dubbing enables users to automatically translate any speech into up to 20 other languages while maintaining the original speakers’ voices, speech patterns, emotions, and intonations. It simultaneously handles other tasks, like noise removal, speaker differentiation, transcription, and synchronization of translated speech with the original audio.

AI Dubbing combines the company’s research on multilingual speech synthesis, voice cloning, and text and audio processing into a single tool supported by the Eleven Multilingual v2 model that the company released around the same time.

AI Dubbing “will help audiences enjoy any content they want, regardless of the language they speak. And it will mean content creators can easily and authentically access a far bigger audience across the world,” Staniszewski said in a statement.

The industry took notice. After an initial funding round, ElevenLabs earlier this year raised $80 million and elevated its total valuation to $1.1 billion. With the additional funding, the company developed a series of new products.

Voice Library, for example, is ElevenLabs’ feature for sharing unique voice profiles created using its Voice Design technology. These predesigned voice profiles allow users to select the voice that best suits their needs rather than creating one from scratch.

Another tool, VoiceLab, allows users to clone voices from just a few short snippets of audio or create entirely new synthetic voices.

Projects, a tool for creating long-form spoken content, such as audiobooks and dialogue segments, with contextually aware synthetic or custom voices, was released in September 2023.

And just this past June, ElevenLabs released the ElevenLabs Reader App on iOS and Android, allowing users to listen to articles, PDFs, and ePubs with AI Voices on their phones. A month later, ElevenLabs released Voice Isolator, which removes background noise from audio.

In August, ElevenLabs expanded its voice generation capabilities to 28 languages. Using an in-house AI model, it automatically detects languages like Korean, Dutch, and Vietnamese, allowing for emotionally rich multilingual speech generation.

Also newsworthy was the number of other vendors that partnered with ElevenLabs to include its speech technologies in their products. Among them were Talkpush’s Sam, a generative AI-powered voice interviewer; Skyrocket’s Plai toy line; Adthos, which added ElevenLabs’ voices in 200 languages and dialects to its audio platform; UneeQ, which integrated ElevenLabs’ AI voices into its digital human animation platform, Synanim; Pocket FM, which teamed up with ElevenLabs to launch AI Audio Series; BlipCut, whose AI Video Translator offers ElevenLabs’ voice cloning, translation, subtitling, dubbing, and voice-changing capabilities; and SensiML, whose Data Studio added a text-to-speech synthetic dataset generation feature from ElevenLabs.

But for all the positive sentiment around ElevenLabs, the company, like so many other speech technology vendors, couldn’t escape criticism after users abused its software to generate controversial statements in the vocal style of celebrities, public officials, and other famous individuals. In January, ElevenLabs admitted that its platform had been misused for voice cloning and toughened its safeguards against malicious use of its technology.

In response, the company limited access to its voice cloning feature to paid subscribers and implemented bans on users who repeatedly violate the terms of service. It even partnered with Reality Defender, a cybersecurity company specializing in deepfake detection, to advance the development of audio deepfake detection models.

The partnership leverages ElevenLabs’ models and methods to improve Reality Defender’s detection tools.

“We are thrilled to partner with ElevenLabs, a pioneer in audio AI tools,” said Ben Colman, cofounder and CEO of Reality Defender, in a statement. “Their commitment to AI safety and responsible AI development aligns with our mission to detect and prevent deepfakes. This collaboration, a first for Reality Defender, will allow us to provide highly advanced audio detection that stays several steps ahead of bad actors and fraudsters.”
