Mozilla Announces Open Source Common Voice Speech Recognition Datasets

Mozilla has announced an expansion of its crowdsourced Common Voice project. The project, which is about a year old, is building an open source speech recognition dataset, and it is now opening up to more languages. Mozilla is asking volunteers around the world to record themselves reading short snippets of text through a web or mobile app.

According to VentureBeat, “Mozilla launched the first fruits of its Common Voice datasets in English back in November, a collection that contained some 500 hours of speech and constituted 400,000 recordings from 20,000 individuals. Today, Mozilla officially kick starts the process of collecting voice data for three more languages — French, German, and — a little randomly — Welsh. Another 40 tongues are currently being prepped for the data collection process, with the likes of Brazilian Portuguese, Chinese (Taiwan), Indonesian, Polish, and Dutch already halfway toward being ready to start crowdsourcing voice data.”
