Help to build common voice datasets with Mozilla

Share me please

Growing market of smart phone vs voice recognition

The development of mobile devices and in particular smart phones started to grow totally new dynamic interactions between device and human like voice interactions. A technology called voice recognition is an inseparable part of these interactions.

Huge market of devices grows rapidaly and in 2019 was worth over USD 719 billion. The estimates for 2025 tell us that number of the smart phones will reach over USD 1351 billion (according to Mordor Intelligence report).

This means that the impact of all activities around voice recognition and making interactions with not only smart phones but also with other devices will also grow.

Services like Google Translate and companies like Google, Amazon, Apple have their own solutions for voice recognition. For all of them we need to pay and use only directly in smart phones. Other possibility is to connect remotely to their cloud services ( Google Cloud, Azure Cloud, Amazon Cloud).

There are some disadvantages of available solutions like:

we cannot use speech recognition when there is not Internet access available
we need to pay for every set of requests sent to cloud servers to get response from sound to text
delay between our device and cloud servers you need to include in any solution using voice recognition
big companies has voice recognition in very good quality that is dangerous from the perspective of all other smaller companies
we cannot be sure that recorded voice samples during business meetings will not be used against the people and companies

Mozilla common voice initiative

Looking into all this problems and having in mind that nearly the whole market is controlled by some companies Mozilla Foundation started in 2018 an initiative to build common voice community.

The role of the community is to build huge datasets of sentences from almost all languages in the world. This digital library of recorded sounds with their known translations is available for free for everybody and like you can read from official voice.mozilla.org website:

“Common Voice is Mozilla’s initiative to help teach machines how real people speak.”

How you can help to build languages datasets

Using voice mozilla web portal everybody can help to build huge fully open source datasets of recorded voice samples. You can start without any special knowledge about this.

The procedures is very well described by Mozilla on the website:

go to voice.mozilla.org
click on bookmark Languages
Find your language and click Contibute
Click the microphone icon and start recording the sentence that will be displayed

For the moment of publishing the article in voice.mozilla.org there are registered nearly 50 languages. The datasets grow day by day and your action is required to help building open community and better world!!!

How to download voice datasets

Downloading voice datasets from voice.mozilla.org is probably even easier than recording voice. You need to go to Languages link in the main menu.

In the right side you can select langauge from the list and then scroll page down and click the link Enter email to download.

When you type email you should receive within some minutes link to the voice language dataset that you are interested in.

Mozilla voice team updates all languages datasets every half of the year: at the beggining and in the mid of year. Here is the official correspondence:

I can only say that mozilla team is very responsive and they will help you get answer for every questions you want to ask them.

Building voice sentences with sentence collector

Someone can ask that everything is clear, I can record voice sentences but how to help build sentences database and what are the general rules?

This side of the whole system is also well organize by Mozilla foundation. They create a tool that is name Common Voice Sentence Collector.

Registration to the tool is based on simple form with required login name, email and password. After that you can start creating your own sentences. Technically you can:

add new sentenctes
review existing ones (approve if they are correct to the rules)
reject sentences

The general rule during playing with sentences is to accept only these sentences that use Public Domain based on creative common (CC-0) license. The kind of license gives us to freely modify, change and copy texts from existing songs and books without any additional restrictions and payment.

The list of all rules that you need to respect when you add new sentences and when you review existing ones you can find here: https://github.com/Common-Voice/sentence-collector/blob/master/doc/how-to.md.

3 Comments

AffiliateLabz 15th February 2020 Reply

Great content! Super high-quality! Keep it up! 🙂
Charlena Wiste 10th March 2020 Reply

Hello.This article was extremely interesting, especially since I was looking for thoughts on this topic last Monday.
Alexandria Hibdon 2nd April 2020 Reply

Hiya, I’m really glad I’ve found this information. Nowadays bloggers publish only about gossip and net stuff and this is actually frustrating. A good site with interesting content, that’s what I need. Thanks for making this web-site, and I will be visiting again. Do you do newsletters by email?