Growing market of smart phone vs voice recognition
The development of mobile devices and in particular smart phones started to grow totally new dynamic interactions between device and human like voice interactions. A technology called voice recognition is an inseparable part of these interactions.
Huge market of devices grows rapidaly and in 2019 was worth over USD 719 billion. The estimates for 2025 tell us that number of the smart phones will reach over USD 1351 billion (according to Mordor Intelligence report).
This means that the impact of all activities around voice recognition and making interactions with not only smart phones but also with other devices will also grow.
Services like Google Translate and companies like Google, Amazon, Apple have their own solutions for voice recognition. For all of them we need to pay and use only directly in smart phones. Other possibility is to connect remotely to their cloud services ( Google Cloud, Azure Cloud, Amazon Cloud).
There are some disadvantages of available solutions like:
- we cannot use speech recognition when there is not Internet access available
- we need to pay for every set of requests sent to cloud servers to get response from sound to text
- delay between our device and cloud servers you need to include in any solution using voice recognition
- big companies has voice recognition in very good quality that is dangerous from the perspective of all other smaller companies
- we cannot be sure that recorded voice samples during business meetings will not be used against the people and companies
Mozilla common voice initiative
Looking into all this problems and having in mind that nearly the whole market is controlled by some companies Mozilla Foundation started in 2018 an initiative to build common voice community.
The role of the community is to build huge datasets of sentences from almost all languages in the world. This digital library of recorded sounds with their known translations is available for free for everybody and like you can read from official voice.mozilla.org website:
“Common Voice is Mozilla’s initiative to help teach machines how real people speak.”
How you can help to build languages datasets
Using voice mozilla web portal everybody can help to build huge fully open source datasets of recorded voice samples. You can start without any special knowledge about this.
The procedures is very well described by Mozilla on the website:
- go to voice.mozilla.org
- click on bookmark Languages
- Find your language and click Contibute
- Click the microphone icon and start recording the sentence that will be displayed
For the moment of publishing the article in voice.mozilla.org there are registered nearly 50 languages. The datasets grow day by day and your action is required to help building open community and better world!!!
How to download voice datasets
Downloading voice datasets from voice.mozilla.org is probably even easier than recording voice. You need to go to Languages link in the main menu.
In the right side you can select langauge from the list and then scroll page down and click the link Enter email to download.
When you type email you should receive within some minutes link to the voice language dataset that you are interested in.
Mozilla voice team updates all languages datasets every half of the year: at the beggining and in the mid of year. Here is the official correspondence:
I can only say that mozilla team is very responsive and they will help you get answer for every questions you want to ask them.
Building voice sentences with sentence collector
Someone can ask that everything is clear, I can record voice sentences but how to help build sentences database and what are the general rules?
This side of the whole system is also well organize by Mozilla foundation. They create a tool that is name Common Voice Sentence Collector.
Registration to the tool is based on simple form with required login name, email and password. After that you can start creating your own sentences. Technically you can:
- add new sentenctes
- review existing ones (approve if they are correct to the rules)
- reject sentences
The general rule during playing with sentences is to accept only these sentences that use Public Domain based on creative common (CC-0) license. The kind of license gives us to freely modify, change and copy texts from existing songs and books without any additional restrictions and payment.
The list of all rules that you need to respect when you add new sentences and when you review existing ones you can find here: https://github.com/Common-Voice/sentence-collector/blob/master/doc/how-to.md.