Pocketsphinx as standalone app on Android wearables

March 3rd, 2017

With the launch of Android Wear 2.0, it is now possible to run standalone apps on wearables, independent of a phone.

[Image: Wearable Screen]

The PocketSphinx demo app for Android listens continuously for the keyphrase “oh mighty computer”; once that keyphrase has been recognized, it switches to grammar mode to let you input some information. Now, thanks to a contribution from Mathias Lenz, the demo app has been extended with a module that makes it available as a standalone app on wearables running Android Wear 2.0.
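The two-stage flow described above (wait for the keyphrase, then accept input from a grammar) can be pictured as a small state machine. The sketch below is only an illustration in plain Python and does not use the PocketSphinx API. The class name, the grammar phrases, and the choice to return to keyphrase listening after one accepted phrase are all assumptions made for the example.

```python
from dataclasses import dataclass, field

KEYPHRASE = "oh mighty computer"            # keyphrase used by the demo app
GRAMMAR = {"digits", "weather forecast"}    # hypothetical grammar phrases


@dataclass
class ModeSwitcher:
    """Toy model of the flow: keyphrase spotting first, then grammar mode."""
    mode: str = "keyphrase"
    accepted: list = field(default_factory=list)

    def on_hypothesis(self, text: str) -> str:
        """Feed one recognized phrase and return the mode that is active next."""
        text = text.strip().lower()
        if self.mode == "keyphrase":
            # Everything except the keyphrase is ignored in this mode.
            if text == KEYPHRASE:
                self.mode = "grammar"
        elif text in GRAMMAR:
            # Assumption: after one valid phrase, go back to keyphrase listening.
            self.accepted.append(text)
            self.mode = "keyphrase"
        return self.mode


switcher = ModeSwitcher()
for heard in ["hello", "oh mighty computer", "banana", "digits"]:
    print(f"{heard!r} -> {switcher.on_hypothesis(heard)}")
```

In a real app the phrases would come from the recognizer's callbacks rather than from a list of strings; the point here is only that the active search decides which phrases count.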

To explore, test and run the source code, download the project from our GitHub, import it into Android Studio, and run it in an emulator, or build an APK and deploy it onto a wearable device running Android Wear 2.0. Let us know how it works.

CMUSphinx at GSOC 2017

February 28th, 2017

After a break of several years, we are pleased to announce that the CMUSphinx project has been accepted into the Google Summer of Code 2017 program. This will let us help several students get started in speech recognition, open source development and CMUSphinx. We are really excited about that. See the organization page for details.

This year we will focus more on pronunciation evaluation; two major pronunciation tasks will be our preference. If you are interested in participating as a student, the application period will open soon, but it is better to start preparing your application right now. Feel free to contact us with any questions, but please do your own searching before asking very simple things! For more details see:

http://cmusphinx.sourceforge.net/wiki/summerofcodestudents

If you would like to be a mentor, please sign in to the GSoC web application and add your ideas to the ideas list:

http://cmusphinx.sourceforge.net/wiki/summerofcodeideas

We invite you to participate!

New 27k words 70h German model released

September 5th, 2016

Guenter Bartsch writes to us:

The latest release of my audio models built from voxforge submissions is up to 70 hours of audio and 27k dictionary entries, available for download here.

This release includes:

  • A CMU Sphinx audio model
  • Several Kaldi models (still very experimental)
  • A Sequitur g2p model
  • Language models created using cmuclmtk and srilm

For the first time, the audio models include small portions of openpento and german-speechdata-package-v2.tar.gz. Reviewing and transcribing those is quite laborious, so it will take some time until they are fully reviewed and integrated into the models. Also note that this model includes more distant-microphone recordings than older releases, which means the word error rate has increased accordingly.

It is amazing that more and more languages are getting accurate speech recognition support in CMUSphinx. While you might think a single project could support a wide variety of languages, in practice it is very hard to build a good training database without a local person, simply because you do not know where to find audio for training. A local person is also needed to evaluate recognition results. For example, Spanish has half a billion speakers around the world, yet we still have no good resources to train Spanish models.

So we encourage you once again to build models for your own language, to collect transcribed speech, and to contribute to Voxforge. Only a joint effort will enable really good coverage of languages in speech recognition.