The CMU Sphinx Group Open Source Speech Recognition Engines

Welcome to the CMU Sphinx project page!

The Sphinx Group at Carnegie Mellon University is committed to releasing the long-running, DARPA-funded Sphinx projects widely, in order to stimulate the creation of speech-using tools and applications, and to advance the state of the art both in speech recognition itself and in related areas such as dialog systems and speech synthesis.

The Sphinx Group has been supported for many years by funding from the Defense Advanced Research Projects Agency, and the recognition engines to be released are those that the group used for the various DARPA projects and their respective evaluations.

Recent supporters of the project also include Telefónica I & D, Sun Microsystems, and Mitsubishi Electric Research Labs.

The licensing terms for the Sphinx engines and tools are derived from BSD, and based, in particular, upon the license for the Apache web server. There is no restriction against commercial use or redistribution. (License terms for CMU Sphinx)

The packages that the CMU Sphinx Group is releasing are a set of reasonably mature, world-class speech components. They provide a basic level of technology to anyone interested in creating speech-using applications, without the once-prohibitive initial investment in research and development. The same components are open to peer review by all researchers in the field and are also used for linguistic research.

Note, however, that Sphinx is not a final product. Those with a certain level of expertise can achieve great results with the versions of Sphinx available here, but an inexperienced user will almost certainly need further help. In other words, the software available here is meant for expert users, not for users with no experience in speech.

This site will be the canonical location for the release of the Sphinx trainers, recognizers, acoustic and language models, and documentation.

Try a System

If you'd like to try out an application that uses CMU Sphinx, try one of these.

  • Roomline, a system that handles conference room reservations within CMU. You can reach it at the toll-free number 1-877-CMU-PLAN (1-877-268-7526) or at +1 412 268 1084.

    Note that your call will be recorded for development purposes and may be shared with other researchers. We don't yet have a policy for placing such recordings into a publicly available database, so there is no guarantee that this data will become publicly available, though we're motivated to set that up in the future.

  • Let's Go!, a spoken dialog system for the general public. The Let's Go! project works in the domain of bus information, providing schedules and route information for the city of Pittsburgh's Port Authority Transit (PAT) buses. You can interact with a version of this system right now by calling 412-268-3526 (requires some knowledge of Pittsburgh's transit system).

Bug Tracking and Discussion Groups

The SourceForge site also hosts fora for bug tracking and discussion. Please go there for help, to ask questions, to report bugs, and to see the latest work. The work is currently pre-version 1.0, so there is a lot yet to be done.

There is also an IRC channel (#cmusphinx on irc.freenode.net) for real-time discussion.

Platforms

  • GNU/Linux, Unix variants, and Windows NT or later

This page is maintained by David Huggins-Daines.
CMUSphinx is a project within the Sphinx Group at Carnegie Mellon