Tasks for Summer Of Code Projects

This is a preliminary list of tasks for the Summer of Code project. If you have another idea, feel free to add it. For any questions, contact us on the cmusphinx-devel@lists.sourceforge.net mailing list or on the #cmusphinx IRC channel on freenode. See also Information for Students.

Implement CMUSphinx backend in Simon

Task

Simon (http://simon-listens.org) is a dialog management system for the Linux desktop. It is an open-source speech recognition program that can replace the mouse and keyboard. It is designed to be very flexible and allows customization for any application where speech recognition is needed. It currently runs on HTK, but its flexible architecture makes it possible to plug in other engines. The task is to implement a Simon backend that recognizes speech with pocketsphinx.

Complexity

Easy

Mentor

Peter Grasch

Skills

C++, Qt, KDE

Note

This was accepted for GSoC 2012 by the KDE organization.

L2S Rules in sphinx4

Task

Currently sphinx4 can only work with a predefined dictionary. It is possible to build a phonetic dictionary automatically, but this requires applying machine learning for training, developing a decoder module, and testing. Various language modules need to be trained as well. This project will implement letter-to-sound rules with OpenFST in sphinx4.
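The difference between a predefined dictionary and letter-to-sound rules can be illustrated with a small sketch. This is only a toy under stated assumptions: the rule table, the phone symbols and the function names are hypothetical, the rules are hand-written rather than trained, and the greedy longest-match lookup stands in for the OpenFST-based decoder the task actually asks for.

```python
# Toy letter-to-sound fallback. RULES and its phone symbols are hypothetical
# and hand-written for illustration; a real system would learn the
# grapheme-to-phoneme mapping from a pronunciation dictionary.
RULES = {
    "sh": ["SH"], "ch": ["CH"], "th": ["TH"], "ee": ["IY"],
    "a": ["AE"], "e": ["EH"], "i": ["IH"], "o": ["AA"],
    "p": ["P"], "t": ["T"], "s": ["S"], "h": ["HH"], "c": ["K"], "n": ["N"],
}
MAX_LEN = max(len(grapheme) for grapheme in RULES)


def letters_to_phones(word):
    """Greedy longest-match conversion of a spelling into phones."""
    word = word.lower()
    phones = []
    i = 0
    while i < len(word):
        # Try the longest grapheme first, then shorter ones.
        for n in range(min(MAX_LEN, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if chunk in RULES:
                phones.extend(RULES[chunk])
                i += n
                break
        else:
            i += 1  # no rule for this letter: skip it
    return phones


def pronounce(word, dictionary):
    """Use the predefined dictionary when possible, else fall back to rules."""
    entry = dictionary.get(word.lower())
    return entry if entry is not None else letters_to_phones(word)
```

With such a fallback, a word missing from the dictionary still gets a pronunciation, although its quality depends entirely on the rules.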

Reading

Sittichai Jiampojamarn, Colin Cherry and Grzegorz Kondrak. “Joint Processing and Discriminative Training for Letter-to-Phoneme Conversion”. In Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-08: HLT), Columbus, OH, June 2008, pp. 905-913.

http://code.google.com/p/directl-p/

M. Bisani and H. Ney. “Joint-Sequence Models for Grapheme-to-Phoneme Conversion”. Speech Communication, Volume 50, Issue 5, May 2008, Pages 434-451

http://www-i6.informatik.rwth-aachen.de/web/Software/g2p.html

Complexity

Medium

Mentor

Nickolay V. Shmyrev

Skills

Java, Python, C++, Machine learning

Note

Accepted for GSoC 2012

Pronunciation Evaluation

Task

Implement a simple reading and pronunciation learning system.

More Information

http://cmusphinx.sourceforge.net/wiki/faq#qhow_to_implement_pronunciation_evaluation

Complexity

Easy

Mentor

James Salsman (jpsalsman@talknicer.com)

Skills

C, Perl, JavaScript, ActionScript/Flex 4, statistics

Note

Accepted for GSoC 2012: Srikanth Ronanki and Troy Lee

* Project blog

Semantic language model

Current language models are very basic: they do not really understand what is being transcribed, which hurts the error rate. Create a decoder over the lattices that selects the semantically correct path and produces a readable result.
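The idea of preferring a semantically coherent path can be sketched as follows. To stay short, the sketch rescores an N-best list instead of decoding over a full lattice, and it uses a hypothetical table of word co-occurrence counts as a stand-in for a real semantic model; all names, counts and scores are assumptions made for illustration.

```python
import math

# Hypothetical co-occurrence counts for unordered word pairs. A real system
# would estimate word relationships from a large text corpus.
COOCCUR = {
    frozenset(("recognize", "speech")): 50,
    frozenset(("wreck", "beach")): 1,
}


def coherence(words):
    """Sum of log(1 + count) over all unordered word pairs in a hypothesis."""
    score = 0.0
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            pair = frozenset((words[i], words[j]))
            score += math.log1p(COOCCUR.get(pair, 0))
    return score


def rescore(nbest, weight=1.0):
    """Pick the hypothesis maximizing decoder score + weight * coherence.

    nbest is a list of (words, decoder_log_score) tuples.
    """
    return max(nbest, key=lambda hyp: hyp[1] + weight * coherence(hyp[0]))[0]


hypotheses = [
    (["wreck", "a", "nice", "beach"], -10.0),
    (["recognize", "speech"], -10.5),
]
best = rescore(hypotheses)
```

With weight set to 0 the decoder score alone decides; a positive weight lets strongly related words outvote a small difference in decoder score.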

Reading

Guihong Cao, Jian-Yun Nie and Jing Bai. “Integrating Word Relationships into Language Models”.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.94.7443&rep=rep1&type=pdf

Mentor

Bhiksha Raj

Complexity

Hard

Skills

Java, C, Machine learning

Note

Accepted for GSoC 2012

Postprocessing framework

Create a language-independent postprocessing framework that turns ASR results into readable text with punctuation, abbreviations and capitalization.
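A minimal sketch of what such a postprocessing step does is given below. It assumes that sentence boundaries have already been predicted by some other component, and it keeps everything language-specific in small lookup tables. The tables, the function name and the boundary format are hypothetical choices for this example.

```python
# Hypothetical per-language tables; a framework would load one set per language.
REPLACEMENTS = {"doctor": "Dr.", "mister": "Mr."}
PROPER_NOUNS = {"smith"}
SENTENCE_END = {".", "?", "!"}


def postprocess(tokens, boundaries):
    """Turn lowercase ASR tokens into readable text.

    boundaries maps a token index to the punctuation mark that follows that
    token; it is assumed to come from a separate punctuation model.
    """
    out = []
    sentence_start = True
    for i, token in enumerate(tokens):
        word = REPLACEMENTS.get(token, token)
        if sentence_start or token in PROPER_NOUNS:
            word = word[0].upper() + word[1:]
        punct = boundaries.get(i, "")
        out.append(word + punct)
        sentence_start = punct in SENTENCE_END
    return " ".join(out)


text = postprocess("good morning doctor smith how are you".split(),
                   {3: ".", 6: "?"})
```

Here text becomes "Good morning Dr. Smith. How are you?". Supporting another language in this sketch would only mean swapping the tables.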

Reading

Matthias Paulik, Sharath Rao, Ian Lane, Stephan Vogel and Tanja Schultz. “Sentence Segmentation and Punctuation Recovery for Spoken Language Translation”.

http://www.makapa.de/Paulik_Sent_ICASSP08.pdf

Mentor

Elias Majic

Complexity

Hard

Skills

Java, Machine learning

Note

Accepted for GSoC 2012

Web Data Collection For Language Modeling

Write a crawler that collects text data on a given topic for language model training.
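A topic-focused crawler can be sketched with the Python standard library alone. The sketch below is not a complete solution: the fetch function is injected by the caller (so the example runs without network access), the topic test is a plain keyword match, and the page contents and URLs are made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageParser(HTMLParser):
    """Collects visible text chunks and link targets from one HTML page.

    Simplification: text inside <script> or <style> is not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(seed, fetch, keywords, max_pages=10):
    """Breadth-first crawl keeping the text of pages that mention a keyword.

    fetch(url) must return the page HTML as a string, or None on failure.
    """
    seen, queue, corpus = {seed}, deque([seed]), []
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        fetched += 1
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        text = " ".join(parser.text)
        if set(text.lower().split()) & keywords:
            corpus.append(text)
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return corpus


# Made-up pages standing in for the web.
PAGES = {
    "http://example.org/": '<p>Speech news</p><a href="/a">x</a><a href="/b">y</a>',
    "http://example.org/a": "<p>Cooking recipes</p>",
    "http://example.org/b": "<p>Acoustic models for speech</p>",
}
corpus = crawl("http://example.org/", PAGES.get, {"speech"})
```

In a real crawler, fetch would download pages over HTTP and respect robots.txt, and the keyword match would likely be replaced by a better relevance measure.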

Reading

Web augmentation of language models for continuous speech recognition of SMS text messages

http://www.aclweb.org/anthology/E/E09/E09-1019.pdf

Mentor

Tony Robinson

Complexity

Hard

Skills

Java, Web technologies

Note

Accepted for GSoC 2012

Implement Ephraim-Malah and Kalman noise cancellation filters

Implement Ephraim-Malah and Kalman noise cancellation filters in pocketsphinx or sphinx4.
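To show the kind of recursion involved, here is a sketch of a scalar Kalman filter applied to a noisy signal. It covers only the generic predict/update step under simple assumptions (first-order state model with coefficient a, process noise variance q, observation noise variance r, all chosen arbitrarily here). It is not the Ephraim-Malah rule, and a speech enhancement filter would need a proper speech model and per-frame noise estimates.

```python
import random


def kalman_1d(observations, a=1.0, q=1e-4, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w, y[k] = x[k] + v.

    q is the variance of the process noise w, r the variance of the
    observation noise v. Returns the filtered estimate for every sample.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict the next state and its error variance.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update with the new observation.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates


# Example: a constant signal corrupted by Gaussian noise (variance 0.25).
rng = random.Random(0)
clean = [1.0] * 200
noisy = [c + rng.gauss(0.0, 0.5) for c in clean]
filtered = kalman_1d(noisy)
```

The filtered sequence should follow the clean signal much more closely than the noisy observations do, because the gain shrinks as the estimate becomes more certain.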

Reading

R. Gemello, F. Mana and R. De Mori. “A modified Ephraim-Malah noise suppression rule for automatic speech recognition”.

Mentor

Nickolay V. Shmyrev

Complexity

Easy

Skills

C or Java, Signal processing.

 
summerofcodeideas.txt · Last modified: 2012/04/25 05:38 by jpsalsman
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported