The CMU Sphinx toolkit is actively used in speech recognition research. Some notable publications are listed below:
Ziad Al Bawab, An Analysis-by-Synthesis Approach to Vocal Tract Modeling for Robust Speech Recognition, Ph.D. Thesis, ECE Department, CMU, September, 2009.
L. Buera, A. Miguel, A. Ortega, E. Lleida, and R. Stern, “Unsupervised training scheme with non-stereo data for empirical feature vector compensation,” Interspeech 2009, September 2009, Brighton, United Kingdom.
Z. Al Bawab, B. Raj, and R. M. Stern, “Analysis-by-synthesis features for speech recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing, April 2008, Las Vegas, Nevada.
K. Kumar, T. Chen, and R. M. Stern, “Profile view lip reading,” IEEE International Conference on Acoustics, Speech, and Signal Processing, April 2007, Honolulu, Hawaii.
R. M. Stern, DeL. Wang, and G. Brown, “Binaural sound localization,” Chapter in Computational Auditory Scene Analysis, G. Brown and DeL. Wang, Eds., Wiley/IEEE Press, 2006.
N.S. Kim, W. Lim, and R. M. Stern, “Feature compensation based on switching linear dynamic model,” IEEE Signal Processing Letters, 12 (6): 473-476, June, 2005.
B. Raj and R. Singh, “Classifier-Based Non-Linear Projection for Adaptive Endpointing of Continuous Speech,” Computer Speech and Language 17(1):5-26, January 2003.
R. Singh, R. M. Stern, and B. Raj, “Signal and Feature Compensation Methods for Robust Speech Recognition,” Chapter in CRC Handbook on Noise Reduction in Speech Applications, Gillian Davis, Ed. CRC Press, 2002.
R. Singh, B. Raj, and R. M. Stern, “Model Compensation and Matched Condition Methods for Robust Speech Recognition,” Chapter in CRC Handbook on Noise Reduction in Speech Applications, Gillian Davis, Ed. CRC Press, 2002.
M. L. Seltzer, B. Raj, and R. M. Stern, “Speech Recognizer-Based Microphone Array Processing for Robust Hands-Free Speech Recognition,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., May, 2002, Orlando, Florida.
R. Singh, M. L. Seltzer, B. Raj, and R. M. Stern, “Speech in Noisy Environments: Robust Automatic Segmentation, Feature Extraction, and Hypothesis Combination,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., May, 2001, Salt Lake City, Utah.
D. P. W. Ellis, R. Singh, and S. Sivadas, “Tandem Acoustic Modeling in Large-Vocabulary Recognition,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., May, 2001, Salt Lake City, Utah.
R. Singh, B. Raj, and R. M. Stern, “Automatic Generation of Phone Sets and Lexical Transcriptions,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., June, 2000, Istanbul, Turkey.
R. M. Stern, A. Acero, F.-H. Liu, and Y. Ohshima, “Signal Processing for Robust Speech Recognition,” Chapter in Speech Recognition, pp. 351-378, C.-H. Lee and F. Soong, Eds., Boston: Kluwer Academic Publishers, 1996.
E. B. Gouvea, P. J. Moreno, B. Raj, T. M. Sullivan, and R. M. Stern, “Adaptation and Compensation: Approaches To Microphone And Speaker Independence In Automatic Speech Recognition,” Proceedings of the ARPA Workshop on Speech Recognition Technology, Harriman, NY, Morgan Kaufmann, D. Pallett, Ed.
U. Jain, M. A. Siegler, S.-J. Doh, E. Gouvea, P. J. Moreno, B. Raj, and R. M. Stern, “Recognition Of Continuous Broadcast News With Multiple Unknown Speakers And Environments,” Proceedings of the ARPA Workshop on Speech Recognition Technology, Harriman, NY, Morgan Kaufmann, D. Pallett, Ed.
P. J. Moreno, B. Raj, and R. M. Stern, “Approaches to Environment Compensation in Automatic Speech Recognition,” Proc. 15th International Conference on Acoustics, Trondheim, Norway, Vol. III, pp. 109-112, June, 1995.
F.-H. Liu, P. J. Moreno, R. M. Stern, and A. Acero, “Signal Processing For Robust Speech Recognition,” Proceedings of the Seventh ARPA Workshop on Human Language Technology, Princeton, New Jersey, Morgan Kaufmann, C. J. Weinstein, Ed.
F.-H. Liu, P. J. Moreno, R. M. Stern, and A. Acero, “Signal Processing For Robust Speech Recognition,” Proceedings of the ARPA Workshop on Spoken Language Technology, Princeton, New Jersey, March, 1994, R. M. Stern, Ed.
F.-H. Liu, R. M. Stern, X. Huang, and A. Acero, “Efficient Cepstral Normalization For Robust Speech Recognition,” Proc. of the Sixth ARPA Workshop on Human Language Technology, Princeton, NJ, Morgan Kaufmann, March, 1993.
R. M. Stern, F.-H. Liu, Y. Ohshima, T. M. Sullivan, and A. Acero, “Multiple Approaches to Robust Speech Recognition,” Proc. of the Fifth DARPA Speech and Natural Language Workshop, Harriman, New York, February, 1992.
A. Acero and R. M. Stern, “Toward Microphone-Independent Spoken Language Systems,” Proceedings of the DARPA Speech and Natural Language Workshop, Hidden Valley, PA, R. M. Stern, Ed., Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1990.
Original description of extended maximum a posteriori probability (EMAP) speaker adaptation:
M. J. Lasry and R. M. Stern, “A Posteriori Estimation of Correlated Jointly Gaussian Mean Vectors,” IEEE Trans. on Pattern Anal. and Mach. Intel. 6: 530-535, 1984.
M. J. Lasry and R. M. Stern, “Unsupervised Adaptation to New Speakers in Feature-Based Letter Recognition,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., San Diego, California, May, 1984.
R. M. Stern and M. J. Lasry, “Dynamic Speaker Adaptation for Isolated Letter Recognition Using MAP Estimation,” Proc. IEEE Conf. on Acoustics, Speech, and Sig. Proc., Boston, Massachusetts, May, 1983.