JSALT 2015 -- Week 5 Informal Seminar
with Vikramjit Mitra

Monday, Aug. 3:

Improving Robustness of Speech Applications

Vikramjit Mitra

Automated speech applications (ASA) such as speech recognition, language/speaker identification, and keyword spotting are crucial to automated speech analysis techniques for data analytics, voice-operated systems, biometrics, and information triage. Current ASA systems perform reliably under controlled/clean acoustic conditions but degrade rapidly with even minor environmental changes. For real-world applications, ASA systems must perform reliably under the greatly varying environmental conditions that are part of everyday life. In particular, systems continue to be highly sensitive to acoustic variations due to noise, reverberation, different receiving/transmitting devices, multiple speakers, etc. In this talk I will present SRI’s recent work on reducing the impact of data mismatches, background distortions, etc. on ASA performance by exploiting findings in speech perception research. I will introduce some recently proposed robust acoustic feature generation techniques, follow with an analysis of robust modeling strategies, and present results on some benchmark datasets.


Vikram is a research engineer in SRI International's Speech Technology and Research Laboratory. He received his PhD in Electrical Engineering from the University of Maryland, College Park. His research interests include signal processing for noise/channel/reverberation, speech recognition, production/perception-motivated signal processing, information retrieval, machine learning, and speech analytics. He has worked on several projects funded by DARPA, IARPA, AFRL, NSF, Sandia National Laboratories, and others. He is a senior member of the IEEE and an affiliate member of the SLTC, and he has served on NSF panels and on the scientific committees of SPASR2013 and MLSLP2012.

