A language independent approach to audio search
Vikram Gupta, Jitendra Ajmera, et al.
INTERSPEECH 2011
This paper presents efforts to improve the performance of large vocabulary continuous speech recognition (LVCSR) in a speech-to-speech translation system on smartphones. A variety of techniques are investigated to achieve high accuracy and low latency given constrained resources. These include one-pass streaming-mode decoding for minimum latency; full-covariance acoustic modeling based on bootstrap and model restructuring to improve recognition accuracy with limited training data; and quantized discriminative feature-space transforms and quantized Gaussian mixture models to reduce memory usage with negligible degradation in recognition accuracy. Some speed optimization methods are also discussed to increase recognition speed. The proposed techniques, evaluated on the DARPA Transtac datasets, are shown to give good overall performance under the CPU and memory constraints of smartphones. Copyright © 2011 ISCA.
Christoph Tillmann, Sanjika Hewavitharana
INTERSPEECH 2011
Michelle Brachman, Zahra Ashktorab, et al.
PACM HCI
Gang Wang, Fei Wang, et al.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics