Contextual revision in information seeking conversation systems
Keith Houck
ICSLP 2004
We present a rapid compensation technique aimed at reducing the detrimental effects of environmental noise and the channel on server-based mobile speech recognition. It addresses two key problems for such systems: first, how to accurately separate non-speech events (background noise) from noise introduced by network artifacts; second, how to reduce the latency created by the extra computation that a codebook-based linear channel compensation technique requires. We address the first problem by modifying an existing energy-based endpoint-detection algorithm to provide segment-type information to the compensation module. We tackle the latency issue by employing a tree-structured vector quantization technique with dynamic thresholds, which avoids computing all codewords. The technique is evaluated on a speech-in-car database at three different speeds. Our results show that the method leads to an 8.7% reduction in error rate and a 35% reduction in computational cost.
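To illustrate why a tree-structured codebook search can avoid evaluating every codeword, the following is a minimal sketch and not the authors' implementation. It assumes a binary tree built by a simple sort-and-split rule, and it omits the dynamic thresholds mentioned in the abstract, since their details are not given here. All names (`Node`, `build_tree`, `tree_search`, `full_search`) are hypothetical. Note that a greedy descent of this kind is not guaranteed to return the true nearest codeword.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class Node:
    centroid: Vector
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build_tree(codewords: List[Vector]) -> Node:
    """Build a binary tree by sorting and halving (illustrative rule only)."""

    def mean(vs: List[Vector]) -> Vector:
        return tuple(sum(v[i] for v in vs) / len(vs) for i in range(len(vs[0])))

    def rec(vs: List[Vector]) -> Node:
        if len(vs) == 1:
            return Node(vs[0])  # a leaf holds an actual codeword
        mid = len(vs) // 2
        return Node(mean(vs), rec(vs[:mid]), rec(vs[mid:]))

    return rec(sorted(codewords))


def tree_search(root: Node, x: Vector) -> Tuple[Vector, int]:
    """Greedy descent: compare x with both children, follow the closer one."""
    node, evals = root, 0
    while not node.is_leaf:
        d_left = sq_dist(x, node.left.centroid)
        d_right = sq_dist(x, node.right.centroid)
        evals += 2
        node = node.left if d_left <= d_right else node.right
    return node.centroid, evals


def full_search(codewords: List[Vector], x: Vector) -> Tuple[Vector, int]:
    """Exhaustive search: one distance evaluation per codeword."""
    return min(codewords, key=lambda c: sq_dist(x, c)), len(codewords)


if __name__ == "__main__":
    codebook = [(float(i), float(i)) for i in range(8)]
    query = (5.2, 4.9)
    print(tree_search(build_tree(codebook), query))
    print(full_search(codebook, query))
```

For this toy codebook of 8 codewords, the tree descent needs 6 distance evaluations against 8 for the exhaustive search; the gap grows with codebook size, because the descent cost scales with tree depth rather than with the number of codewords.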