Bowen Zhou, Bing Xiang, et al.
SSST 2008
In generic automatic speech recognition (ASR) systems, language models (LMs) are typically trained to work across a broad range of input conditions. ASR systems used in domain-specific spoken dialogue systems (SDSs) are more constrained in content and style, and a mismatch in content and/or style between training and operating conditions degrades performance for the dialogue application. The main focus of this paper is to develop tools that facilitate rapid development of spoken dialogue applications in the context of language model training, by automatically collecting text data useful for training accurate language models for a new target domain without manually collecting any in-domain data. We investigate a framework that extracts useful information from previous domains and the World Wide Web (WWW). We collect data by submitting queries to a search engine and then clean the resulting text via syntactic and semantic filtering, followed by artificial sentence generation. Without using any in-domain data, our system achieved a word error rate (WER) of 19.33%, comparable to a language model trained on 32K manually collected in-domain sentences. Using less than 1% of the in-domain data along with the automatically generated text, our system achieved ASR performance close to that of a language model trained on 60K in-domain sentences.
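The semantic-filtering step of the pipeline described above can be sketched as follows; this is a minimal illustration, not the authors' implementation. The vocabulary, threshold, and function name are all assumptions introduced for the example: it keeps a web-harvested candidate sentence only when enough of its words overlap a small in-domain keyword set.

```python
# Hypothetical sketch of semantic filtering for web-collected LM data:
# keep candidate sentences whose word overlap with a small in-domain
# vocabulary exceeds a threshold. All names and values are illustrative.

def filter_sentences(candidates, domain_vocab, min_overlap=0.2):
    """Keep sentences sharing enough vocabulary with the target domain."""
    kept = []
    for sent in candidates:
        words = sent.lower().split()
        if not words:
            continue
        # Fraction of tokens that appear in the in-domain vocabulary.
        overlap = sum(w in domain_vocab for w in words) / len(words)
        if overlap >= min_overlap:
            kept.append(sent)
    return kept

# Example with an assumed travel-domain keyword set.
domain_vocab = {"flight", "book", "ticket", "departs", "arrive", "fare"}
candidates = [
    "I want to book a flight that departs tomorrow",
    "The stock market closed higher today",
]
print(filter_sentences(candidates, domain_vocab))
# keeps only the first sentence
```

In practice such a filter would sit between the search-engine harvesting step and the artificial sentence generation step, with the threshold tuned on held-out data.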
Miroslav Novak
INTERSPEECH - Eurospeech 2005
Ruhi Sarikaya, Yuqing Gao, et al.
ICASSP 2004
Fu-Hua Liu, Yuqing Gao
ISCSLP 2004