Michael Picheny, Zoltan Tuske, et al.
INTERSPEECH 2019
Building multiple automatic speech recognition (ASR) systems and combining their outputs with voting techniques such as ROVER is an effective way to lower the overall word error rate. A successful combination requires systems with complementary errors; otherwise the combination will not outperform the individual systems. In practice, such complementarity is usually obtained empirically, for example by building systems on different input features. In this paper, we present a systematic approach for building multiple ASR systems in which the decision tree state-tying procedure used to specify context-dependent acoustic models is randomized. Experiments on two large-vocabulary recognition tasks, MALACH and DARPA EARS, illustrate the effectiveness of the approach. © 2005 IEEE.
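The abstract does not say how the state-tying procedure is randomized. One plausible reading, sketched below as an assumption rather than as the paper's method, is to replace the usual greedy choice of the best phonetic question at each tree node with a random draw among the few best-scoring questions. The function name `pick_split`, the parameter `top_n`, the question names, and the gain values are all hypothetical and chosen only for illustration.

```python
import random


def pick_split(gains, top_n, rng):
    """Choose a question for one decision tree node.

    gains: dict mapping a candidate question name to its likelihood gain
           (hypothetical values, not taken from the paper).
    top_n: 1 reproduces the deterministic greedy choice; larger values
           sample uniformly among the top_n best questions (assumed scheme).
    rng:   a random.Random instance, so each system can use its own seed.
    """
    if not gains:
        raise ValueError("no candidate questions")
    # Rank questions from highest to lowest gain.
    ranked = sorted(gains, key=gains.get, reverse=True)
    return rng.choice(ranked[:max(1, top_n)])


# Illustrative gains for four made-up phonetic context questions.
gains = {
    "left_is_vowel": 12.5,
    "right_is_nasal": 11.9,
    "left_is_stop": 7.3,
    "right_is_silence": 2.1,
}

# Greedy baseline: always the single best question.
greedy = pick_split(gains, 1, random.Random(0))

# One draw per ensemble member, each with its own seed.
ensemble = [pick_split(gains, 3, random.Random(seed)) for seed in range(5)]
```

Under this assumption, each seed yields a different tree and therefore a different set of tied states, which is one way to obtain systems whose errors differ enough for a voting scheme such as ROVER to exploit.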
Po-Sen Huang, Haim Avron, et al.
ICASSP 2014
Bhuvana Ramabhadran, Jing Huang, et al.
INTERSPEECH - Eurospeech 2003
Asaf Rendel, Raul Fernandez, et al.
ICASSP 2016