Speech codec optimization based on cell broadband engine
Zhenbo Zhu, Qing Wang, et al.
ICASSP 2007
A novel distributed language model that has no constraints on the n-gram order and no practical constraints on vocabulary size is presented. This model is scalable and allows for an arbitrarily large corpus to be queried for statistical estimates. Our distributed model is capable of producing n-gram counts on demand. By using a novel heuristic estimate for the interpolation weights of a linearly interpolated model, it is possible to dynamically compute the language model probabilities. The distributed architecture follows the client-server paradigm and allows for each client to request an arbitrary weighted mixture of the corpus. This allows easy adaptation of the language model to particular test conditions. Experiments using the distributed LM for re-ranking N-best lists of a speech recognition system resulted in considerable improvements in word error rate (WER), while integration with a machine translation decoder resulted in significant improvements in translation quality as measured by the BLEU score.
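The abstract above describes serving raw n-gram counts from remote servers on demand and combining them client-side into a linearly interpolated language model with heuristically estimated weights. The Python sketch below illustrates that count-on-demand scoring pattern under stated assumptions: `CountServer`, `interpolated_logprob`, and the fixed `weights` tuple are hypothetical names and values chosen for illustration, not the paper's actual protocol or weight heuristic, and the real system distributes counts across many server shards rather than holding them in one process.

```python
import math
from collections import defaultdict


class CountServer:
    """Stand-in for one remote shard that serves raw n-gram counts on demand."""

    def __init__(self, corpus_tokens, max_order=3):
        self.counts = defaultdict(int)
        self.total = len(corpus_tokens)
        # Index all n-grams up to max_order; a real deployment would shard
        # this table across servers instead of building it in memory.
        for n in range(1, max_order + 1):
            for i in range(len(corpus_tokens) - n + 1):
                self.counts[tuple(corpus_tokens[i:i + n])] += 1

    def count(self, ngram):
        return self.counts.get(tuple(ngram), 0)


def interpolated_logprob(server, history, word, weights=(0.1, 0.3, 0.6)):
    """Linearly interpolate relative frequencies of orders 1..3.

    `weights` is a placeholder for the paper's heuristically estimated
    interpolation weights (assumed to sum to 1); orders with too little
    history are simply skipped in this sketch.
    """
    p = 0.0
    for n, lam in zip((1, 2, 3), weights):
        if len(history) < n - 1:
            continue  # not enough context for this order
        ctx = tuple(history[-(n - 1):]) if n > 1 else ()
        denom = server.count(ctx) if ctx else server.total
        if denom > 0:
            # Relative frequency count(ctx + word) / count(ctx), weighted.
            p += lam * server.count(ctx + (word,)) / denom
    return math.log(p) if p > 0.0 else float("-inf")


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the cat ran".split()
    server = CountServer(corpus)
    print(interpolated_logprob(server, ["the"], "cat"))
```

In the client-server design the abstract outlines, each `count` lookup would be a remote request (batched per query n-gram), while the interpolation itself stays on the client, which is what lets probabilities be computed dynamically and per-client corpus mixtures be requested.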
Vadim Sheinin, Da-Ke He
ICASSP 2007
Mohamed Kamal Omar, Lidia Mangu
ICASSP 2007
Hagen Soltau, George Saon, et al.
IEEE Transactions on Audio, Speech and Language Processing