Erich P. Stuntebeck, John S. Davis II, et al.
HotMobile 2008
In this paper, we describe a phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models. The units of translation are blocks – pairs of phrases. During decoding, we use a block unigram model and a word-based trigram language model. During training, the blocks are learned from source interval projections using an underlying high-precision word alignment. The system performance is significantly increased by applying a novel block extension algorithm using an additional high-recall word alignment. The blocks are further filtered using unigram-count selection criteria. The system has been successfully test on a Chinese-English and an Arabic-English translation task.
Erich P. Stuntebeck, John S. Davis II, et al.
HotMobile 2008
Pradip Bose
VTS 1998
Raymond Wu, Jie Lu
ITA Conference 2007
Ehud Altman, Kenneth R. Brown, et al.
PRX Quantum