Tied-mixture language modeling in continuous space

Ruhi Sarikaya; Mohamed Afify; Brian Kingsbury

doi:10.3115/1620754.1620821

NAACL-HLT 2009

Conference paper

31 May 2009

Abstract

This paper presents a new perspective on the language modeling problem by moving word representations and modeling into continuous space. In previous work, we introduced the Gaussian-Mixture Language Model (GMLM) and presented some initial experiments. Here, we propose the Tied-Mixture Language Model (TMLM), which does not have the model parameter estimation problems that GMLM has. TMLM provides a great deal of parameter tying across words and hence achieves robust parameter estimation. As such, TMLM can estimate the probability of any word that has as few as two occurrences in the training data. Speech recognition experiments with the TMLM show improvement over the word trigram model. © 2009 Association for Computational Linguistics.
