Vittorio Castelli, Lawrence Bergman
IUI 2007
Universal background models (UBM) in speaker recognition systems are typically Gaussian mixture models (GMM) trained from a large amount of data using the maximum likelihood criterion. This paper investigates three alternative criteria for training the UBM. In the first, we cluster an existing automatic speech recognition (ASR) acoustic model to generate the UBM. In each of the other two, we use statistics based on the speaker labels of the development data to regularize the maximum likelihood objective function in training the UBM. We present an iterative algorithm, similar to the expectation maximization (EM) algorithm, to train the UBM under each of these regularized maximum likelihood criteria. We present several experiments showing that combining only two of these systems outperforms the best published results on the English telephone tasks of the NIST 2008 speaker recognition evaluation.
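As context for the baseline the abstract describes, the sketch below shows standard maximum-likelihood EM training of a diagonal-covariance GMM UBM. This is not the paper's regularized algorithm: the paper adds speaker-label statistics to the objective, and that regularization term is omitted here. All function and variable names are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's method): plain maximum-likelihood
# EM training of a diagonal-covariance GMM, the baseline UBM criterion that the
# paper's regularized objectives modify.
import numpy as np

def train_gmm_ubm(X, n_components=4, n_iters=20, seed=0):
    """EM for a diagonal-covariance GMM; X has shape (n_frames, dim)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialize means from random frames, shared variances, uniform weights.
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0), (n_components, 1)) + 1e-6
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iters):
        # E-step: per-frame, per-component responsibilities in the log domain.
        log_prob = -0.5 * (
            np.sum(np.log(2 * np.pi * variances), axis=1)
            + np.sum((X[:, None, :] - means) ** 2 / variances, axis=2)
        ) + np.log(weights)
        log_norm = np.logaddexp.reduce(log_prob, axis=1, keepdims=True)
        resp = np.exp(log_prob - log_norm)
        # M-step: zeroth-, first-, and second-order statistics give the
        # updated weights, means, and variances.
        Nk = resp.sum(axis=0) + 1e-10
        means = resp.T @ X / Nk[:, None]
        variances = resp.T @ (X ** 2) / Nk[:, None] - means ** 2 + 1e-6
        weights = Nk / n
    return weights, means, variances
```

The paper's regularized criteria would add a speaker-label-dependent penalty to the log-likelihood before the M-step; the EM-like structure above stays the same.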