Heloisa Caroline de Souza Pereira Candello, Claudio Santos Pinhanez, et al.
CSCW 2022
This paper examines the application of linear transformations for speaker and environmental adaptation in an HMM-based speech recognition system. In particular, transformations that are trained in a maximum likelihood sense on adaptation data are investigated. Only model-based linear transforms are considered, since, for linear transforms, they subsume the appropriate feature-space transforms. The paper compares the two possible forms of model-based transforms: (i) unconstrained, where any combination of mean and variance transform may be used, and (ii) constrained, which requires the variance transform to have the same form as the mean transform. Re-estimation formulae for all appropriate cases of transform are given. This includes a new and efficient full variance transform and the extension of the constrained model-space transform from the simple diagonal case to the full or block-diagonal case. The constrained and unconstrained transforms are evaluated in terms of computational cost, recognition time efficiency, and use for speaker adaptive training. The recognition performance of the two model-space transforms on a large vocabulary speech recognition task using incremental adaptation is investigated. In addition, initial experiments using the constrained model-space transform for speaker adaptive training are detailed. © 1998 Academic Press Limited.
Heloisa Caroline de Souza Pereira Candello, Claudio Santos Pinhanez, et al.
CSCW 2022
Rajesh Balchandran, Leonid Rachevsky, et al.
INTERSPEECH 2009
Claudio Santos Pinhanez, Edem Wornyo
CHI 2025
Luís Henrique Neves Villaça, Sean Wolfgand Matsui Siqueira, et al.
SBSI 2023