Michael Picheny, Zoltan Tuske, et al.
INTERSPEECH 2019
This paper investigates a factor analysis scheme in the joint channel space of stereo-based stochastic mapping (SSM) for noise-robust automatic speech recognition. A mixture of Bayesian factor analyzers is used to describe the generative factors, in terms of noise type and signal-to-noise ratio, in the multi-conditional training scenario. A sparsity-promoting prior is applied to the matrix of factor loadings so that the effective factors are learned automatically from a redundant dictionary within a given soft cluster. Experiments on large vocabulary continuous speech recognition tasks show that this sparse Bayesian factor analysis scheme leads to superior SSM performance for noise robustness.
George Saon, Tom Sercu, et al.
INTERSPEECH 2016
George Saon
SLT 2014
Thomas Bohnstingl, Ayush Garg, et al.
ICASSP 2022