E. Eide, B. Maison, et al.
ICSLP 2000
Fixed-rate feature extraction, which is used in most current speech recognizers, is equivalent to sampling the feature trajectories at a uniform rate. This sampling rate is often well below the Nyquist rate, so aliasing distorts the sampled feature stream. In this paper we explore various techniques to counter aliasing and the variable-rate nature of information in speech signals, ranging from simple cepstral and spectral smoothing to filtering and data-driven dimensionality expansion using Linear Discriminant Analysis (LDA). Smoothing in the spectral domain reduces the variance of the short-term spectral estimates, which translates directly into a reduction in the variances of the Gaussians in the acoustic models. With these techniques we obtain modest improvements, both in word error rate and in robustness to noise, on large-vocabulary speech recognition tasks.
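The aliasing argument can be illustrated with a small sketch. It is not the authors' method. It assumes a hypothetical feature trajectory containing a 70 Hz component, a hypothetical dense analysis rate of 1000 frames per second, and a fixed output rate of 100 frames per second (Nyquist limit 50 Hz). A plain moving average stands in for the smoothing and filtering techniques the abstract mentions. All names and rates below are illustrative assumptions.

```python
import math

# Hypothetical rates, chosen only for illustration (not taken from the paper).
DENSE_RATE = 1000   # frames per second of the finely sampled trajectory
FRAME_RATE = 100    # fixed output frame rate; its Nyquist limit is 50 Hz
STEP = DENSE_RATE // FRAME_RATE


def trajectory(t, freq_hz=70.0):
    """Toy feature trajectory: one sinusoid above the 50 Hz Nyquist limit."""
    return math.sin(2.0 * math.pi * freq_hz * t)


def moving_average(x, width):
    """Centered moving average; a simple stand-in for a smoothing filter."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def rms(x):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# One second of the densely sampled trajectory.
dense = [trajectory(n / DENSE_RATE) for n in range(DENSE_RATE)]

# Fixed-rate sampling with no smoothing: the 70 Hz component is kept at full
# amplitude but is indistinguishable from a 30 Hz component (with flipped sign).
naive = dense[::STEP]

# Smoothing before sampling attenuates the component that would alias.
smoothed = moving_average(dense, width=STEP)[::STEP]

print("RMS without smoothing:", round(rms(naive), 3))
print("RMS with smoothing:   ", round(rms(smoothed), 3))
```

In this toy setting the unsmoothed samples coincide with those of a 30 Hz sinusoid of opposite sign, which is the aliasing distortion described above, while smoothing first leaves far less energy at the aliased frequency. A moving average is a crude low-pass filter, so a real front end would likely use a better-designed one.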
R.A. Gopinath
ICASSP 1996
Ellen M. Eide, Lalit R. Bahl
ICSLP 1998
L.R. Bahl, S. De Gennaro, et al.
ICSLP 1998