Dynamic features in the linear-logarithmic hybrid domain for automatic speech recognition in a reverberant environment

Osamu Ichikawa; Takashi Fukuda; Masafumi Nishimura

doi:10.1109/JSTSP.2010.2057191

IEEE JSTSP

Paper

01 Oct 2010

Abstract

Static and dynamic features based on Mel frequency cepstral coefficients (MFCCs) are widely used in automatic speech recognition. Since MFCCs are calculated from logarithmic spectra, the delta and delta-delta features amount to difference operations in the logarithmic domain. In a reverberant environment, speech signals carry late reverberation whose power follows a long-term exponential decay. Because an exponential decay is a straight line in the logarithmic domain, the logarithmic delta tends to hold a nearly constant value for a long time. This paper considers new schemes for calculating delta and delta-delta features that quickly diminish in the reverberant segments. Experiments using the evaluation framework for reverberant environments (CENSREC-4) showed significant improvements by simply replacing the MFCC dynamic features with the proposed dynamic features. © 2010 IEEE.
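
The motivation stated in the abstract can be checked numerically with a minimal sketch. It is not the paper's proposed scheme, which the abstract does not specify. It only contrasts a delta taken on logarithmic power with a delta taken on linear power over an idealized exponentially decaying tail. The `delta` helper uses the common regression formula for dynamic features; the window width, the per-frame decay factor of 0.9, and the 40-frame length are assumed values chosen for illustration.

```python
import math

def delta(seq, width=2):
    """Regression-based delta over +/- `width` frames.

    Edges are handled by repeating the first and last frame.
    """
    n = len(seq)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        num = 0.0
        for k in range(1, width + 1):
            ahead = seq[min(t + k, n - 1)]
            behind = seq[max(t - k, 0)]
            num += k * (ahead - behind)
        out.append(num / denom)
    return out

# Hypothetical reverberant tail: power decays by a fixed factor per frame.
decay_per_frame = 0.9  # assumed value, not taken from the paper
power = [decay_per_frame ** t for t in range(40)]
log_power = [math.log(p) for p in power]

log_delta = delta(log_power)  # delta in the logarithmic domain
lin_delta = delta(power)      # delta in the linear domain, for contrast only

for t in (5, 15, 25, 35):
    print(f"frame {t:2d}: log delta = {log_delta[t]: .4f}, "
          f"linear delta = {lin_delta[t]: .6f}")
```

Away from the edges, the logarithmic delta stays at log(0.9) for the whole tail, while the linear delta shrinks in proportion to the decaying power. That is the behavior the abstract describes as the problem with logarithmic-domain dynamic features in reverberant segments.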