Forensically inspired approaches to automatic speaker recognition

Kyu Han; Mohamed Kamal Omar; Jason Pelecanos; Cezar Pendus; Sibel Yaman; Weizhong Zhu

doi:10.1109/ICASSP.2011.5947519

ICASSP 2011

Conference paper

18 Aug 2011

Abstract

This paper presents ongoing research leveraging forensic methods for automatic speaker recognition. Some of the methods forensic scientists employ include identifying speaker-distinctive audio segments and comparing these segments using features such as pitch, formants, and other information. Other approaches involve performing a phonetic analysis to recognize idiolectal attributes, and an implicit analysis of the demographics of speakers. Inspired by these forensic phonetic approaches, we target three threads of work: hot-spot analysis, speaker style and pronunciation modelling, and demographics analysis. As a result of this work, we show that a phonetic analysis conditioned on select speech events (or hot-spots) can outperform a phonetic analysis performed over all speech without conditioning. In the area of pronunciation modelling, one set of results demonstrates significantly improved robustness by exploiting phonetic structure in an automatic speech recognition system. For demographics analysis, we present state-of-the-art results for systems capable of detecting dialect, non-nativeness, and native language. © 2011 IEEE.
