Semantic structured language models
Hakan Erdogan, Ruhi Sarikaya, et al.
ICSLP 2002
Unit-selection-based concatenative speech synthesis has proven to be a successful method of producing high-quality speech output. However, producing high-quality speech requires large speech databases. For some applications, this is not practical because of the complexity of the database search process and the storage requirements of such databases. In this paper, we propose a data-driven algorithm to reduce the size of the database used in concatenative synthesis. The algorithm preselects database speech segments based on statistics collected by synthesizing a large number of sentences with the full speech database. The algorithm is applied to the IBM trainable speech synthesis system, and the results show that the database size can be reduced substantially while maintaining output speech quality.
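The preselection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are collected, so this example assumes the simplest choice: count how often each segment is picked when a set of sentences is synthesized with the full database, then keep the most frequently used segments. The names `preselect_segments`, `synthesize`, and `keep_fraction` are hypothetical and are not taken from the paper or from the IBM system.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence, Set


def preselect_segments(
    sentences: Iterable[str],
    synthesize: Callable[[str], Sequence[str]],
    keep_fraction: float,
    all_segment_ids: Sequence[str],
) -> Set[str]:
    """Return the segment IDs to keep in the reduced database.

    Assumption: `synthesize` runs unit selection against the FULL database
    and returns the IDs of the segments it chose for one sentence.
    """
    # Collect usage statistics over the whole sentence set.
    usage: Counter = Counter()
    for sentence in sentences:
        usage.update(synthesize(sentence))

    # Keep the requested fraction of the database, but at least one segment.
    n_keep = max(1, int(len(all_segment_ids) * keep_fraction))

    # Rank by descending usage count; break ties by ID for determinism.
    ranked = sorted(all_segment_ids, key=lambda seg: (-usage[seg], seg))
    return set(ranked[:n_keep])


# Toy stand-in for a synthesizer: each "sentence" maps to fixed segment IDs.
toy_selection = {
    "a": ["s1", "s2"],
    "b": ["s1", "s3"],
    "c": ["s1", "s2"],
}
kept = preselect_segments(
    sentences=["a", "b", "c"],
    synthesize=lambda s: toy_selection[s],
    keep_fraction=0.5,
    all_segment_ids=["s1", "s2", "s3", "s4"],
)
```

In this toy run, `s4` is never selected and `s3` only once, so keeping half of the database retains `s1` and `s2`. A real system would probably apply the cutoff separately for each unit type or context so that rare sounds stay covered; that refinement is an assumption here and is not described in the abstract.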