Exploring features from natural language generation for prosody modeling

Shimei Pan; Kathleen McKeown; Julia Hirschberg

doi:10.1016/S0885-2308(02)00022-0

Computer Speech and Language

Paper

01 Jan 2002

Abstract

Prosody modeling is critical in developing a Concept-to-Speech (CTS) system, where both Natural Language Generation (NLG) and Speech Synthesis are used to automatically generate natural, coherent speech. In this paper, we empirically verify the usefulness of various natural language features in prosody modeling. Three groups of features are investigated: semantic, syntactic, and surface features produced by SURGE, a general-purpose surface natural language generator for English; deep semantic and discourse features that are available during the domain modeling and content planning phases of generation; and information-based measures statistically derived from text. Our experiments identify which features in this large set are effective in prosody modeling. This work represents an important step towards building a comprehensive prosody model for CTS systems that employ general NLG. This investigation is conducted in the context of MAGIC, a medical application that involves automatic speech and graphics generation. © 2002 Elsevier Science Ltd. All rights reserved.

Conference paper

A distributed architecture for fast SGD sequence discriminative training of DNN acoustic models

George Saon

SLT 2014

Workshop

New Frontiers of Human-centered Explainable AI (HCXAI): Participatory Civic AI, Benchmarking LLMs and Hallucinations for XAI, and Responsible AI Audits

Upol Ehsan, Elizabeth Watkins, et al.

CHI 2025

Workshop

Tools for Thought: Understanding, Protecting, and Augmenting Human Cognition with Generative AI - From Vision to Implementation

Zelun Tony Zhang, Nick Von Felten, et al.

CHI 2026

Conference paper

L₁ vs. L₂ regularization in text classification when learning from labeled features

Sînziana Mazilu, José Iria

ICMLA 2011
