Irene Ko, Sihui Dai, et al.
NeurIPS 2024
In this paper, we propose a system that processes and interprets vague, open-ended, multi-line natural language queries and transforms them into coherent, actionable data stories. Its modular architecture comprises five components (Question Generation, Answer Generation, NLG/Chart Generation, Chart2Text, and Story Representation), each of which uses LLMs to turn data into human-readable narratives and visualizations. Unlike existing tools, our system directly addresses the ambiguity of such queries, tackling complexities that no existing system handles comprehensively and setting a new benchmark in data storytelling. The system is cost-effective because it uses open-source models without extra training, and it emphasizes transparency by exposing end-to-end processing and intermediate outputs. This transparency enhances explainability, builds user trust, and clarifies how each data story is generated.
Nafis Neehal, Bowen Wang, et al.
NAACL 2025
Manish Nagireddy, Michael Feffer, et al.
NAACL 2025
Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, et al.
NeurIPS 2023