When and Why does a Model Fail? A Human-in-the-loop Error Detection Framework for Sentiment Analysis. Zhe Liu, Yufan Guo, et al. 2021. NAACL 2021.
Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations. Haode Qi, Lin Pan, et al. 2021. NAACL 2021.
Does Structure Matter? Encoding Documents for Machine Reading Comprehension. Hui Wan, Song Feng, et al. 2021. NAACL 2021.
Project Debater - from grand challenge to business applications, behind the scenes and lessons learned. Aya Soffer. 2021. NAACL 2021.
Data Cleaning Tools for Token Classification Tasks. Karthik Muthuraman, Frederick Reiss, et al. 2021. NAACL 2021.
Argument Mining for Scholarly Document Processing: Taking Stock and Looking Ahead. Khalid Al Khatib, Tirthankar Ghosal, et al. 2021. NAACL 2021.
IBMResearch at MEDIQA 2021: Toward Improving Factual Correctness of Radiology Report Abstractive Summarization. Diwakar Mahajan, Ching-Huei Tsou, et al. 2021. NAACL 2021.
emrKBQA: A Clinical Knowledge-Base Question Answering Dataset. Preethi Raghavan, Diwakar Mahajan, et al. 2021. NAACL 2021.
A Universal Dependencies Corpora Maintenance Methodology Using Downstream Application. Ran Iwamoto, Hiroshi Kanayama, et al. 2021. NAACL 2021.