EvalAssist: Insights on Task-Specific Evaluations and AI-assisted Judgement Strategy Preferences. Zahra Ashktorab, Michael Desmond, et al. 2025. UIST 2025.
StructText: A Synthetic Table-to-Text Approach for Benchmark Generation with Multi-Dimensional Evaluation. Satyananda Kashyap, Sola Shirai, et al. 2025. VLDB 2025.
SKIP-SALSA: Skip Synchronous Fusion of ASR LLM Decoders. Ashish Mittal, Darshan Prabhu, et al. 2025. INTERSPEECH 2025.
Spectra to Molecule: A Multimodal Multitasking Transformer Model for Automated Structure Elucidation. Marvin Alberts, Teodoro Laino. 2025. ACS Fall 2025.
Transformer Model for Structure Elucidation from Tandem Mass Spectroscopy data. Laura Mismetti, Marvin Alberts, et al. 2025. ACS Fall 2025.
Evaluating LLM-based Agents: Foundations, Best Practices and Open Challenges. Roy Bar-Haim, Arman Cohan, et al. 2025. IJCAI 2025.
Multi-Sense Embeddings for Language Models and Knowledge Distillation. Qitong Wang, Mohammed Zaki, et al. 2025. ACL 2025.
Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies. Massimiliano Pronesti, Joao Bettencourt-Silva, et al. 2025. ACL 2025.
MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems. Yannis Katsis, Sara Rosenthal, et al. 2025. ACL 2025.