NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls. Kinjal Basu, Ibrahim Abdelaziz, et al. 2025. EMNLP 2025.
Declarative Techniques for NL Queries over Heterogeneous Data. Elham Khabiri, Jeff Kephart, et al. 2025. EMNLP 2025.
Exploring the Limits of Conformer CTC-Encoder for Speech Emotion Recognition using Large Language Models. Edmilson Da Silva Morais, Hagai Aronowitz, et al. 2025. INTERSPEECH 2025.
The Impact of Domain Adaptation on the Activation Space of LLMs. Assala Benmalek, Celia Cintas, et al. 2025. DLI 2025.
Combining Domain and Alignment Vectors Provides Better Knowledge-Safety Trade-offs in LLMs. Megh Thakkar, Quentin Fournier, et al. 2025. ACL 2025.
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory. Tenghao Huang, Kinjal Basu, et al. 2025. ACL 2025.
MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems. Yannis Katsis, Sara Rosenthal, et al. 2025. ACL 2025.
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. Ivoline Ngong, Swanand Ravindra Kadhe, et al. 2025. ACL 2025.
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community. Shachar Don-Yehiya, Leshem Choshen, et al. 2025. ACL 2025.
BioVERSE: A Modular Framework for Integrating Biomedical Modalities with Language Models in Precision Medicine. Ching-Huei Tsou, Michal Ozery-Flato, et al. 2025. ISMB 2025.