Navigating the Modern Evaluation Landscape: Considerations in Benchmarks and Frameworks for Large Language Models (LLMs). Leshem Choshen, Ariel Gera, et al. 2024. LREC-COLING 2024.
Facilitating Human-LLM Collaboration through Factuality Scores and Source Attributions. Hyo Jin Do, Rachel Ostrand, et al. 2024. CHI 2024.
Granite code models: A family of open foundation models for code intelligence. Mayank Mishra, Matthew Stallone, et al. 2024. arXiv.
Ring-A-Bell! How Reliable are Concept Removal Methods For Diffusion Models? Yu-Lin Tsai, Chia-yi Hsu, et al. 2024. ICLR 2024.
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models. Ming Jin, Shiyu Wang, et al. 2024. ICLR 2024.
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations. Song Wang, Lin Junhong, et al. 2024. ICLR 2024.
The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Language Models. Yan Liu, Yu Liu, et al. 2024. ICLR 2024.
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition. Chen Chen, Ruizhe Li, et al. 2024. ICLR 2024.
Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain. Marcus Min, Robin (Yangruibo) Ding, et al. 2024. ICLR 2024.