Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation. Tomas Bueno Momcilovic, Beat Buesser, et al. xAI 2024.
Exploring Vulnerabilities in LLMs: A Red Teaming Approach to Evaluate Social Bias. Yuya Jeremy Ong, Jay Pankaj Gala, et al. IEEE CISOSE 2024.
Navigating the Modern Evaluation Landscape: Considerations in Benchmarks and Frameworks for Large Language Models (LLMs). Leshem Choshen, Ariel Gera, et al. LREC-COLING 2024.
Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models? Yu-Lin Tsai, Chia-yi Hsu, et al. ICLR 2024.
Can LLMs Fix Issues with Reasoning Models? Towards More Likely Models for AI Planning. Turgay Caglar, Sirine Belhaj, et al. AAAI 2024.
Human Evaluation of the Usefulness of Fine-Tuned English Translators for the Guarani Mbya and Nheengatu Indigenous Languages. Claudio Santos Pinhanez, Paulo Rodrigo Cavalin, et al. PROPOR 2024.
How Hard Are Computer Vision Datasets? Calibrating Dataset Difficulty to Viewing Time (workshop version). David Mayo, Jesse Cummings, et al. NeurIPS 2023.
Unveiling Safety Vulnerabilities of Large Language Models. George Kour, Marcel Zalmanovici, et al. EMNLP 2023.
URET: Universal Robustness Evaluation Toolkit (for Evasion). Kevin Eykholt, Taesung Lee, et al. USENIX Security 2023.