Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes. Xiaomeng Xu, Pin-Yu Chen, et al. NeurIPS 2024.
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models. Shengyun Peng, Pin-Yu Chen, et al. NeurIPS 2024.
MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees. Ryan Zhang, Herbert Woisetschläger, et al. NeurIPS 2024.
Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI. Ambrish Rawat, Stefan Schoepf, et al. NeurIPS 2024.
One Tree to Rule Them All: Optimizing GGM Trees and OWFs for Post-Quantum Signatures. Carsten Baum, Ward Beullens, et al. AsiaCrypt 2024.
SQIsign2D-West: The Fast, the Small, and the Safer. Andrea Basso, Pierrick Dartois, et al. AsiaCrypt 2024.
Compute, but Verify: Efficient Multiparty Computation over Authenticated Inputs. Moumita Dutta, Chaya Ganesh, et al. AsiaCrypt 2024.
A Policy Framework for Securing Cloud APIs by Combining Application Context with Generative AI. Shriti Priya, Julian James Stephen. ACSAC 2024.