On the role of noise in factorizers for disentangling distributed representations. Kumudu Geethan Karunaratne, Michael Hersche, et al. 2024. NeurIPS 2024.
Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications. Bo Wen, Xin Zhang. 2024. NeurIPS 2024.
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs. Megh Thakkar, Yash More, et al. 2024. NeurIPS 2024.
Value Alignment from Unstructured Text. Inkit Padhi, Karthikeyan Natesan Ramamurthy, et al. 2024. NeurIPS 2024.
SocialStigmaQA Spanish and Japanese - Towards Multicultural Adaptation of Social Bias Benchmarks. Clara Higuera Cabañes, Ryo Iwaki, et al. 2024. NeurIPS 2024.
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. Ivoline Ngong, Swanand Ravindra Kadhe, et al. 2024. NeurIPS 2024.
Consistency-based Black-box Uncertainty Quantification for Text-to-SQL. Debarun Bhattacharjya, Balaji Ganesan, et al. 2024. NeurIPS 2024.
Data Contamination Report from the 2024 CONDA Shared Task. Oscar Sainz, Iker García-Ferrero, et al. 2024. ACL 2024.
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs. Swanand Ravindra Kadhe, Farhan Ahmed, et al. 2024. ICML 2024.