Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models. George Kour, Itay Nakash, et al. ACL 2025.
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In. Itay Nakash, George Kour, et al. NAACL 2025.
Exploring Straightforward Methods for Automatic Conversational Red-Teaming. George Kour, Naama Zwerdling, et al. NAACL 2025.
A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios. Samuel Ackerman, Ella Rabinovich, et al. EMNLP 2024.
Predicting Question-Answering Performance of Large Language Models through Semantic Consistency. Ella Rabinovich, Samuel Ackerman, et al. EMNLP 2023.
Unveiling Safety Vulnerabilities of Large Language Models. George Kour, Marcel Zalmanovici, et al. EMNLP 2023.
Text Augmentation Using Dataset Reconstruction for Low-Resource Classification. Adir Rahamim, Guy Uziel, et al. ACL 2023.
Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora. George Kour, Samuel Ackerman, et al. EMNLP 2022.