An Analysis of Hyper-Parameter Optimization Methods for Retrieval Augmented Generation. Matan Orbach, Ohad Eytan, et al. 2026. AAAI 2026. Workshop paper.
Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation. Yotam Perlitz, Ariel Gera, et al. 2025. NeurIPS 2025. Workshop paper.
Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation. Noy Sternlicht, Ariel Gera, et al. 2025. EMNLP 2025. Conference paper.
Label-Efficient Model Selection for Text Generation. Shir Ashury Tahan, Ariel Gera, et al. 2024. ACL 2024. Conference paper.
Navigating the Modern Evaluation Landscape: Considerations in Benchmarks and Frameworks for Large Language Models (LLMs). Leshem Choshen, Ariel Gera, et al. 2024. LREC-COLING 2024. Tutorial.
Active Learning for Natural Language Generation. Yotam Perlitz, Ariel Gera, et al. 2023. EMNLP 2023. Conference paper.
Label Sleuth: From Unlabeled Text to a Classifier in a Few Hours. Eyal Shnarch, Alon Halfon, et al. 2022. EMNLP 2022. Demo paper.