Teaching VLMs to Localize Specific Objects from In-context Examples. Sivan Doveh, Nimrod Shabtay, et al. 2025. ICCV 2025.
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content. Nimrod Shabtay, Felipe Maia Polo, et al. 2025. ICLR 2025.