FlexAttention for Efficient High-Resolution Vision-Language Models. Junyan Li, Delin Chen, et al. (2024). ECCV 2024. Conference paper.
Learning Active Camera for Multi-Object Navigation. Peihao Chen, Dongyu Ji, et al. (2022). NeurIPS 2022. Conference paper.
Foley Music: Learning to Generate Music from Videos. Chuang Gan, Deng Huang, et al. (2020). ECCV 2020. Conference paper.