Compositional Physical Reasoning of Objects and Events From Videos. Zhenfang Chen, Shilong Dong, et al. 2025. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention. Mingyu Ding, Yikang Shen, et al. 2023. CVPR 2023.
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners. Zitian Chen, Yikang Shen, et al. 2023. CVPR 2023.
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following. Mingyu Ding, Yan Xu, et al. 2022. CoRL 2022.