Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study. Shawn Tan, Songlin Yang, et al. ICLR 2025.
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning. Gang Liu, Michael Sun, et al. ICLR 2025.
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts. Junmo Kang, Leonid Karlinsky, et al. ICLR 2025.
Shedding Light on Time Series Classification using Interpretability Gated Networks. Yunshi Wen, Tengfei Ma, et al. ICLR 2025.
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations. Song Wang, Junhong Lin, et al. ICLR 2025.