Yanbo Wang, Zixiang Xu, et al.
NeurIPS 2025
Designing stable crystal structures is central to accelerating the discovery of new materials, yet most generative approaches remain limited to reproducing known patterns rather than exploring novel possibilities. We present a method that trains large language models with reinforcement learning guided by verifiable energy-based rewards, optimizing toward physically grounded stability objectives. Compared to supervised finetuning and base models, our reinforcement learning–trained model generates crystals with higher predicted stability and a greater proportion of previously unreported structures. These results suggest that combining verifiable energy rewards and reinforcement learning provides a powerful path toward automated discovery of novel, stable materials.
Eduardo Almeida Soares, Victor Yukio Shirasuna, et al.
NeurIPS 2025
James Hedrick
ACS Fall 2023
Leonardo Guerreiro Azevedo, Julio Cesar Cardoso Tesolin, et al.
ACS Fall 2023