J. Tersoff
Applied Surface Science
We present a novel approach to chemical foundation models, leveraging structured state space sequence models (SSMs) to overcome the limitations of traditional Transformer-based architectures. While Transformers have achieved state-of-the-art results in chemical tasks such as property prediction and molecule generation, their self-attention mechanism is constrained by its inability to model data outside of a finite context window and by its quadratic scaling with respect to window length. In contrast, SSMs offer a promising alternative for sequence modeling, enabling the capture of complex patterns and dependencies in molecular structures. Our Mamba architecture, a simplified end-to-end SSM-based neural network, eliminates the need for attention and MLP blocks, allowing for faster inference. We pre-train Mamba on a large, curated dataset of 91 million SMILES samples (equivalent to 4 billion molecular tokens) sourced from PubChem, and evaluate its performance on various benchmark datasets. Our experiments demonstrate the SSM's capacity to provide state-of-the-art results while maintaining fast inference, supporting complex tasks such as molecular property prediction, classification, molecular reconstruction, and synthesis yield prediction. This work advances the state of the art in AI methodology for the chemical sciences, offering a promising direction for future research in molecular modeling and discovery.
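The contrast with self-attention drawn above can be illustrated with a minimal sketch of a discrete linear state space recurrence, h_t = A h_{t-1} + B x_t and y_t = C h_t. This is not the Mamba architecture described in the abstract (it has no input-dependent selection and no learned parameters); the function name `ssm_scan` and the toy matrices are assumptions chosen only for illustration. It shows why the cost grows linearly with sequence length: each token updates a fixed-size state once.

```python
def ssm_scan(A, B, C, xs):
    """Run a toy discrete linear state space model over a scalar sequence.

    Recurrence: h_t = A h_{t-1} + B x_t,  y_t = C h_t
    A is an n x n list of lists; B and C are length-n lists.
    Hypothetical helper for illustration, not taken from the paper.
    """
    n = len(B)
    h = [0.0] * n  # fixed-size hidden state, independent of sequence length
    ys = []
    for x in xs:  # one state update per token: linear in len(xs)
        h = [sum(A[i][j] * h[j] for j in range(n)) + B[i] * x for i in range(n)]
        ys.append(sum(C[i] * h[i] for i in range(n)))
    return ys


# Toy parameters (assumed): a diagonal A gives two decaying memory channels.
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]
print(ssm_scan(A, B, C, [1.0, 0.0, 0.0]))  # [2.0, 0.75, 0.3125]
```

The single impulse at the first position keeps influencing later outputs through the state, with no context window bounding how far back the dependence reaches.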
Arvind Kumar, Jeffrey J. Welser, et al.
MRS Spring 2000
B.K. Furman, H.M. Clearfield, et al.
Journal of Vacuum Science and Technology A: Vacuum, Surfaces and Films
Jerng-Sik Song, Chin-An Chang
Journal of Vacuum Science and Technology A: Vacuum, Surfaces and Films