Robert Farrell, Rajarshi Das, et al.
AAAI-SS 2010
Deep learning models have shown great potential for predicting molecular properties. Such models must learn latent representations that capture the intrinsic geometry of molecules and preserve their symmetries. To address this, we propose a strategy for pre-training such models on 2D molecular graphs that exploits a topological invariant based on simplicial homology. The invariant is computed as a node-level feature that captures both local and global graph structure: given a graph G and a node v, let G - v denote the largest subgraph of G not containing v (i.e., G with v and its incident edges deleted), and compute its Betti numbers beta0(G - v) and beta1(G - v). In effect, we remove a node and count the connected components and independent cycles that remain. We first pre-train a graph-aware transformer with this objective to learn the underlying structural features of molecules, then fine-tune the model on the target molecular property prediction task. Evaluated on several benchmark datasets, our pre-training strategy consistently improves model performance compared to other pre-training methods. These results demonstrate the effectiveness of incorporating topological information when pre-training molecular property prediction models and highlight the potential of our approach in advancing the field.
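The node-level target described above is cheap to compute for a plain graph: beta0 is the number of connected components, and beta1 = |E| - |V| + beta0 (the number of independent cycles of a graph viewed as a one-dimensional simplicial complex). A minimal sketch, using an adjacency-dict representation chosen here for illustration (the paper's actual implementation is not specified):

```python
# Sketch: per-node topological pre-training targets (beta0, beta1) of G - v.
# Graph representation: dict mapping node -> set of neighbors (undirected).

def betti_numbers(adj):
    """Return (beta0, beta1) of the graph given by adjacency dict `adj`."""
    nodes = set(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    # beta0: count connected components with a depth-first search.
    seen, beta0 = set(), 0
    for start in nodes:
        if start in seen:
            continue
        beta0 += 1
        stack = [start]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            stack.extend(adj[u] - seen)
    # beta1 = |E| - |V| + beta0 for a graph (1-dim simplicial complex).
    beta1 = num_edges - len(nodes) + beta0
    return beta0, beta1

def remove_node(adj, v):
    """Adjacency of G - v: delete v and all edges incident to it."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}

def node_topology_features(adj):
    """Map each node v to the pre-training target (beta0(G-v), beta1(G-v))."""
    return {v: betti_numbers(remove_node(adj, v)) for v in adj}

# Example: a 4-cycle 0-1-2-3-0. The full graph has beta0 = 1, beta1 = 1;
# deleting any node leaves a 3-node path, so every target is (1, 0).
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

For a 4-cycle, `betti_numbers(cycle)` gives (1, 1) and every node's target is (1, 0), matching the intuition that removing any node of a single ring breaks its one cycle without disconnecting the rest.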