Conference paper

Controllable Molecular Generation via Sparse Autoencoders and Feature-Guided Latent Manipulation

Abstract

Designing novel molecules with precise control over physicochemical or structural features remains a central challenge in molecular discovery, particularly in areas such as drug design, materials science, and chemical biology. Existing generative models often lack interpretability and fine-grained controllability, limiting their practical utility in guided molecule design.

To address this limitation, we propose a novel framework for controllable molecular generation that leverages sparse representation learning in the latent space. Our method begins with molecular embeddings obtained from an existing pretrained molecular foundation model, which encode structural information. These embeddings are used to train a Sparse Autoencoder (SAE), a type of neural network that encodes input representations into sparse latent codes and then reconstructs the original input from these codes. SAEs have recently attracted attention in research on interpreting large language models (LLMs), owing to their sparse and interpretable latent space. The SAE latent space is higher-dimensional than the input embedding, and the sparsity constraint keeps only a few dimensions active for any given molecule, so that each dimension represents simplified information. This makes individual dimensions easier to associate with chemically meaningful features and allows for interpretable, targeted manipulation.
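As an illustration of the SAE described above, the following is a minimal sketch and not the implementation used in this work. It assumes a single-layer ReLU encoder, a linear decoder, and an L1 sparsity penalty; the class name TinySAE, the layer sizes, and the penalty coefficient are hypothetical choices made only for this example.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class TinySAE:
    """Hypothetical overcomplete SAE: d_in -> d_latent (> d_in) -> d_in."""

    def __init__(self, d_in, d_latent, seed=0):
        rng = random.Random(seed)
        # Small random weights; a real model would learn these by training.
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(d_in)]
                      for _ in range(d_latent)]
        self.b_enc = [0.0] * d_latent
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(d_latent)]
                      for _ in range(d_in)]
        self.b_dec = [0.0] * d_in

    def encode(self, x):
        # ReLU keeps codes non-negative and zeroes out inactive dimensions.
        pre = matvec(self.W_enc, x)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, z):
        # Linear map from the sparse code back to the embedding space.
        return [r + b for r, b in zip(matvec(self.W_dec, z), self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        # Squared reconstruction error plus an L1 penalty on the code
        # (the L1 penalty is an assumed form of the sparsity constraint).
        z = self.encode(x)
        x_hat = self.decode(z)
        recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        sparsity = sum(abs(v) for v in z)
        return recon + l1_coeff * sparsity


# Toy 4-dimensional "embedding" mapped into a 16-dimensional sparse code.
sae = TinySAE(d_in=4, d_latent=16)
code = sae.encode([1.0, -0.5, 0.3, 0.0])
```

In practice the weights would be fitted by minimizing this loss over the foundation-model embeddings; the sketch only shows the forward pass and the objective.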

To uncover how specific latent dimensions relate to molecular properties of interest—such as lipophilicity, polarity, or the presence of functional groups—we introduce an interpretability layer through statistical modeling techniques. By analyzing the relationship between the sparse latent vectors and predefined molecular features, we identify directions in the latent space that influence these properties. This interpretability allows users to steer the generative process toward molecules exhibiting desired characteristics, without requiring direct supervision from end-to-end property prediction models.
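One simple way to relate latent dimensions to a property is sketched below. The statistical technique is not specified here, so the example assumes a plain Pearson correlation between each latent dimension and a scalar property; the function names and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_latents(latents, prop):
    """Rank latent dimensions by |correlation| with a property.

    latents: one sparse latent vector per molecule.
    prop: one property value per molecule (for example, logP).
    Returns (dimension index, correlation) pairs, strongest first.
    """
    n_dims = len(latents[0])
    scores = [(j, pearson([z[j] for z in latents], prop))
              for j in range(n_dims)]
    return sorted(scores, key=lambda t: abs(t[1]), reverse=True)


# Toy data: dimension 0 tracks the property, dimension 2 is constant.
latents = [[0.0, 1.0, 0.2], [0.5, 0.0, 0.2], [1.0, 0.4, 0.2], [1.5, 0.1, 0.2]]
logp = [1.0, 2.0, 3.0, 4.0]
ranking = rank_latents(latents, logp)
```

Dimensions at the top of the ranking are candidates for steering; the sign of the correlation indicates whether increasing the activation is expected to raise or lower the property.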

Given a seed molecule, we shift its sparse latent vector along selected interpretable directions. The modified latent code is first decoded through the trained SAE back into the embedding space and then passed to the pretrained molecular decoder to produce a new SMILES string. This approach supports directional and quantitative control over multiple molecular features simultaneously (for instance, increasing logP while reducing aromatic ring count) while preserving chemical validity.
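The generation step can be sketched as follows, under stated assumptions. The functions sae_encode, sae_decode, and mol_decode stand in for the trained SAE and the pretrained molecular decoder and are passed in as hypothetical callables; clamping edited activations at zero is also an assumption, made to mirror a non-negative (ReLU) code.

```python
def steer(z, edits, clamp_nonneg=True):
    """Return a copy of sparse code z with deltas added to chosen dimensions.

    edits: dict mapping latent dimension index -> additive delta.
    """
    z_new = list(z)  # copy so the seed code is left unchanged
    for j, delta in edits.items():
        z_new[j] += delta
        if clamp_nonneg:
            # Assumed: codes stay non-negative, as under a ReLU encoder.
            z_new[j] = max(0.0, z_new[j])
    return z_new


def generate(seed_embedding, sae_encode, sae_decode, mol_decode, edits):
    """Encode, edit in the sparse space, then decode in two stages."""
    z = sae_encode(seed_embedding)   # embedding -> sparse code
    z_edit = steer(z, edits)         # move along selected dimensions
    emb = sae_decode(z_edit)         # sparse code -> embedding space
    return mol_decode(emb)           # embedding -> molecule (e.g. SMILES)


# Stub components so the sketch runs without any trained model.
stub_encode = lambda x: [max(0.0, v) for v in x]
stub_decode = lambda z: list(z)
stub_mol_decode = lambda emb: emb  # a real decoder would return a SMILES string

result = generate([1.0, 0.5, 0.0], stub_encode, stub_decode, stub_mol_decode,
                  edits={0: 0.5, 1: -1.0})
```

Editing several dimensions in one call corresponds to controlling several features at once, for example raising a logP-associated dimension while lowering one associated with aromatic ring count.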

We validate our method on the PubChem-ZINC dataset and demonstrate that the generated molecules consistently exhibit the intended changes in targeted properties. Importantly, the generated molecules are chemically valid, synthetically plausible, and structurally consistent with the original chemical space. The proposed framework supports hypothesis-driven molecular design, inverse property optimization, and interpretable latent space exploration in both medicinal and materials chemistry.