Paper

Conditional Monge Gap enables generalizable single-cell perturbation modelling

Abstract

Learning the response of single cells to various treatments offers great potential to enable targeted therapies. In this context, neural optimal transport has emerged as a principled methodological framework because it inherently accommodates the challenges of unpaired data induced by cell destruction during data acquisition. However, most existing optimal transport approaches are incapable of conditioning on different treatment contexts (for example, time, drug treatment, drug dose or cell type), and we still lack methods that consistently generalize to unseen treatments. Here we propose the Conditional Monge Gap (CMonge), which learns optimal transport maps conditionally on arbitrary covariates. We demonstrate its value in predicting single-cell perturbation responses conditional on one or more drugs, drug dose or combinations thereof. We find that our conditional models achieve results comparable with, and sometimes even superior to, the condition-specific state of the art on single-cell RNA sequencing as well as multiplexed protein imaging data. Notably, by scaling to hundreds of conditions and training on hundreds of millions of drugs, we enable cross-task learning and unlock generalizability to unseen drugs. Our method widely outperforms other conditional models in capturing heterogeneity in cell populations. In short, CMonge is mathematically grounded, highly parameter-efficient relative to single-cell foundation models and yields accurate predictions for unseen drugs using only the compound structure. Thus, it opens a practical route for accelerating drug discovery and repurposing.