Amadou Ba, Fearghal O'Donncha, et al.
INFORMS 2023
There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) as heuristic-based methods do not guarantee optimality and find it challenging to incorporate practical operational constraints. Existing MIP methods rely on an arc-based binary-tree formulations that are limited to datasets having a few thousand samples, sample-level constraints, and linear metrics. We propose a novel path-based MIP model that can be solved using a scalable column generation framework to produce a multiway-split tree that is more interpretable due to its shorter rules. Our method can optimize nonlinear metrics like F1 score and satisfy a broader class of constraints. We demonstrate its efficacy with extensive experiments including a million-sample dataset and report up to a 24X reduction in runtime compared to the state-of-art MIP-based methods.
Amadou Ba, Fearghal O'Donncha, et al.
INFORMS 2023
Gosia Lazuka, Andreea Simona Anghel, et al.
SC 2024
Yidi Wu, Thomas Bohnstingl, et al.
ICML 2025
Ben Fei, Jinbai Liu
IEEE Transactions on Neural Networks