Epigraph Based Multilevel Optimization (EMO) for Enhancing Chain-of-Thought Reasoning Capabilities
Abstract
Chain-of-thought (CoT) reasoning, a key capability of large language models, tackles complex tasks that require multiple intermediate steps. Recent studies have characterized CoT as a composition of in-context filtering and in-context learning. This paper proposes a unified framework for CoT optimization that exploits this nested problem structure, formulating training as a multilevel optimization problem in which each intermediate reasoning step constitutes a distinct optimization level. We develop an epigraph-based multilevel optimization (EMO) method that iteratively finds the optimal solution for this class of problems. Experiments with GPT-2 show that the proposed EMO achieves the lowest generalization error across all intermediate steps compared with state-of-the-art methods, highlighting the importance of nested optimization approaches for CoT reasoning.
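As background, the epigraph technique rewrites an objective as a constraint on an auxiliary bound variable. A minimal illustrative sketch (not the paper's exact formulation) for a two-level nested problem, with upper-level loss $F$ and lower-level loss $g$, is:

```latex
\begin{align*}
\min_{x,\, t} \quad & t \\
\text{s.t.} \quad & F\big(x, y^{*}(x)\big) \le t, \\
& y^{*}(x) \in \arg\min_{y}\; g(x, y),
\end{align*}
```

where the scalar $t$ bounds the upper-level objective from above (the epigraph variable), and the lower-level solution $y^{*}(x)$ plays the role of an inner reasoning step; deeper CoT chains would add one such nested level, and one bound variable, per intermediate step.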