Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization

Yihang Yao; Zhepeng Cen; Haohong Lin; Zuxin Liu; Jiacheng Zhu; Laixi Shi; Zhang-Wei Hong; Ding Zhao

ICML 2026

Conference paper

06 Jul 2026


Abstract

Proactive large language model (LLM) agents aim to actively plan, query, and interact over multiple turns, enabling efficient task completion beyond passive instruction following and making them essential for real-world, user-centric applications. Agentic reinforcement learning (RL) has recently emerged as a promising solution for training such agents in multi-turn settings, allowing interaction strategies to be learned from feedback. However, existing pipelines face a critical challenge in balancing task performance with user engagement: passive agents cannot efficiently adapt to users’ intentions, while overuse of human feedback reduces user satisfaction. To address this trade-off, we propose BAO, an agentic RL framework that combines behavior enhancement, which enriches proactive reasoning and information-gathering capabilities, with behavior regularization, which suppresses inefficient or redundant interactions and aligns agent behavior with user expectations. We evaluate BAO on multiple tasks from the UserRL benchmark suite and demonstrate that it substantially outperforms proactive agentic RL baselines while achieving performance comparable or even superior to that of commercial LLM agents, highlighting its effectiveness for training proactive, user-aligned LLM agents in complex multi-turn scenarios. Our website: https://proactive-agentic-rl.github.io/.

Workshop