Dong Ki Kim, Miao Liu, et al.
ICML 2021
Learning skills for an agent from long-horizon, unannotated demonstrations has been a long-standing challenge. Existing approaches such as Hierarchical Imitation Learning (HIL) are prone to compounding errors or immature solutions. In this paper, we propose Option-GAIL, a novel method for learning skills over a long horizon. The key idea of Option-GAIL is to model the task hierarchy with options and to train the policy via generative adversarial optimization. In particular, we propose an Expectation-Maximization (EM)-style algorithm: an E-step that samples the options of the expert conditioned on the currently learned policy, and an M-step that updates the low- and high-level policies of the agent simultaneously to minimize the newly proposed option-occupancy measurement between the expert and the agent. We theoretically prove the convergence of the proposed algorithm. Experiments show that our Option-GAIL consistently outperforms its counterparts across a variety of tasks.
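The EM alternation described in the abstract can be illustrated with a toy sketch. This is not the paper's actual algorithm: the policies below are simple probability tables, the E-step uses greedy per-step option decoding, and the M-step is a plain count-based update standing in for the adversarial option-occupancy matching; all names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_OPTIONS, N_STATES, N_ACTIONS = 2, 4, 3

def normalize(x, axis=-1):
    return x / x.sum(axis=axis, keepdims=True)

# Hypothetical tabular policies:
#   pi_hi[s, o_prev, o]  — high-level policy over the next option
#   pi_lo[o, s, a]       — low-level policy over actions given the option
pi_hi = normalize(rng.random((N_STATES, N_OPTIONS, N_OPTIONS)))
pi_lo = normalize(rng.random((N_OPTIONS, N_STATES, N_ACTIONS)))

def e_step(states, actions):
    """E-step: infer the expert's options conditioned on the
    current policies (greedy per-step decoding for simplicity)."""
    options, o_prev = [], 0  # assumed initial option
    for s, a in zip(states, actions):
        scores = pi_hi[s, o_prev] * pi_lo[:, s, a]
        o = int(np.argmax(scores))
        options.append(o)
        o_prev = o
    return options

def m_step(states, actions, options, lr=0.5):
    """M-step: move both policies toward the option-annotated expert
    data (a crude stand-in for the adversarial occupancy update)."""
    global pi_hi, pi_lo
    o_prev = 0
    for s, a, o in zip(states, actions, options):
        pi_hi[s, o_prev, o] += lr
        pi_lo[o, s, a] += lr
        o_prev = o
    pi_hi = normalize(pi_hi)
    pi_lo = normalize(pi_lo)

# Toy "expert" trajectory of (state, action) pairs.
states = [0, 1, 2, 3, 0, 1]
actions = [1, 1, 2, 2, 1, 1]

for _ in range(10):  # EM alternation
    opts = e_step(states, actions)
    m_step(states, actions, opts)
```

After a few iterations, the low-level policy concentrates on the expert's actions under the decoded options, mirroring how the E-step's option labels let the M-step update both policy levels simultaneously.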
Vanya BK, Balaji Ganesan, et al.
ICML 2021
Vasileios Kalantzis, Georgios Kollias, et al.
ICML 2021
Horst Samulowitz, Parikshit Ram, et al.
ICML 2021