Competing in the dark: An efficient algorithm for bandit linear optimization

Jacob Abernethy; Elad Hazan; Alexander Rakhlin

COLT 2008

Conference paper

01 Dec 2008

Abstract

We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural generalization of the non-stochastic multi-armed bandit problem, and the existence of an efficient optimal algorithm has been posed as an open problem in a number of recent papers. We show how the difficulties encountered by previous approaches are overcome by the use of a self-concordant potential function. Our approach presents a novel connection between online learning and interior point methods.
