In recent years there has been significant interest in establishing fast convergence rates for learning in games: in particular, when each agent in a general-sum game plays according to a no-regret learning algorithm, their regret can often be bounded by a quantity smaller than the sqrt(T) worst-case guarantee. I will discuss our result that in finite normal-form games, when each agent uses Optimistic Hedge to choose their strategies, their regret is bounded by poly(log T). This improves upon previous works of Syrgkanis et al. (2015), who showed a bound of T^{1/4}, and Chen & Peng (2020), who improved it to T^{1/6}. A corollary of our result is that Optimistic Hedge converges to coarse correlated equilibrium at a rate of poly(log T)/T.

I will proceed to discuss a few subsequent works. First, we have shown that the poly(log T) bound can be extended to hold for internal (and swap) regret, thus establishing convergence to correlated equilibrium at a rate of poly(log T)/T. Second, while the preceding regret bounds apply only to finite games and depend on the size of the game, we are able to show a fast rate of T^{1/4} for infinite games, where the dependence is instead on the Littlestone dimension of the game.

Based on joint works with Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, and Tuomas Sandholm.
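To make the algorithm concrete, here is a minimal numerical sketch of the Optimistic Hedge update for a single agent. It is not the authors' implementation; the function name, the loss-oracle interface, and the step size `eta` are illustrative assumptions. The only difference from vanilla Hedge is that the most recent loss vector is counted twice in the exponent, serving as an optimistic prediction of the next round's loss.

```python
import numpy as np

def optimistic_hedge(loss_oracle, n_actions, T, eta=0.1):
    """Run Optimistic Hedge for T rounds and return the external regret.

    loss_oracle(t) returns the round-t loss vector in [0, 1]^n_actions
    (e.g., induced by the other players' strategies in a game).
    """
    cum_loss = np.zeros(n_actions)   # sum of all past loss vectors
    prev_loss = np.zeros(n_actions)  # most recent loss vector (zero at t = 0)
    total_loss = 0.0
    for t in range(T):
        # Optimistic step: weight by cumulative loss PLUS the last loss again,
        # predicting that the upcoming loss resembles the previous one.
        logits = -eta * (cum_loss + prev_loss)
        logits -= logits.max()       # shift for numerical stability
        x = np.exp(logits)
        x /= x.sum()                 # mixed strategy over actions
        loss = loss_oracle(t)        # loss vector revealed after committing to x
        total_loss += float(x @ loss)
        cum_loss += loss
        prev_loss = loss
    # External regret: algorithm's loss minus the best fixed action in hindsight
    return total_loss - float(cum_loss.min())
```

For example, against the fixed loss vector (1, 0, 0) the strategy quickly shifts its mass onto the zero-loss actions, and the regret stays bounded as T grows. When every player in a finite game runs this update in self-play, the time-averaged joint play forms the approximate coarse correlated equilibrium referred to above.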