Policy Gradients in General-Sum Dynamic Games: When Do They Even Converge?

Workshop

Multi-Agent Reinforcement Learning and Bandit Learning

Speaker(s)

Eric Mazumdar (Caltech)

Location

Calvin Lab Auditorium

Date

Monday, May 2, 2022

Time

2:30 – 3 p.m. PT

Abstract

In this talk I will present work showing that agents using simple policy gradient algorithms have no guarantees of asymptotic convergence, even in arguably the simplest class of continuous action- and state-space multi-agent control problems: general-sum linear quadratic (LQ) games. I will also show that proximal point and extragradient methods do not resolve these issues. I will then focus on zero-sum LQ games, in which stronger convergence guarantees are possible when agents use independent policy gradients with a finite timescale separation.
