Learning Automata as Building Blocks for MARL

Workshop

Multi-Agent Reinforcement Learning and Bandit Learning

Speaker(s)

Ann Nowe (Vrije Universiteit Brussel)

Location

Calvin Lab Auditorium

Date

Tuesday, May 3, 2022

Time

2 – 2:30 p.m. PT

Abstract

In this talk I will show that Learning Automata (LA), and more precisely Reward-Inaction update schemes, are interesting building blocks for multi-agent RL, both in bandit settings and in stateful RL. Based on the theorem of Narendra and Wheeler, we have convergence guarantees in n-person non-zero-sum games. However, LA have also been shown to be robust in more relaxed settings, such as queueing systems, where updates happen asynchronously and the feedback sent to the agents is delayed.
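For readers unfamiliar with the update scheme, the following is a minimal sketch of a linear reward-inaction automaton in a single-agent bandit setting. It is not taken from the talk: the function names, the learning rate `lr`, and the Bernoulli arms in `run_bandit` are assumptions made for illustration. On a reward the chosen action's probability moves toward 1 and the others shrink proportionally; on a zero reward the probabilities are left unchanged, which is the "inaction" part.

```python
import random


def reward_inaction_update(probs, action, reward, lr=0.1):
    """Linear reward-inaction step (illustrative helper, hypothetical name).

    probs  : current action probabilities (sum to 1)
    action : index of the action that was played
    reward : feedback in [0, 1]; 0 leaves probs unchanged
    lr     : learning rate in (0, 1), an assumed default
    """
    return [
        # Chosen action moves toward 1, all others shrink proportionally.
        p + lr * reward * (1.0 - p) if i == action else p - lr * reward * p
        for i, p in enumerate(probs)
    ]


def run_bandit(success_probs, steps=5000, lr=0.05, seed=0):
    """Run one automaton against Bernoulli arms (assumed toy environment)."""
    rng = random.Random(seed)
    n = len(success_probs)
    probs = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(steps):
        action = rng.choices(range(n), weights=probs)[0]
        reward = 1.0 if rng.random() < success_probs[action] else 0.0
        probs = reward_inaction_update(probs, action, reward, lr)
    return probs


if __name__ == "__main__":
    print(run_bandit([0.2, 0.8]))
```

In a multi-agent setting, each agent would run its own automaton of this kind and see only its own reward; the sketch above covers the single-agent update only.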
