Abstract

We consider the infinite-horizon restless bandit problem with the average-reward criterion. A fundamental question is how to design computationally efficient policies whose optimality gap diminishes as the number of arms, N, grows large. Existing results on asymptotic optimality all rely on the uniform global attractor property (UGAP), a complex assumption that is difficult to verify. In this talk, we discuss how this assumption can be removed. We present several new algorithms that offer further insight into how asymptotic optimality can be achieved.