Pure Exploration Problems

Workshop

Mathematics of Online Decision Making

Speaker(s)

Wouter Koolen (Centrum Wiskunde & Informatica)

Date

Monday, Oct. 26, 2020

Time

11:30 a.m. – 12 p.m. PT

Abstract

In this talk, we will look at Pure Exploration tasks in the Multi-Armed Bandit setting. We will review the basic Best Arm Identification problem and present the Game Tree Search problem. We will start from lower bounds, which motivate the Track-and-Stop family of asymptotically instance-optimal algorithms. We will then look at structured bandit settings and problems with multiple correct answers, and build efficient algorithms using saddle point solvers. Finally, we will return to the Game Tree Search problem and discuss its connections with reinforcement learning.

Attachment

Slides

Video Recording