Description

Title: Bandit Algorithms for Prophet Inequality and Pandora's Box

Abstract: The Prophet Inequality and Pandora's Box problems are fundamental stochastic problems. A usual assumption for both problems is that the probability distributions of the underlying random variables are given as input to the algorithm. In this talk, we assume the distributions are unknown, and study them in the  Multi-Armed Bandits model: We interact with the unknown distributions over multiple rounds. In each round we play a policy and receive only bandit feedback. The goal is to minimize the regret, which is the difference in the total value of the optimal algorithm that knows the  distributions vs. the total value of our algorithm that learns the distributions from the bandit feedback. Our main results give near-optimal total regret algorithms for both Prophet Inequality and Pandora's Box. This is joint work with Khashayar Gatmiry, Thomas Kesselheim, and Sahil Singla.

All scheduled dates:

Upcoming

No Upcoming activities yet