Abstract

In the secretary problem, a set of items are revealed to us in random order, and we want to maximize the probability of picking the best among them. In the (stochastic) multi-armed bandit problem, we perform pulls from a set of arms, each with a fixed but unknown payoff probability, and want to minimize the regret. Both these problems are well-studied in the sequential decision-making literature.

We consider a robust setting where an adversary is allowed to introduce some limited amount of corruptions. We show that classical algorithms fail badly in the presence of corruptions, and then we give algorithms that are more robust to corruptions.

This is based on joint works with Domagoj Bradac, Sahil Singla, and Goran Zuzic, with C.J. Argue and Marco Molinaro, and with Tomer Koren and Kunal Talwar.

Video Recording