Abstract

We consider the "Best Set" combinatorial pure exploration problem. In this problem, we are given n arms with unknown reward distributions, as well as a family F of feasible subsets over the arms. Our goal is to identify the feasible subset in F with the maximum total mean using as few samples as possible. The problem generalizes the classical top-k arm identification problems.

We discuss an instance-wise lower bound for the sample complexity of the problem, as well as a sampling algorithm that matches the lower bound up to a factor of ln|F| (which we show is existentially tight). We also discuss poltnomial-time implementations of these sampling algorithms, and other issues.

Joint work with Lijie Chen, Jian Li, Mingda Qiao, and Ruosong Wang.

Video Recording