Abstract

I will give an overview of my previous work, which centers on algorithm design for online learning. The goal is to achieve regret that adapts to the problem instance without prior knowledge of that instance. I will give examples from the full-information expert problem, bandits, and Markov decision processes, and conclude with some future directions I am interested in.

Video Recording