Abstract

Modern reinforcement learning algorithms use function approximation to handle large state spaces. In this talk, I will present both positive and negative results on provably efficient RL with function approximation. In the first part of the talk, I will present a provably efficient algorithm for environments with a latent state structure. In the second part, I will turn to negative results for model-free RL, presenting exponential lower bounds for value-based and policy-based learning with function approximation.