Abstract

A key question is how to make better use of data, whether generated by RL agents or by humans, to learn better control policies. We would like to combine the power of highly expressive function approximators, such as deep neural networks, with rigorous statistical guarantees on data efficiency and on the performance of the resulting policies. In this talk I will outline some of our recent work in this space and potential future directions.

Video Recording