Representation Learning and Exploration in Reinforcement Learning

Workshop

Mathematics of Online Decision Making

Speaker(s)

Akshay Krishnamurthy (Microsoft Research)

Location

Date

Friday, Oct. 30, 2020

Time

9 – 9:30 a.m. PT

Abstract

I will discuss new provably efficient algorithms for reinforcement learning in rich-observation environments with arbitrarily large state spaces. Both algorithms operate by learning succinct representations of the environment, which they use in an exploration module to acquire new information. The first algorithm, called Homer, operates in a block MDP model and uses a contrastive learning objective to learn the representation. The second algorithm, called FLAMBE, operates in a much richer class of low-rank MDPs and is model-based. Both algorithms accommodate nonlinear function approximation and enjoy provable sample and computational efficiency guarantees.

Attachment

Slides

Attachment

Video Recording