Abstract

A key challenge in deep reinforcement learning is collecting data in the loop of learning. Since most algorithms learn from scratch, they require a large number of samples gathered through trial and error, especially in problems that demand sophisticated exploration. Meta-reinforcement learning methods aim to address this challenge by leveraging prior experience with related tasks, explicitly optimizing for transferable exploration strategies and efficient learning rules. In this talk, I'll give an overview of how these algorithms learn exploration strategies and identify a critical issue with the predominant approach: a chicken-and-egg optimization problem. I'll then discuss how we can overcome this challenge in a way that is consistent with the original optimization but substantially more efficient. Finally, I'll discuss some open directions within the realm of meta-RL.
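
To make the chicken-and-egg problem concrete, below is a minimal toy sketch (my own illustration, not code from the talk) of end-to-end meta-RL in the style of RL^2: an exploration policy and an exploitation policy are both trained from a single end-of-trial return, here on a hypothetical family of two-armed bandit tasks. The task family, the `trial` function, and the REINFORCE updates are all assumptions chosen for clarity, not the speaker's method.

```python
# Toy sketch of the chicken-and-egg coupling in end-to-end meta-RL.
# The task family and parameterization are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def sample_task():
    """A task is a two-armed bandit: one arm pays +1, the other 0."""
    return int(rng.integers(2))  # index of the rewarding arm

def trial(theta_exp, theta_task):
    good = sample_task()
    # Exploration episode: either probe (which reveals the good arm) or skip.
    p_probe = sigmoid(theta_exp)
    probe = int(rng.random() < p_probe)
    obs = good if probe else 2  # 2 encodes "no information"
    # Exploitation episode: pull an arm conditioned on the exploration outcome.
    p_pull1 = sigmoid(theta_task[obs])
    pull = int(rng.random() < p_pull1)
    ret = float(pull == good)  # end-of-trial return: did we pull the good arm?
    return probe, p_probe, obs, pull, p_pull1, ret

theta_exp, theta_task, lr = 0.0, np.zeros(3), 0.2
for _ in range(30000):
    probe, p_probe, obs, pull, p_pull1, ret = trial(theta_exp, theta_task)
    adv = ret - 0.5  # constant baseline for variance reduction
    # REINFORCE: both policies are trained from the SAME end-of-trial return.
    # Exploration only gets signal once exploitation uses its data, and
    # exploitation only has data worth using once exploration probes.
    theta_exp += lr * adv * (probe - p_probe)
    theta_task[obs] += lr * adv * (pull - p_pull1)

print(f"P(probe)                  = {sigmoid(theta_exp):.2f}")      # expect near 1.0
print(f"P(pull arm 1 | saw arm 1) = {sigmoid(theta_task[1]):.2f}")  # expect near 1.0
print(f"P(pull arm 0 | saw arm 0) = {1 - sigmoid(theta_task[0]):.2f}")  # expect near 1.0
```

The coupling shows up in the updates: early in training the exploitation parameters are random, so the end-of-trial return is near chance whether or not the agent probes, and the exploration gradient carries almost no signal; conversely, the exploitation policy only has informative data to condition on once exploration probes reliably.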
