Abstract

It is well known that Q-learning can easily become unstable when used with nonlinear function approximation. Existing work alleviates these instabilities by using double Q-functions or by simply taking the minimum of two Q-functions. However, such stabilization also discards useful learning signal. In this work we investigate Q-learning with weighted Bellman losses, where the weights reflect uncertainty estimates of the target Q-values. Our experiments with SAC and Rainbow DQN show stable and faster learning. Our approach can also be easily augmented with UCB exploration to further speed up learning.
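As a rough illustration of the idea, the sketch below computes a Bellman loss whose per-transition weight shrinks when an ensemble of target Q-networks disagrees about the target value. The function name `weighted_bellman_loss`, the use of an ensemble as the uncertainty estimate, the sigmoid-based weighting, and the `temperature` parameter are assumptions for illustration only, not the paper's exact formulation.

```python
# A minimal sketch (assumed, not the exact method) of an uncertainty-weighted
# Bellman loss: the weight of each transition is scaled down when an ensemble
# of target Q-networks disagrees about the target value.
import torch
import torch.nn.functional as F


def weighted_bellman_loss(q_net, target_ensemble, batch, gamma=0.99, temperature=10.0):
    """Bellman loss weighted by target-Q uncertainty (illustrative only).

    `target_ensemble` is assumed to be a list of target Q-networks whose
    disagreement (standard deviation) serves as the uncertainty estimate.
    """
    obs, act, rew, next_obs, done = batch  # tensors; shapes assumed (B, ...) / (B,)

    with torch.no_grad():
        # Each target net evaluates the next state; max over actions (DQN-style).
        next_qs = torch.stack([net(next_obs).max(dim=1).values
                               for net in target_ensemble])      # (E, B)
        target_mean = next_qs.mean(dim=0)                        # (B,)
        target_std = next_qs.std(dim=0)                          # (B,)
        target = rew + gamma * (1.0 - done) * target_mean
        # Higher uncertainty -> smaller weight; one plausible choice of mapping.
        weight = torch.sigmoid(-target_std * temperature) + 0.5  # in (0.5, 1.5)

    q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    # Per-sample squared error scaled by the uncertainty-based weight.
    return (weight * F.mse_loss(q, target, reduction="none")).mean()
```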
