The Mean-Squared Error of Double Q-Learning

Workshop

Reinforcement Learning from Batch Data and Simulation

Speaker(s)

R. Srikant (University of Illinois at Urbana-Champaign)

Location

Date

Wednesday, Dec. 2, 2020

Time

10 – 10:30 a.m. PT

Abstract

We establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Using prior work on the asymptotic mean-squared error of linear stochastic approximation based on Lyapunov equations, we show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.
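
The two update rules compared in the abstract can be sketched as follows. This is a minimal tabular illustration and not the speaker's code: it assumes the standard textbook forms of Q-learning and Double Q-learning, and the function names, the list-of-lists table layout, and the step sizes are hypothetical choices made for the example. The talk's result is about asymptotic mean-squared error in a linear stochastic approximation setting, which this sketch does not reproduce. It only shows the configuration the abstract describes: Double Q-learning run with twice the learning rate and reporting the average of its two estimators.

```python
import random

def q_learning_step(Q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning update on Q (list of lists), in place."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def double_q_learning_step(QA, QB, s, a, r, s_next, alpha, gamma, rng):
    """One tabular Double Q-learning update, in place.

    A fair coin picks which table to update. The updated table selects
    the greedy next action; the other table evaluates it.
    """
    if rng.random() < 0.5:
        updated, other = QA, QB
    else:
        updated, other = QB, QA
    n_actions = len(updated[s_next])
    a_star = max(range(n_actions), key=lambda j: updated[s_next][j])
    target = r + gamma * other[s_next][a_star]
    updated[s][a] += alpha * (target - updated[s][a])

def averaged_estimate(QA, QB):
    """Output of Double Q-learning as described in the abstract:
    the elementwise average of the two estimators."""
    return [[0.5 * (x + y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(QA, QB)]

if __name__ == "__main__":
    # Hypothetical two-state, two-action example with a single transition.
    alpha, gamma = 0.1, 0.9          # assumed values, for illustration only
    rng = random.Random(0)

    Q = [[0.0, 0.0], [0.0, 0.0]]
    q_learning_step(Q, s=0, a=1, r=1.0, s_next=1, alpha=alpha, gamma=gamma)

    QA = [[0.0, 0.0], [0.0, 0.0]]
    QB = [[0.0, 0.0], [0.0, 0.0]]
    # Double Q-learning uses twice the learning rate of Q-learning.
    double_q_learning_step(QA, QB, s=0, a=1, r=1.0, s_next=1,
                           alpha=2 * alpha, gamma=gamma, rng=rng)

    print(Q[0][1], averaged_estimate(QA, QB)[0][1])
```

Each Double Q-learning step updates only one of the two tables, so doubling the step size and averaging the tables is the natural pairing to compare against a single Q-learning table. The sketch makes that pairing concrete without claiming anything about the resulting error.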

Attachment

Slides

Video Recording