Abstract
We establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Using prior work on the asymptotic mean-squared error of linear stochastic approximation based on Lyapunov equations, we show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.
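The comparison described in the abstract can be illustrated with a minimal tabular sketch: plain Q-learning run with step size `alpha`, alongside Double Q-learning run with step size `2 * alpha`, whose output is the average of its two estimators. Everything concrete here is an assumption made for illustration and is not taken from the talk: the two-state toy environment in `simulate`, the uniformly random behavior policy, the constant step size, and all function names. The talk's analysis concerns linear stochastic approximation, so this tabular version shows only the update rules being compared, not the theoretical setting.

```python
import random


def q_learning_step(Q, s, a, r, s_next, alpha, gamma):
    """One tabular Q-learning update, applied in place to Q."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def double_q_learning_step(QA, QB, s, a, r, s_next, alpha, gamma, rng):
    """One tabular Double Q-learning update, applied in place.

    A fair coin picks which estimator to update. The updated estimator
    selects the greedy next action; the other estimator evaluates it.
    """
    if rng.random() < 0.5:
        updated, other = QA, QB
    else:
        updated, other = QB, QA
    n_actions = len(updated[s_next])
    a_star = max(range(n_actions), key=lambda b: updated[s_next][b])
    target = r + gamma * other[s_next][a_star]
    updated[s][a] += alpha * (target - updated[s][a])


def averaged(QA, QB):
    """Entrywise average of the two Double Q-learning estimators."""
    return [[0.5 * (x + y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(QA, QB)]


def simulate(steps, alpha, gamma, seed):
    """Run both algorithms on the same sample stream (hypothetical toy MDP).

    Assumptions for illustration only: 2 states, 2 actions, uniformly
    random actions and transitions, unit-variance Gaussian reward noise,
    and a constant step size.
    """
    rng = random.Random(seed)
    n_states, n_actions = 2, 2
    Q = [[0.0] * n_actions for _ in range(n_states)]
    QA = [[0.0] * n_actions for _ in range(n_states)]
    QB = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)
        r = (1.0 if a == s else 0.0) + rng.gauss(0.0, 1.0)
        s_next = rng.randrange(n_states)
        q_learning_step(Q, s, a, r, s_next, alpha, gamma)
        # Double Q-learning uses twice the Q-learning step size.
        double_q_learning_step(QA, QB, s, a, r, s_next, 2 * alpha, gamma, rng)
        s = s_next
    # Q-learning estimate, and the averaged Double Q-learning estimate.
    return Q, averaged(QA, QB)
```

Calling `simulate(steps, alpha, gamma, seed)` over many seeds and comparing the squared error of each returned table against a reference solution would be one way to probe the claim empirically. The sketch deliberately reports no numbers, since any outcome depends on the assumed environment and step-size schedule.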