Fall 2020

Fellows Talk - Galyna Livshyts and Lin Yang

Sep. 10, 2020, 1:00 pm to 2:00 pm


Galyna Livshyts (Georgia Institute of Technology) & Lin Yang (UCLA)


Zoom link will be sent out to program participants.

Galyna Livshyts (Georgia Institute of Technology)

Title: On the smallest singular value of inhomogeneous random matrices
Abstract: In random matrix theory, many estimates raise questions of universality: how general can the assumptions on the matrix be for a certain phenomenon to take place? I shall discuss tight small-ball estimates for the smallest singular value of a wide class of random matrices, some of which were obtained jointly with Tikhomirov and Vershynin. I will briefly outline the key tool, a discretization procedure.

Lin Yang (UCLA)

Title: Provably Efficient Reinforcement Learning with General Value Function Approximation
Abstract: Value function approximation has demonstrated phenomenal empirical success in reinforcement learning (RL). Nevertheless, despite a handful of recent advances in developing theory for RL with linear function approximation, an understanding of general function approximation schemes is still largely missing. In this talk, we introduce a provably efficient RL algorithm with general value function approximation. We show that if the value functions admit an approximation with a function class F, our algorithm achieves a regret bound of O(poly(dH) sqrt(T)), where d is a complexity measure of F that depends on the eluder dimension [Russo and Van Roy, 2013] and log-covering numbers, H is the planning horizon, and T is the number of interactions with the environment. Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment. Moreover, our algorithm is model-free and provides a framework to justify the effectiveness of algorithms used in practice. This talk will focus on the more technical details of the proof. A more general, introductory talk about the paper is given by my coauthor in the RL theory seminar: