Reinforcement Learning: Hidden Theory and New Super-Fast Algorithms

Parent Program

Real-Time Decision Making

Location

Calvin Lab auditorium

Speaker(s)

Sean Meyn, University of Florida

Date

Friday, Mar. 9, 2018

Time

10 – 11:30 a.m. PT

Description

Stochastic approximation algorithms are used to approximate solutions to fixed-point equations that involve expectations of functions with respect to possibly unknown distributions. The most famous examples today are TD- and Q-learning algorithms. This three-hour tutorial lecture series will consist of two parts:

Part 1. The basics: an overview of stochastic approximation, with a focus on optimizing the rate of convergence. The algorithm that achieves the best rate is analogous to the Newton-Raphson algorithm.
Part 2. Leftovers and review from Part 1, followed by applications to reinforcement learning for discounted-cost optimal control and optimal stopping problems. The theory from Part 1 leads to the new Zap Q-learning algorithm. Analysis suggests that its transient behavior closely matches that of a deterministic Newton-Raphson implementation, and numerical experiments confirm super-fast convergence.
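
To make the Newton-Raphson analogy concrete, here is a minimal Python sketch. It is not taken from the lecture and is not Zap Q-learning itself; it is a toy linear root-finding problem chosen for illustration. The matrix `A`, the target `theta_star`, the noise levels, and the function names are all hypothetical. It compares a scalar-gain stochastic approximation recursion with a matrix-gain variant that uses a running estimate of `A` in place of the Jacobian.

```python
import numpy as np

# Hypothetical toy problem (an assumption, not from the lecture): find
# theta* solving E[b_n - A_n @ theta] = 0 from noisy samples (A_n, b_n).
A = np.array([[2.0, 0.5],
              [0.5, 0.2]])          # eigenvalues are roughly 2.13 and 0.07
theta_star = np.array([1.0, -1.0])
b = A @ theta_star


def sample(rng):
    """Return one noisy observation (A_n, b_n) of (A, b)."""
    A_n = A + 0.01 * rng.standard_normal((2, 2))
    b_n = b + 0.1 * rng.standard_normal(2)
    return A_n, b_n


def scalar_gain_sa(n_iter, seed=0):
    """Stochastic approximation with the scalar step size 1/n."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for n in range(1, n_iter + 1):
        A_n, b_n = sample(rng)
        theta = theta + (1.0 / n) * (b_n - A_n @ theta)
    return theta


def newton_raphson_sa(n_iter, seed=0):
    """Matrix-gain recursion: a running average of A_n plays the role
    of the Jacobian in a Newton-Raphson step."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    A_hat = np.zeros((2, 2))
    for n in range(1, n_iter + 1):
        A_n, b_n = sample(rng)
        A_hat += (A_n - A_hat) / n   # running average of the observed A_n
        theta = theta + (1.0 / n) * np.linalg.solve(A_hat, b_n - A_n @ theta)
    return theta


err_scalar = np.linalg.norm(scalar_gain_sa(20000) - theta_star)
err_newton = np.linalg.norm(newton_raphson_sa(20000) - theta_star)
print(f"scalar-gain error: {err_scalar:.4f}")
print(f"matrix-gain error: {err_newton:.4f}")
```

In this sketch the small eigenvalue of `A` makes the scalar-gain recursion converge slowly along one direction, while the matrix-gain recursion is insensitive to the conditioning of `A`. The gap seen here depends on the assumed problem and noise levels and should not be read as a general benchmark.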

The first session of this mini-course will take place on Wednesday, March 7, 2:30 – 4:00 p.m.; the second session will take place on Friday, March 9, 10:00 – 11:30 a.m. All talks will be recorded.
