Theory of Reinforcement Learning

About

This program aims to advance the theoretical foundations of reinforcement learning (RL) and foster new collaborations between researchers across RL and computer science.

Recent years have seen a surge of interest in reinforcement learning, fueled by exciting new applications of RL techniques to various problems in artificial intelligence, robotics, and the natural sciences. Many of these advances were made possible by a combination of large-scale computation, innovative use of flexible neural network architectures and training methods, and new and classical RL algorithms. However, we lack a solid understanding of when, why, and to what extent these algorithms work.

Reinforcement learning's core issues, such as the efficiency of exploration and the trade-off between the scale and the difficulty of learning and planning, have received concerted study over the last few decades within many disciplines and communities, including computer science, numerical analysis, artificial intelligence, control theory, operations research, and statistics. The result is a solid body of work that has formulated and resolved some of the core problems; yet the most pressing problems, concerning how to design highly scalable algorithms, remain open.

This program aims to reunite researchers across disciplines that have played a role in developing the theory of reinforcement learning. It will review past developments and identify promising directions of research, with an emphasis on addressing existing open problems, ranging from the design of efficient, scalable algorithms for exploration to how to control learning and planning. It also aims to deepen the understanding of model-free vs. model-based learning and control, and the design of efficient methods to exploit structure and adapt to easier environments.

Organizers

Csaba Szepesvari

(University of Alberta; chair)

Emma Brunskill

(Stanford University)

Sebastien Bubeck

(OpenAI)

Alan Malek

(DeepMind)

Sean Meyn

(University of Florida)

Ambuj Tewari

(University of Michigan)

Mengdi Wang

(Princeton University)

Long-Term Participants (including Organizers)

Csaba Szepesvari

(University of Alberta; chair)

Emma Brunskill

(Stanford University)

Sebastien Bubeck

(OpenAI)

Sean Meyn

(University of Florida)

Ambuj Tewari

(University of Michigan)

Mengdi Wang

(Princeton University)

Yasin Abbasi-Yadkori

(DeepMind)

Pieter Abbeel

(UC Berkeley)

Rediet Abebe

(UC Berkeley)

David Abel

(DeepMind)

Raman Arora

(Johns Hopkins University)

Yu Bai

(Salesforce Research)

Peter Bartlett

(Simons Institute, UC Berkeley)

Vivek Shripad Borkar

(Indian Institute of Technology Bombay)

Ciara Pike-Burke

(Imperial College London)

Ana Bušić

(INRIA and École Normale Supérieure Paris)

Marco Campi

(University of Brescia)

Rene Carmona

(Princeton University)

(UC Berkeley)

(JP Morgan)

(Google Research)

(University of California, Los Angeles)

Anupam Gupta

(Carnegie Mellon University)

(DeepMind)

(UC Berkeley)

(ETH Zürich)

(University of Southern California)

Nan Jiang

(University of Illinois at Urbana-Champaign)

Chi Jin

(Princeton University)

Michael Jordan

(UC Berkeley)

Mihailo Jovanovic

(University of Southern California)

(Yale University)

(CWI Amsterdam)

(Microsoft Research)

(Princeton University)

Sergey Levine

(UC Berkeley)

Lihong Li

(Google Brain)

Tengyu Ma

(Facebook AI Research)

Siva Theja Maguluri

(Georgia Institute of Technology)

Eric Moulines

(Ecole Polytechnique)

Seffi Naor

(Technion Israel Institute of Technology)

Angelia Nedich

(Arizona State University)

Gergely Neu

(Universitat Pompeu Fabra)

Erol Peköz

(Boston University)

Marek Petrik

(University of New Hampshire)

Benjamin Recht

(UC Berkeley)

Daniel Russo

(Columbia University)

(UC San Diego)

(UC Berkeley)

(INRIA)

(University of Alberta)

Aaron Sidford

(Stanford University)

Matus Telgarsky

(New York University)

Claire Tomlin

(UC Berkeley)

Mathukumalli Vidyasagar

(IIT Hyderabad)

Stefan Wager

(Stanford Graduate School of Business)

Martin Wainwright

(MIT)

Huizhen Yu

(University of Alberta)

Research Fellows

Jalaj Bhandari

(Columbia University)

Lin Chen

(Yale University)

Vidya Muthukumar

(Georgia Institute of Technology)

Mohamad Kazem Shirani Faradonbeh

(University of Florida)

Zhaoran Wang

(Northwestern University)

Lin Yang

(UCLA; Facebook/Novi Research Fellow)

Zhuoran Yang

(Princeton University; VMware Research Fellow)

Christina Yu

(Cornell University)

Visiting Graduate Students and Postdocs

Kumar Krishna Agrawal

(UC Berkeley)

Philip Amortila

(UC Berkeley)

Gabor Balazs

Kush Bhatia

(UC Berkeley)

Shantanu Prasad Burnwal

(IIT Hyderabad)

Michael Chang

(UC Berkeley)

Niladri Chatterji

(UC Berkeley)

Jinglin Chen

(University of Illinois at Urbana-Champaign)

Zixiang Chen

(UCLA)

Daniela Cialfi

(University of Chieti-Pescara)

Xiaowu Dai

(UC Berkeley)

Gokce Dayanikli

(Princeton University)

Dylan Foster

(Microsoft Research)

Germano Gabbianelli

(Universitat Pompeu Fabra)

Botao Hao

(Princeton University)

Jiafan He

(UCLA)

Haque Ishfaq

(McGill University)

Yujia Jin

(Stanford University)

Pritish Kamath

(Toyota Technological Institute at Chicago)

Seri Khoury

(UC Berkeley)

Michael Kim

(UC Berkeley)

Michael Konobeev

(University of Alberta)

Kshitij Kulkarni

(UC Berkeley)

Gene Li

(Toyota Technological Institute at Chicago)

Junchi Li

(UC Berkeley)

Yao Liu

(Stanford University)

Jincheng Mei

(University of Alberta)

Aditya Modi

(University of Michigan, Ann Arbor)

Aldo Pacchiano

(UC Berkeley)

Juan Perdomo

(UC Berkeley)

Sudeep Raja

(Columbia University)

Sergey Samsonov

(National Research University Higher School of Economics)

Roshan Shariff

(University of Alberta)

Sean Sinclair

(Cornell University)

Sharan Vaswani

(MILA)

Ruosong Wang

(Carnegie Mellon University)

Chen-Yu Wei

(University of Southern California)

Gellert Weisz

(DeepMind)

Tengyang Xie

(University of Illinois at Urbana-Champaign)

Andrea Zanette

(Simons Institute, UC Berkeley)

Kaiqing Zhang

(Massachusetts Institute of Technology)

Dongruo Zhou

(UCLA)