Because of the uncertainty caused by COVID-19, it is still unclear whether this workshop will take place in person or online only. Even an in-person version will have significantly reduced capacity; in-person attendance is expected to be limited to long-term program participants. In any case, the workshop will be open to the public for online participation. Please register to receive the Zoom webinar access details. This page will be updated as soon as we have more information.
Reinforcement learning is but one way to model interactions with a dynamic environment. Indeed, online algorithms have a long history in the theoretical computer science community, and many of their concepts (such as regret minimization and competitive ratios) have produced very successful algorithms and design principles (such as exponential weights and choosing actions optimistically). At a high level, this workshop aims to engage the theoretical computer science community by asking which classical online learning tools can be successfully applied to reinforcement learning problems, particularly the problem of exploration. Of particular interest are tools for designing and analyzing algorithms that are robust to non-stochastic and adversarial data. The bandit problem is well understood in both the stochastic and non-stochastic cases, but what is the right approach to "robustify" exploration in reinforcement learning? One can generalize bandit algorithms to reinforcement learning by considering exploration over a parametric class of Markov decision processes; however, this class generalizes linear bandits, which are themselves not fully solved: finite-time, structure-dependent minimax algorithms are unknown. Finally, the "elephant in the room" is to develop flexible methods that scale to a large class of environments and are not sensitive to the so-called realizability assumption.
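As a concrete instance of the exponential-weights principle mentioned above, the following is a minimal sketch of the Hedge algorithm in the full-information setting, where the learner observes the entire loss vector each round. The function name and the fixed step size `eta` are illustrative choices, not taken from this page; the update itself is the standard multiplicative-weights rule that underlies many regret-minimization results.

```python
import numpy as np

def exponential_weights(loss_matrix, eta=0.5):
    """Hedge over K actions for T rounds.

    loss_matrix: (T, K) array of per-round losses in [0, 1],
    possibly chosen adversarially. Returns the algorithm's
    cumulative expected loss and the loss of the best fixed action.
    """
    T, K = loss_matrix.shape
    log_w = np.zeros(K)                 # log-weights, for numerical stability
    alg_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                    # play a random action drawn from p
        alg_loss += p @ loss_matrix[t]  # expected loss this round
        log_w -= eta * loss_matrix[t]   # multiplicative-weights update
    best_loss = loss_matrix.sum(axis=0).min()
    return alg_loss, best_loss
```

Against any loss sequence, this strategy guarantees regret at most log(K)/eta + eta*T/8, which is the kind of worst-case guarantee — no stochastic assumption on the data — that the workshop asks how to transfer to exploration in reinforcement learning.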
Further details about this workshop will be posted in due course. Enquiries may be sent to the organizers at workshop-rl2 [at] lists.simons.berkeley.edu.
Naman Agarwal (Princeton University), Pierre Alquier (RIKEN / The University of Tokyo), Hamsa Bastani (University of Pennsylvania), Simina Brânzei (Purdue University), Shuchi Chawla (University of Wisconsin, Madison), David Goldberg (Cornell University), Negin Golrezaei (MIT), Steffen Grünewälder (Lancaster University), Anupam Gupta (Carnegie Mellon University), Sham Kakade (University of Washington), Emilie Kaufmann (Inria), Wouter Koolen (Centrum Wiskunde & Informatica), Akshay Krishnamurthy (Microsoft Research), Lihong Li (Google Brain), Michael Littman (Brown University), Yishay Mansour (Tel Aviv University and Google Research), Vianney Perchet (Université Paris Diderot - Paris 7), Ciara Pike-Burke (Universitat Pompeu Fabra), Yuval Rabani (The Hebrew University of Jerusalem), Lillian Ratliff (University of Washington), Benjamin Recht (UC Berkeley), Daniel Russo (Columbia University), Alex Slivkins (Microsoft Research New York), Éva Tardos (Cornell University), Mengdi Wang (Princeton University), Yisong Yue (Caltech)