ReBeL: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Workshop

Adversarial Approaches in Machine Learning

Speaker(s)

Noam Brown (Facebook AI Research)

Location

Calvin Lab Auditorium

Date

Friday, Feb. 25, 2022

Time

11:30 a.m. – 12:15 p.m. PT

Abstract

The combination of deep reinforcement learning and search has led to a number of high-profile successes in perfect-information games like chess and Go, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games like poker. In contrast, ReBeL is a general framework for self-play reinforcement learning and search that provably solves any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
