Abstract

This part of the tutorial covers the fundamentals of Markov decision processes, providing a framework for the discussion of reinforcement learning in the next two parts. We will present the Bellman equation and its key properties, and show how these properties give rise to commonly used computational methods such as value iteration and policy iteration.
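
As a rough illustration of how the Bellman equation leads to value iteration, the following is a minimal sketch rather than the tutorial's own code. It assumes a finite MDP given as NumPy arrays; the names `value_iteration`, `P`, `R`, `gamma`, and `tol`, and the two-state example, are hypothetical choices made for this illustration.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeatedly apply the Bellman optimality backup until values settle.

    Assumed layout (not taken from the tutorial):
      P[a, s, t] = probability of moving from state s to state t under action a
      R[s, a]    = expected immediate reward for taking action a in state s
    """
    V = np.zeros(R.shape[0])
    for _ in range(max_iters):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value estimate
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


# Hypothetical two-state, two-action MDP used only for illustration.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch to the other state
])
R = np.array([
    [0.0, 0.0],  # state 0: no reward for either action
    [1.0, 0.0],  # state 1: reward 1 for staying, 0 for switching
])

V, policy = value_iteration(P, R, gamma=0.9)
print(V, policy)
```

In this toy problem the values approach 9 for state 0 and 10 for state 1, and the greedy policy switches out of state 0 and stays in state 1. Policy iteration would instead alternate between evaluating a fixed policy and improving it greedily.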

Attachment

Video Recording