Abstract
Most machine learning theory assumes i.i.d. (independent and identically distributed) data. However, the i.i.d. assumption is rarely satisfied in practice. Online learning departs from this assumption by changing the evaluation measure from expected error on future samples to regret with respect to the best prediction strategy in hindsight, chosen from a limited set of strategies. The revised evaluation measure makes it possible to consider even adversarial environments, for example spam filtering or learning in two-player games. The ability to learn in adversarial environments comes at a price, though: the convergence rates of algorithms for adversarial environments are typically much slower than the convergence rates of algorithms for i.i.d. environments. A practitioner thus has to choose between safe algorithms for adversarial environments, which converge slowly, and fast algorithms for i.i.d. environments, which carry the risk of complete failure if the model assumptions are violated. But does the practitioner have to take the risk? Can we design algorithms that adapt to the nature of the environment?
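For concreteness, the regret referred to above is standardly formalized as follows; the notation is supplied here as an assumption, since the abstract itself fixes no symbols. After $T$ rounds, a learner that played actions $a_1, \dots, a_T$ against loss functions $\ell_1, \dots, \ell_T$ incurs regret
\[
  R_T = \sum_{t=1}^{T} \ell_t(a_t) \;-\; \min_{a \in \mathcal{A}} \sum_{t=1}^{T} \ell_t(a),
\]
where $\mathcal{A}$ denotes the limited set of strategies against which the learner is compared. Adversarial guarantees bound $R_T$ for every possible loss sequence, which is what allows this evaluation measure to dispense with the i.i.d. assumption.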