A Simple Condition for Constant Regret in Online Decision-Making

Abstract

We consider a finite time-horizon T over which a decision-maker observes samples arriving independently from a known distribution. In each iteration, an irreversible decision needs to be made; examples of such problems include online versions of bin packing, knapsack, and revenue management, as well as many other problems of practical interest. In all of these settings the goal is to find a solution that has small loss relative to that of an offline-optimal solution, i.e., one in which all samples are known a priori. We provide a simple, yet general, condition under which the Bayesian prophet framework, introduced by Vera & Banerjee (2018), is guaranteed to yield constant regret, and show how it applies to different problems.

Joint work with Sid Banerjee (Cornell).

A Simple Condition for Constant Regret in Online Decision-Making

Abstract

Video Recording