Abstract

Many real-time sequential decision making problems can be modeled as online optimal control problems with a look-ahead window of predictions. In this talk, we focus on online optimal control problems with time-varying convex cost functions and linear time-invariant dynamical system. We consider a finite prediction window of the future cost functions and focus on dynamic regret (aka, competitive difference) analysis. We show how the prediction improves the regret for any given online algorithm without using prediction by conducting additional W step of gradient calculations. We also characterize a fundamental lower bound of dynamic regrets for any online algorithm. The lower bound and the upper bound of the designed online algorithm almost match each other, implying that our algorithm achieves a near-optimal performance with respect to dynamic regrets.