Abstract

We consider the problem of predicting a label for an individual given the information about the person and her position within a network. Such a fusion of the two sources of information naturally arises in a variety of applications, including recommendation systems, ad placement, and personalized medical care. When formalizing this problem, one faces a computationally intractable combinatorial objective. We present an unexpected phenomenon: it is possible to develop poly-time (and statistically near-optimal) online prediction methods even when the offline problem is provably hard. These prediction methods arise in a systematic way from a new relaxation framework that has roots in the work of Cover and Blackwell. The new applications to learning on graphs are based in part on multiplicative approximation algorithms developed by the theoretical computer science community. Our approach naturally extends to the contextual multi-armed bandit setting with large sets of policies -- a notoriously difficult problem, which is often encountered in real-world applications. Joint work with K. Sridharan.

Video Recording