Abstract

Hypothesis selection is a fundamental problem in learning theory and statistics. Given a dataset and a finite set of candidate distributions, the goal is to select a distribution that matches the data as well as possible. More specifically, suppose we have sample access to an unknown distribution $P$ over a domain $\mathcal{X}$. We assume that $P$ is well-approximated by one of $n$ distributions in a class (called hypotheses), denoted $\mathcal{H} \coloneqq \{H_1, H_2, \ldots, H_n\}$. The goal is to design an algorithm that outputs a distribution $\hat{H} \in \mathcal{H}$ whose total variation distance from $P$ is nearly minimal. In this work, we study the hypothesis selection problem under memory and time constraints. In particular, we discuss how to achieve a nearly optimal tradeoff between memory usage and sample complexity. Additionally, we explore methods to obtain optimal accuracy via algorithms with nearly optimal time complexity.
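
For concreteness, one standard way to state the selection guarantee (the approximation factor $\alpha$ and accuracy parameter $\varepsilon$ below are illustrative symbols, not fixed in this abstract) is that, with high probability over the samples drawn from $P$, the output $\hat{H} \in \mathcal{H}$ satisfies
\[
d_{\mathrm{TV}}\bigl(\hat{H}, P\bigr) \;\le\; \alpha \cdot \min_{i \in [n]} d_{\mathrm{TV}}(H_i, P) \;+\; \varepsilon .
\]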