Feature learning is a cornerstone of tackling challenging classification problems in domains such as speech, computer vision, and natural language processing. Whereas features were traditionally hand-crafted, the modern approach is to learn good features automatically, through deep learning or other frameworks. Feature learning can also exploit unlabeled samples, which are usually far more plentiful than labeled ones, to improve classification performance.
In this talk, we provide a concrete theoretical framework for obtaining informative features that can be used to learn a discriminative model for the label given the input. We show that (higher-order) Fisher score functions of the input are informative features, and we provide a differential operator interpretation. We show that, given access to these score features, we can obtain the (expected) derivatives of the label as a function of the input (or of some model parameters). Access to these derivatives is the key to learning complicated discriminative models such as multi-layer neural networks and mixtures of classifiers. Thus, the main ingredient for learning discriminative models is accurate unsupervised estimation of (higher-order) score functions of the input. This is joint work with my students Majid Janzamin and Hanie Sedghi.
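As a minimal illustration of the first-order case, the sketch below numerically checks Stein's identity, E[y S(x)] = E[dG/dx], where S(x) = -d/dx log p(x) is the score function of the input density and G(x) = E[y | x]. The setup is an assumption made only for illustration and is not taken from the talk: the input is a one-dimensional standard Gaussian (so S(x) = x is known exactly rather than estimated), and the label follows a hypothetical logistic model with made-up parameters w and b.

```python
import math
import random


def gaussian_score(x: float) -> float:
    """First-order score S(x) = -d/dx log p(x) for p = N(0, 1), which equals x."""
    return x


def label_mean(x: float, w: float = 1.5, b: float = 0.3) -> float:
    """Hypothetical label model G(x) = E[y | x] = sigmoid(w * x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def label_mean_derivative(x: float, w: float = 1.5, b: float = 0.3) -> float:
    """Analytic derivative dG/dx of the logistic model above."""
    g = label_mean(x, w, b)
    return w * g * (1.0 - g)


def stein_check(n_samples: int = 200_000, seed: int = 0) -> tuple[float, float]:
    """Return Monte Carlo estimates of (E[y * S(x)], E[dG/dx])."""
    rng = random.Random(seed)
    score_moment = 0.0
    derivative_mean = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        # Draw a binary label whose conditional mean is G(x).
        y = 1.0 if rng.random() < label_mean(x) else 0.0
        # Left-hand side uses only the observed label and the score of x.
        score_moment += y * gaussian_score(x)
        # Right-hand side uses the derivative of the (normally unknown) model.
        derivative_mean += label_mean_derivative(x)
    return score_moment / n_samples, derivative_mean / n_samples


if __name__ == "__main__":
    lhs, rhs = stein_check()
    print(f"E[y * S(x)] ~ {lhs:.4f}")
    print(f"E[dG/dx]    ~ {rhs:.4f}")
```

The two estimates should agree up to sampling error. The left-hand side needs only labeled samples and the score of the input, which is why accurate estimation of the score function from unlabeled data matters in settings where, unlike this toy example, the input density is not known.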