### Monday, January 23rd, 2017

In this tutorial we'll survey the optimization viewpoint on learning. We will cover optimization-based learning frameworks, such as online learning and online convex optimization. These frameworks will lead us to some of the most commonly used algorithms for training machine learning models.
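As a rough illustration of the online convex optimization setting, the sketch below runs projected online gradient descent, one standard algorithm in that framework. The abstract does not name a specific algorithm, so the choice of method, the squared losses, the interval `[-1, 1]`, and the step size `diameter / (grad_bound * sqrt(t))` are all assumptions made for this example.

```python
import math
import random


def project(x, lo=-1.0, hi=1.0):
    """Euclidean projection onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def online_gradient_descent(targets, diameter=2.0, grad_bound=4.0):
    """Play x_t, then observe the loss f_t(x) = (x - z_t)^2 and update.

    Assumes every z_t lies in [-1, 1], so the feasible set has diameter 2
    and the gradients 2 * (x - z_t) are bounded by 4 in absolute value.
    Returns the learner's cumulative loss.
    """
    x = 0.0
    total_loss = 0.0
    for t, z in enumerate(targets, start=1):
        total_loss += (x - z) ** 2           # loss is revealed after playing x
        grad = 2.0 * (x - z)
        eta = diameter / (grad_bound * math.sqrt(t))  # decaying step size
        x = project(x - eta * grad)
    return total_loss


def best_fixed_loss(targets):
    """Cumulative loss of the best single point chosen in hindsight (the mean)."""
    best = sum(targets) / len(targets)
    return sum((best - z) ** 2 for z in targets)


rng = random.Random(0)  # hypothetical data: targets drawn uniformly from [-1, 1]
targets = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
regret = online_gradient_descent(targets) - best_fixed_loss(targets)
print(f"regret after {len(targets)} rounds: {regret:.3f}")
```

The quantity of interest is the regret, the gap between the learner's cumulative loss and that of the best fixed decision in hindsight. With this step size the standard analysis bounds it by `1.5 * grad_bound * diameter * sqrt(T)`, so the average regret per round vanishes as the number of rounds grows.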

Many problems in machine learning that involve discrete structures or subset selection may be phrased in the language of submodular set functions. The property of submodularity, also referred to as a 'discrete analog of convexity', expresses the notion of diminishing marginal returns, and captures combinatorial versions of rank and dependence. Submodular functions occur in a variety of areas including graph theory, information theory, combinatorial optimization, stochastic processes and game theory. In machine learning, they emerge as the potential functions of graphical models, as utility functions in active learning and sensing, in models of diversity, and in structured sparse estimation, matrix approximation and network inference. The lectures will give an introduction to the theory of submodular functions, example applications in machine learning, algorithms for submodular optimization, and current research directions.
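To make the diminishing-returns property concrete, here is a minimal sketch that checks it by brute force on a tiny ground set: a set function `f` is submodular when adding an element to a smaller set helps at least as much as adding it to any larger set containing it. The sensor coverage data and all function names are hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Hypothetical example: each sensor covers a set of locations.
SENSOR_COVERAGE = {
    0: {"a", "b"},
    1: {"b", "c"},
    2: {"c", "d", "e"},
    3: {"a", "e"},
}


def coverage(chosen):
    """Number of locations covered by the chosen sensors."""
    covered = set()
    for sensor in chosen:
        covered |= SENSOR_COVERAGE[sensor]
    return len(covered)


def marginal_gain(f, subset, element):
    """Increase in f from adding one element to subset."""
    return f(subset | {element}) - f(subset)


def is_submodular(f, ground):
    """Check gain(A, e) >= gain(B, e) for every A subset of B and e not in B.

    Exhaustive, so only usable for very small ground sets.
    """
    subsets = [
        frozenset(c)
        for r in range(len(ground) + 1)
        for c in combinations(sorted(ground), r)
    ]
    for larger in subsets:
        for smaller in subsets:
            if not smaller <= larger:
                continue
            for element in ground - larger:
                if marginal_gain(f, smaller, element) < marginal_gain(f, larger, element):
                    return False
    return True


ground_set = frozenset(SENSOR_COVERAGE)
print("coverage is submodular:", is_submodular(coverage, ground_set))
print("|S|^2 is submodular:", is_submodular(lambda s: len(s) ** 2, ground_set))
```

Coverage functions are a classic submodular example: a new sensor adds less once other sensors already cover part of its area. The squared-cardinality function is included as a contrast, since its marginal gains grow with the size of the set.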

### Tuesday, January 24th, 2017

No abstract available.

The problem of building an autonomous robot has traditionally been viewed as one of integration: connecting together modular components, each one designed to handle some portion of the perception and decision making process. For example, a vision system might be connected to a planner that might in turn provide commands to a low-level controller that drives the robot's motors. In this talk, I will discuss how ideas from deep learning can allow us to build robotic control mechanisms that combine both perception and control into a single system. This system can then be trained end-to-end on the task at hand, in effect allowing the entire robotic perception and control system to be learned. I will show how this end-to-end approach actually simplifies the perception and control problems, by allowing the perception and control mechanisms to adapt to one another and to the task. I will also present some recent work on scaling up deep robotic learning, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination using deep convolutional neural networks.

### Wednesday, January 25th, 2017

Nonparametric Bayesian methods make use of infinite-dimensional mathematical structures to allow the practitioner to learn more from their data as the size of their data set grows. The underlying mathematics is the theory of stochastic processes, with fascinating connections to combinatorics, graph theory, functional analysis and convex analysis. In this tutorial, we'll introduce such foundational nonparametric Bayesian models as the Dirichlet process and Chinese restaurant process and we will discuss the wide range of models captured by the formalism of completely random measures. We'll present some of the algorithms used for posterior inference in nonparametric Bayes, and we will discuss some open theoretical problems.
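As a small illustration of one of the models named above, the following sketch samples a partition from the Chinese restaurant process. The concentration parameter name `alpha`, the function name, and the chosen sizes are assumptions of this example rather than anything specified in the abstract.

```python
import random


def chinese_restaurant_process(n_customers, alpha, rng):
    """Sample table assignments for n_customers under a CRP with parameter alpha.

    When n customers are already seated, the next one joins an existing table
    with probability proportional to its size, or opens a new table with
    probability proportional to alpha (the normalizer is n + alpha).
    """
    table_sizes = []
    assignments = []
    for _ in range(n_customers):
        weights = table_sizes + [alpha]  # last slot stands for "open a new table"
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes


rng = random.Random(0)
for n in (10, 100, 1000):
    _, sizes = chinese_restaurant_process(n, alpha=2.0, rng=rng)
    print(f"{n} customers -> {len(sizes)} tables")
```

The number of occupied tables is not fixed in advance. Its expectation is the sum of `alpha / (alpha + i)` for `i` from 0 to `n - 1`, which grows roughly logarithmically in `n`, so the model can introduce new clusters as more data arrive.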

### Thursday, January 26th, 2017

No abstract available.

### Friday, January 27th, 2017

This tutorial surveys algorithms for learning latent variable models based on the method of moments, focusing on algorithms based on low-rank decompositions of higher-order tensors. The target audiences of the tutorial include (i) users of latent variable models in applications, and (ii) researchers developing techniques for learning latent variable models. The only prior knowledge expected of the audience is a familiarity with simple latent variable models (e.g., mixtures of Gaussians), and rudimentary linear algebra and probability. The audience will learn about new algorithms for learning latent variable models, techniques for developing new learning algorithms based on spectral decompositions, and analytical techniques for understanding the aforementioned models and algorithms. Advanced topics such as learning overcomplete representations may also be discussed.
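For a flavor of what a low-rank decomposition of a higher-order tensor can look like, the sketch below applies power iteration to a small symmetric third-order tensor built from orthonormal components. This is one commonly used approach and is not necessarily the one presented in the tutorial. The weights, vectors, starting point, and function names are all hypothetical, and the step of estimating such a tensor from data moments is omitted.

```python
import math


def build_tensor(weights, vectors):
    """Form T = sum_i w_i * (v_i outer v_i outer v_i) as nested lists."""
    d = len(vectors[0])
    tensor = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, v in zip(weights, vectors):
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    tensor[i][j][k] += w * v[i] * v[j] * v[k]
    return tensor


def contract(tensor, u):
    """Compute the vector T(I, u, u), contracting the last two modes with u."""
    d = len(u)
    return [
        sum(tensor[i][j][k] * u[j] * u[k] for j in range(d) for k in range(d))
        for i in range(d)
    ]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def tensor_power_iteration(tensor, start, n_iters=30):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)||, then return (T(u, u, u), u)."""
    u = normalize(start)
    for _ in range(n_iters):
        u = normalize(contract(tensor, u))
    eigenvalue = sum(a * b for a, b in zip(u, contract(tensor, u)))
    return eigenvalue, u


s = 1.0 / math.sqrt(2.0)
components = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]  # orthonormal by construction
weights = [3.0, 2.0, 1.0]
T = build_tensor(weights, components)
eigenvalue, direction = tensor_power_iteration(T, start=[1.0, 0.2, 0.1])
print(eigenvalue, direction)
```

When the components are orthonormal and the weights positive, the iteration converges to one of the components, with the corresponding weight recovered as `T(u, u, u)`. Which component is found depends on the starting point; here the start is chosen so that the component with weight 3 dominates.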

Building systems that can understand human language (answering questions, following instructions, carrying on dialogues) has been a challenge since the early days of AI. Due to recent advances in machine learning, there is renewed interest in taking on this formidable task. A major question is how one represents and learns the semantics (meaning) of natural language, to which there are only partial answers. The goal of this tutorial is (i) to describe the linguistic and statistical challenges that any system must address; and (ii) to describe the main cutting-edge approaches and the remaining open problems. Topics include distributional semantics (e.g., word vectors), frame semantics (e.g., semantic role labeling), model-theoretic semantics (e.g., semantic parsing), the role of context, grounding, neural networks, latent variables, and inference. The hope is that this unified presentation will clarify the landscape, and show that this is an exciting time for the machine learning community to engage with the problems of natural language understanding.