Tuesday ML Seminar

Parent Program

Foundations of Machine Learning

Location

Calvin Lab Room 116

Speaker(s)

Ravi Kannan

Date

Tuesday, Feb. 28, 2017

Time

10 a.m. – 12 p.m. PT

Description

Topic Modeling: From Proof to Practice

Topic models posit a stochastic generative process for document corpora, and algorithms are devised to learn the model from real data. Currently, there are two methods of validation: improved efficiency on benchmark corpora of up to billions of words, and mathematically proven error and time bounds tested on smaller cases. I will present our recent effort where the two meet. The main new algorithmic ingredient is an importance sampling procedure inspired by Randomized Linear Algebra. Whereas known topic models posit a near low-rank data matrix, we start with a new high-rank model which allows for realistic noise. Empirically, the algorithm performs better at scale than the state of the art.
