Thursday ML Seminar

Parent Program

Foundations of Machine Learning

Location

Calvin Lab Room 116

Speaker(s)

Shai Ben-David (University of Waterloo)

Date

Thursday, Feb. 9, 2017

Time

10 a.m. – 12 p.m. PT

Back to calendar

Description

Two Paradigms of Semi-Supervised Active Clustering with Sample and Computational Complexity Bounds

I will describe two paradigms for clustering that incorporates domain expertise through an interaction between the clustering algorithms and the user of the clustering.

The first paradigm asks the user to cluster a small random sample of the data and uses that clustering to construct a metric that reflects the user's preferences (and is subsequently used to cluster the data). For this paradigm we show sample complexity bounds that are independent of the input data size.

The second paradigm uses interactive must-link/can't-link queries to speed up centre-based clustering tasks. We show that for data sets satisfying some clusterabiity condition, a logarithmic number of such queries can turn an otherwise NP-hard clustering task into one solvable in pseudo-linear time.

I would also like to devote some time to an open discussion of the role of domain knowledge and interaction in clustering.

The first work is with Hassan Zokaei Ashtiani and was presented in UAI 2015. The second result is with Hassan and Shrinu Kushagra and was presented in NIPS 2016.

All scheduled dates:

Upcoming

No Upcoming activities yet

Thursday ML Seminar

Two Paradigms of Semi-Supervised Active Clustering with Sample and Computational Complexity Bounds

All scheduled dates:

Upcoming

Past