IT Seminar

Parent Program

Information Theory

Location

2nd floor interaction area

Speaker(s)

Mesrob Ohannessian (UC San Diego)

Date

Wednesday, Mar. 25, 2015

Time

2:30 – 4 p.m. PT

Description

Good-Turing: The Good, The Bad, and The Ugly

The "missing mass" is the total probability of all symbols that do not appear in an i.i.d. sample from a discrete distribution. It captures a fundamental notion of a rare event. Those who attended Alon Orlitsky's talk earlier this month witnessed the glory of the Good-Turing estimator of the missing mass. In this talk, I will first dismantle this impeccable image. In particular, I will show that Good-Turing can fail to learn the missing mass in relative error, even for the simplest light-tailed distributions. I will then reconstruct a new reputation for this old estimator, as a highly effective specialized estimator of rare probabilities for heavy-tailed distributions. This explains its success in areas where such distributions arise, such as natural language modeling. This change in perspective opens the door to streamlined estimation techniques that are inspired by extreme value theory and that extend far beyond missing mass estimation.
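
For readers unfamiliar with the terms, the following is a minimal Python sketch of the missing mass and of the Good-Turing estimate in its standard textbook form: the number of symbols seen exactly once, divided by the sample size. It is an illustration only and is not taken from the talk; the function names, the toy distribution `probs`, and the sample are all assumptions made for this example.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Standard Good-Turing estimate of the missing mass:
    (number of symbols seen exactly once) / (sample size)."""
    if not sample:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


def true_missing_mass(sample, probs):
    """Total probability of the symbols in `probs` that never appear in the sample."""
    seen = set(sample)
    return sum(p for symbol, p in probs.items() if symbol not in seen)


# Hypothetical toy distribution and sample, chosen only for illustration.
probs = {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.1}
sample = ["a", "b", "a", "c", "a", "b"]

estimate = good_turing_missing_mass(sample)  # one singleton ("c") in 6 draws
actual = true_missing_mass(sample, probs)    # probability of the unseen symbol "d"
print(estimate, actual)
```

In this toy sample only "c" appears exactly once, so the estimate is 1/6, while the true missing mass is the probability of the unseen symbol "d". The estimate needs only the sample, whereas the true value requires knowing the distribution.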
