Monday, May 22nd, 2017

9:00 am – 9:45 am

No abstract available.

11:30 am – 12:15 pm

In this talk, we will walk through a case study of how techniques developed to design differentially private algorithms can be brought to bear to design asymptotically dominant-strategy truthful mechanisms in large markets, without the need to make any assumptions about the structure of individual preferences. Specifically, we will consider the many-to-one matching problem and see a mechanism for computing school-optimal stable matchings that makes truthful reporting an approximately dominant strategy for the student side of the market. The approximation becomes exact at a polynomial rate as the number of students grows large, and the analysis holds even for worst-case preferences for both students and schools. This case study is an instance of a general technique that has found many applications, and it is an invitation to think of more.

Based on joint work with Sampath Kannan, Jamie Morgenstern, and Steven Wu.
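To make the flavor of the guarantee concrete, here is an illustrative statement of approximate dominant-strategy truthfulness (the notation below is assumed for illustration, not taken from the talk):

```latex
% Illustrative only: notation assumed, not from the talk.
% A mechanism M is \eta-approximately dominant-strategy truthful for the
% students if no student can gain more than \eta by misreporting:
\[
  \forall i,\ \forall P_i, P_i', P_{-i}:\qquad
  u_i\bigl(M(P_i, P_{-i})\bigr) \;\ge\; u_i\bigl(M(P_i', P_{-i})\bigr) - \eta,
\]
% where P_i is student i's true preference, P_i' an arbitrary misreport,
% P_{-i} the reports of all other participants, and u_i student i's
% utility. The result described above gives \eta \to 0 at a polynomial
% rate in the number of students, for worst-case preferences.
```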

2:00 pm – 3:35 pm

John Abowd: Practical Privacy from the Trenches

The U.S. Census Bureau is committed to modernizing all of its disclosure limitation systems using formally private methods. For the 2020 Census of Population and Housing, the Census Bureau is testing a full publication system that is differentially private end-to-end, starting from the final form of the edited confidential census enumerations (expected to be about 325 million individual records from 145 million households and 8 million group quarters). If successful, the algorithms, implementation details, and all parameter settings will be made public, beginning with the implementation for the 2018 End-to-End test. Unless another national statistical office has such a system ready to deploy before 2020, this will be the first formally private data publication system implemented by an official statistical agency for the complete publication system of a major product. For a variety of reasons, I am very familiar with this work. I will discuss the issues that can be raised in a public forum.

Gerome Miklau: Principled Evaluation of Differentially Private Algorithms

The increasing complexity of differentially private algorithms poses a challenge for researchers evaluating new technical approaches and for practitioners adapting privacy algorithms to real-world tasks.  In particular, deployment of these algorithms has been slowed by an incomplete understanding of the accuracy penalty they entail.

In this talk I will describe a set of evaluation principles designed to support the sound evaluation of privacy algorithms and I will review the conclusions of a thorough empirical study done in accordance with these principles.  This empirical study is the basis of dpcomp.org, a public web-based system that allows users to interactively explore algorithm output in order to understand, both quantitatively and qualitatively, the error introduced by the algorithms and its dependence on key input parameters.

Our empirical evaluation raises a number of research problems in algorithm design and safe algorithm selection, and I will briefly mention our ongoing efforts to address them.

This talk is based on joint work with Michael Hay, Ashwin Machanavajjhala, Yan Chen, Ios Kotsogiannis, Ryan McKenna, and Dan Zhang.
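To give a flavor of what such an evaluation looks like in practice, here is a minimal sketch, far simpler than the dpcomp.org methodology and not its actual code: it measures the empirical error of a toy differentially private histogram algorithm across datasets, privacy levels, and repeated trials.

```python
# Minimal sketch of principled empirical evaluation (illustrative only):
# average an algorithm's error over many trials, on several datasets and
# several privacy levels, instead of relying on worst-case bounds.
import numpy as np

def laplace_histogram(counts, epsilon):
    """Toy DP algorithm: per-bin Laplace noise (histogram sensitivity is 1)."""
    return counts + np.random.laplace(scale=1.0 / epsilon, size=counts.shape)

def mean_l1_error(algo, counts, epsilon, trials=200):
    """Average per-bin L1 error over repeated randomized runs."""
    return np.mean([np.abs(algo(counts, epsilon) - counts).mean()
                    for _ in range(trials)])

rng = np.random.default_rng(0)
datasets = {
    "uniform": rng.integers(90, 110, size=64).astype(float),
    "skewed":  (rng.pareto(1.5, size=64) * 10).astype(float),
}
for name, counts in datasets.items():
    for eps in (0.01, 0.1, 1.0):
        err = mean_l1_error(laplace_histogram, counts, eps)
        print(f"{name:8s} eps={eps:<5} mean per-bin L1 error = {err:8.2f}")
```

For this data-independent baseline the error depends only on epsilon; the point of a systematic study is that more sophisticated, data-adaptive algorithms do not behave this way, which is exactly what makes the dependence on key input parameters worth exploring.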


Aleksandra Korolova: Challenges of Applying Differential Privacy: From the Industry Trenches

4:00 pm – 4:50 pm

Frank McSherry: Challenges Transferring Privacy Technology

I'll work through several steps involved in the transfer of privacy ideas and technology, and some issues I've seen along the way. Most of these issues are not of the form "the right theorem does not yet exist", and addressing them may require developing a complementary skill set. I'll talk about my experiences collaborating with researchers in adjacent fields, reviewing and critiquing "non-technical" privacy research, and designing and building privacy tools. Most of these efforts have not been a resounding success, but they should be instructive nonetheless.

Tuesday, May 23rd, 2017

9:00 am – 9:45 am

No abstract available.

10:00 am – 10:50 am

Brendan McMahan: Decentralized Machine Learning and Privacy

Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center and training there using conventional approaches. We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating ephemeral locally-computed updates. We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging. Federated Learning can be complemented by secure aggregation protocols and differential privacy, connections we will also discuss.
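The iterative model averaging mentioned above can be sketched in a few lines. This is a hedged toy version, using synthetic data, a linear model, and plain gradient descent; the systems described in the talk train deep networks on real devices and can add secure aggregation on top.

```python
# Toy federated averaging (FedAvg-style) sketch; all data and parameters
# here are synthetic assumptions for illustration.
import numpy as np

def client_update(w, X, y, lr=0.1, epochs=5):
    """A few epochs of local gradient descent on one client's private data."""
    w = w.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)   # squared-loss gradient step
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Each 'device' holds a private shard; raw data never leaves the client.
clients = []
for _ in range(10):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w + 0.1 * rng.normal(size=50)))

w = np.zeros(2)                                 # shared global model
for _ in range(20):                             # communication rounds
    local = [client_update(w, X, y) for X, y in clients]
    w = np.mean(local, axis=0)                  # server averages the models

print("learned:", np.round(w, 3), "true:", true_w)
```

Only the locally computed models cross the network, and in a deployed system those updates would themselves be protected, e.g. via secure aggregation or differential privacy.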

11:20 am – 12:30 pm

Salil Vadhan: Differential Privacy & Statistical Inference – A Theoretical CS Perspective

I will give a theoretical computer scientist’s perspective on the research challenges in developing differentially private algorithms for statistical inference, in particular highlighting ways in which these directions may differ from a typical theoretical computer science mindset (or at least the mindset that I had a few years ago, before learning anything about statistical inference). Many of these challenges are very interesting theoretically as well as being relevant for bringing differential privacy to practice. As illustrative examples, I will discuss the problems of differentially private hypothesis testing and of producing differentially private confidence intervals.

Based in part on joint work with Vishesh Karwa, and on joint work with Gaboardi, Lim, and Rogers, as well as works by many others.
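To make the confidence-interval problem concrete, here is a minimal sketch of one naive approach, an illustration rather than the construction from the joint work cited above: a Laplace-mechanism mean for data assumed to lie in [0, 1], with an interval widened to cover both sampling error and the privacy noise.

```python
# Naive DP mean + confidence interval (illustrative; assumes data in [0, 1]).
import numpy as np

def dp_mean_ci(x, epsilon, alpha=0.05):
    """epsilon-DP mean with a conservative (1 - alpha) confidence interval."""
    n = len(x)
    sensitivity = 1.0 / n              # one record moves the mean by <= 1/n
    est = x.mean() + np.random.laplace(scale=sensitivity / epsilon)
    # Budget alpha/2 to sampling error (Hoeffding) and alpha/2 to the added
    # noise (Laplace tail); a union bound gives overall coverage 1 - alpha.
    samp = np.sqrt(np.log(4 / alpha) / (2 * n))
    lap = (sensitivity / epsilon) * np.log(2 / alpha)
    return est, est - samp - lap, est + samp + lap

x = np.random.default_rng(1).beta(2, 5, size=10_000)   # toy data in [0, 1]
print(dp_mean_ci(x, epsilon=0.5))
```

Even this simple case shows the statistical subtlety: the interval must be widened to account for noise the analyst deliberately added, and doing so naively can be far from optimal.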

Frauke Kreuter: Gaining Record Linkage Consent: A Summary of Experimental Findings

No abstract available.

3:00 pm – 3:30 pm

No abstract available.

4:30 pm – 5:00 pm

In this talk I will introduce the aim of program verification and I will discuss the challenges and benefits of its use in support of differential privacy. I will discuss some of the steps taken so far in this direction, and some of the main challenges for future applications.

5:00 pm – 5:30 pm

Datasets are often reused to perform multiple statistical analyses in an adaptive way, in which each analysis may depend on the outcomes of previous analyses on the same dataset. Standard statistical guarantees do not account for these dependencies, and little is known about how to provably avoid overfitting. In this talk I'll describe recent work that provides a new framework to address this problem. I'll then describe several approaches to the problem based on techniques developed in the context of differential privacy.
Based on joint works with Dwork, Hardt, Pitassi, Reingold, Roth and Steinke.
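One way to caricature the differential-privacy-based approaches in a few lines: answer every statistical query with calibrated noise, so that an analyst who chooses queries adaptively still cannot drive the answers far from their population values. The parameters below are assumptions for illustration, not those of the papers.

```python
# Illustrative noisy-query oracle for adaptive data analysis.
import numpy as np

class NoisyQueryOracle:
    """Answers statistical queries (record -> [0, 1]) with Laplace noise."""
    def __init__(self, data, epsilon_per_query):
        self.data = data
        self.eps = epsilon_per_query

    def answer(self, q):
        vals = np.array([q(x) for x in self.data])
        sensitivity = 1.0 / len(vals)   # empirical mean of [0, 1]-valued q
        return vals.mean() + np.random.laplace(scale=sensitivity / self.eps)

rng = np.random.default_rng(2)
data = rng.normal(size=1000)
oracle = NoisyQueryOracle(data, epsilon_per_query=0.1)

# The second query is chosen after seeing the first answer; exactly the
# adaptive reuse that breaks standard statistical guarantees.
a1 = oracle.answer(lambda x: float(x > 0))
threshold = 0.5 if a1 > 0.4 else -0.5
a2 = oracle.answer(lambda x: float(x > threshold))
print(round(a1, 3), round(a2, 3))
```

The transfer theorems in this line of work make the intuition precise: answers computed with differential privacy generalize to the underlying population, even under adaptivity.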

Wednesday, May 24th, 2017

9:00 am – 9:45 am

Chris Hoofnagle: Research Challenges in Privacy and Security Policy

Researchers performing privacy and security forensic analyses with an eye toward policy face several challenges. First, R1 universities used to have the best data and the best tools for research. Increasingly, tools and data reside in the private sector, requiring relationships that may burden research with limits on academic freedom and presenting problems of deep capture. Thus, maintaining academic independence is a growing challenge. Second, companies can design products to leverage copyright and terms-of-use legal protections and possibly prohibit security and privacy forensics. As IoT devices rely on the cloud, they will become increasingly inscrutable. Third, some wish to reorient privacy rules to focus on how data are used rather than whether data were collected. Data use is more difficult to forensically verify than data collection, and there is a need to create tools that document uses and limit data access to specified uses. Finally, several legal challengers allege that no “harm” flows from security breaches or that the government bears a burden to prove harm in an exacting way. Thus, conceptualizing and documenting injury from privacy invasions and insecurity is key for the future of cybersecurity enforcement.

10:45 am – 11:25 am

Anand Sarwate: Challenges in Privacy-Preserving Learning for Collaborative Research Consortia

Protecting privacy is particularly important in applications involving human health data due to ethical, legal, and institutional regulations. However, in order to learn from larger populations, research institutions need to collaborate by performing joint analyses on locally-held data. Many statistical analyses can be performed such that data holders need only share data derivatives, and differential privacy can give quantifiable privacy protections at the expense of some loss in utility/accuracy. Privacy protections can incentivize more institutions to share access to their data. At the same time, typical sample sizes in some applications may be too small to support strong privacy protections, and certain tasks may be more amenable to differential privacy than others. This talk will discuss some of these issues and the corresponding theoretical challenges in the context of designing a collaborative research system for neuroimaging data.
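As a sketch of the "share only data derivatives" pattern described above (hypothetical, not the consortium's actual protocol), each site can release a noisy local statistic and the coordinator can pool the releases:

```python
# Hypothetical multi-site sketch: sites share differentially private local
# means instead of raw records; bounds and epsilon are assumed for illustration.
import numpy as np

def local_release(values, epsilon, lo=0.0, hi=1.0):
    """One site's DP release: clip to [lo, hi], then Laplace-noise the mean."""
    v = np.clip(values, lo, hi)
    sensitivity = (hi - lo) / len(v)   # one record shifts the mean by <= this
    return v.mean() + np.random.laplace(scale=sensitivity / epsilon)

rng = np.random.default_rng(7)
# Three sites with very different sample sizes; small n means a noisier release.
sites = [rng.beta(2, 3, size=n) for n in (40, 250, 1200)]

releases = [local_release(s, epsilon=0.5) for s in sites]
pooled = np.average(releases, weights=[len(s) for s in sites])
print([round(r, 3) for r in releases], "pooled:", round(pooled, 3))
```

The smallest site's release is dominated by noise at this epsilon, which is precisely the tension the abstract points to: the sample sizes typical of a single site may be too small to support strong privacy protections on their own.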

11:25 am – 12:20 pm

Aaron Roth: (Un)fairness in Machine Learning

In this talk, we will quickly survey some of the sources of "unfairness" in machine learning, and discuss the perspective that theory can bring to what is a messy empirical problem. We will then talk about the consequences of enforcing a family of fairness definitions that we have been calling "weakly meritocratic fairness" on the process of learning itself, in a number of settings.

Based on joint works with Richard Berk, Hoda Heidari, Shahin Jabbari, Matthew Joseph, Sampath Kannan, Michael Kearns, Jamie Morgenstern, Seth Neel, Mallesh Pai, Rakesh Vohra, and Steven Wu.

2:00 pm – 3:20 pm

Ed Felten: Reconciling Public Policy with New Theories of Privacy

Most laws and public policies on privacy are based on an outdated and unsound theory of privacy, relying on notions of personally identifiable information (PII). Although the problems of the PII model have been increasingly recognized by policymakers, there is currently no theory that can plausibly replace PII in the policymaking process. This talk will discuss what is missing, and what researchers can do to help close this gap and build a foundation for more sound and effective public policy.

Simson Garfinkel: More Privacy to Formalize

Differential privacy provides a formal definition of data privacy within a database, but experience has shown that it's hard to apply differential privacy beyond structured sets of tabular data and some limited graph databases. However, there are many kinds of information that require sharing and computation. Simple datatypes include time, geographical, and imagery information. How do you privatize a picture of a crowd? Today practitioners are at a loss for privatizing even many kinds of structured information, such as 3D models or genetic information. In the cybersecurity world, there is a need to privatize netflow data, cyber threat intelligence, and provenance. And then there's text. Even in the world of tabular databases, we still lack tools for applying differential privacy to high-dimensional data. Differential privacy doesn't seem to have a concept of group privacy. Finally, while differential privacy does give us tools for private data publishing, it is silent on the privacy of data users.

Simson Garfinkel will present a slide for each of these examples, discussing how it would be really neat to privatize each of these kinds of data, but offering no recommendations on how to address these open problems.

Frauke Kreuter: Data Collection, Privacy, Consent and Bias

No abstract available.

Nina Taft: Privacy Advisor and Incentivizing Privacy-responsible Behavior

No abstract available.

3:50 pm – 4:50 pm

No abstract available.