Machine learning models must balance accuracy and fairness, but these goals often conflict, particularly when data come from multiple demographic groups. A useful tool for understanding this trade-off is the fairness-accuracy (FA) frontier, which characterizes the set of models that cannot be simultaneously improved in both fairness and accuracy. Prior analyses of the FA frontier provide a full characterization under the assumption of complete knowledge of population distributions, an unrealistic ideal. We study the FA frontier in the finite-sample regime, showing how it deviates from its population counterpart and quantifying the worst-case gap between them. In particular, we derive minimax-optimal estimators that depend on the designer's knowledge of the covariate distribution. For each estimator, we characterize how finite-sample effects asymmetrically impact each group's risk, and identify optimal sample allocation strategies. Our results transform the FA frontier from a theoretical construct into a practical tool for policymakers and practitioners who must often design algorithms with limited data.
Multi-agent learning faces a fundamental tension: leveraging distributed collaboration without sacrificing the personalization needed for diverse agents. This tension intensifies when aiming for full personalization while adapting to unknown heterogeneity levels—gaining collaborative speedup when agents are similar, without performance degradation when they are different. Embracing the challenge, we propose personalized collaborative learning (PCL), a novel framework for heterogeneous agents to collaboratively learn personalized solutions with seamless adaptivity. Through carefully designed bias correction and importance correction mechanisms, our method AffPCL robustly handles both environment and objective heterogeneity. We prove that AffPCL reduces sample complexity over independent learning by a factor of $\max\{n^{-1}, \delta\}$, where $n$ is the number of agents and $\delta\in[0,1]$ measures their heterogeneity. This *affinity-based* acceleration automatically interpolates between the linear speedup of federated learning in homogeneous settings and the baseline of independent learning, without requiring prior knowledge of the system. Our analysis further reveals that an agent may obtain linear speedup even by collaborating with arbitrarily dissimilar agents, unveiling new insights into personalization and collaboration in the high heterogeneity regime.
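As a toy illustration of the stated result (not the authors' code), the factor $\max\{n^{-1}, \delta\}$ can be computed directly to see how it interpolates between federated-style linear speedup and independent learning:

```python
def affpcl_speedup_factor(n, delta):
    """Sample-complexity reduction factor max(1/n, delta) from the abstract.

    n     -- number of agents (n >= 1)
    delta -- heterogeneity level in [0, 1]
    """
    return max(1.0 / n, delta)

# Homogeneous agents (delta ~ 0): factor is 1/n, i.e. linear speedup.
print(affpcl_speedup_factor(10, 0.0))   # 0.1
# Fully heterogeneous agents (delta = 1): factor is 1, matching
# the independent-learning baseline (collaboration never hurts).
print(affpcl_speedup_factor(10, 1.0))   # 1.0
```

The function name and interface here are hypothetical; only the formula itself comes from the abstract.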
Classical supervised learning starts from a collection of input-output data pairs corresponding to a well-defined task. Predictive accuracy then scales as the number of datapoints, the size of the function class, and the number of optimization steps all increase in appropriate relative proportions. A new learning and prediction paradigm has emerged and gained momentum over the last decade, in which one can learn a highly predictive model without task-specific data or even a formal definition of the target task. This paradigm, called zero-shot prediction, has been successful for both known and unsuspected reasons. Starting from its origins in matching words and pictures, through recent self-supervised and contrastive learning from different modalities or sources, I will describe key components driving its empirical performance in AI domains such as computer vision and language modeling.
This is based on joint work with Ronak Mehta, Lang Liu, and Soumik Pal, on data balancing <https://openreview.net/forum?id=8CBPnENQyV> and on zero-shot prediction <https://openreview.net/forum?id=kJQgMGLrow>.
Multitask Learning refers to the problem of aggregating many datasets from separate source distributions to improve performance on a target prediction task. Our aim is to understand (1) sufficient and necessary conditions for speedup in convergence rates over vanilla prediction with just the target data, (2) how such speedup depends on the number of datasets and samples per dataset, and (3) whether such speedup is achievable adaptively, i.e., by procedures with no prior distributional information.
The picture turns out to be mixed, as the problem displays sharp gaps between oracle rates and adaptive rates: there exist situations where no procedure can do better than using the target data alone, even though a large subset of the datasets is informative about the target task. On the other hand, a bit of information on the relation between the source and target distributions can allow for near-optimal adaptive rates. These results lead to many interesting new questions, which I'll attempt to properly convey.
The talk is based on various works with collaborators such as S. Hanneke, A. Gretton, M. Z. Li, D. Meunier.
Siqi Liu is an assistant professor of computer science at Duke University. Her research focuses on theoretical computer science, particularly the construction of high-dimensional expanders, their connections to manifolds, and applications in coding theory...
Connor Wagaman is a PhD student in computer science at Boston University, where he is advised by professors Adam Smith and Marco Gaboardi. Connor’s research deals primarily with data privacy (e.g., differential privacy), with a focus on privacy for graphs...
André Schrottenloher is a full-time researcher at the Inria Center at the University of Rennes, France. He completed his PhD thesis in 2021 at the Inria Center of Paris and was a post-doctoral researcher at CWI in Amsterdam between 2021 and 2022. His main...