Utility navigation

  • Calendar
  • Contact
  • Login
  • MAKE A GIFT
Berkeley University of California
Home Home

Main navigation

  • Programs & Events
    • Research Programs
    • Workshops & Symposia
    • Public Lectures
    • Research Pods
    • Internal Program Activities
    • Algorithms, Society, and the Law
  • Participate
    • Apply to Participate
    • Propose a Program
    • Postdoctoral Research Fellowships
    • Law and Society Fellowships
    • Science Communicator in Residence Program
    • Circles
    • Breakthroughs Workshops and Goldwasser Exploratory Workshops
  • People
    • Scientific Leadership
    • Staff
    • Current Long-Term Visitors
    • Research Fellows
    • Postdoctoral Researchers
    • Scientific Advisory Board
    • Governance Board
    • Affiliated Faculty
    • Science Communicators in Residence
    • Law and Society Fellows
    • Chancellor's Professors
  • News & Videos
    • News
    • Videos
  • Support for the Institute
    • Annual Fund
    • All Funders
    • Institutional Partnerships
  • For Visitors
    • Visitor Guide
    • Plan Your Visit
    • Location & Directions
    • Accessibility
    • Building Access
    • IT Guide
  • About

Results 2311 - 2320 of 23900

Video | Apr. 7, 2025
Reducing the Dimension of Language: A Spectral Perspective on Transformers
Video | Apr. 7, 2025
LLM skills and meta-cognition: scaffolding for new forms of learning?
Video | Apr. 7, 2025
Predicting and optimizing the behavior of large ML models
Workshop Talk | Apr. 4, 2025

Learning Generative Models from Corrupted Data

The quality of generative models depends on the quality of the data on which they are trained. Access to high-quality data is scarce and expensive, while noisy samples are generally more accessible. State-of-the-art generative models are often trained on curated datasets that emerge from highly filtered data pools from the Web and other sources. In this talk, we will show that there is immense value in the lower quality data that are often discarded. We will present an algorithmic framework to train generative models using a combination of a small set of expensive, high-quality samples and a large set of cheap, noisy points. Our framework is instantiated for diffusion generative models, specifically through our Ambient Diffusion method. We will show how Ambient Diffusion enables training on noisy images and that it achieves state-of-the-art performance in de novo protein design. Time permitting, we will also present preliminary extensions to autoregressive language modeling and discuss broader implications for memorization, dataset design, and model performance.

Workshop Talk | Apr. 4, 2025

The Future of Language Models: A Perspective on Evaluation

The progress in techniques to evaluate LLMs has regrettably fallen behind the progress in LLM development, making it challenging to quantify progress. My research calls for rethinking the fundamental principles underlying the evaluation of Transformer-based language models. I will discuss some work on applying language models to real tasks, as well as the selection of test data for efficient and robust evaluation. I will present challenges that arise when we do not know what the ground truth might be. Finally, I will discuss some ideas on evaluating Transformer language models without involving language.

Workshop Talk | Apr. 4, 2025

SILO Open LM: Training LMs on Siloed Datasets

Abstract not available.

Workshop Talk | Apr. 4, 2025

The Power of Resets! Learning better, one reset at a time

Post-training is essential for enhancing the capabilities of large language models (LLMs) and aligning them with human preferences. One of the most widely used post-training techniques is reinforcement learning from human feedback (RLHF). In this talk, I will first discuss the challenges of applying RL to LLM training. Next, I will introduce RL algorithms that tackle these challenges by utilizing key properties of the underlying problem. Additionally, I will present an approach that simplifies the RL policy optimization process for LLMs to relative reward regression.
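
The abstract does not give a formula for "relative reward regression", so the following is only a minimal sketch of one plausible reading: a squared-error loss that regresses the difference in policy log-probability changes for two responses onto the difference in their rewards. The function name, the argument names, and the scale parameter `eta` are all hypothetical and are not taken from the talk.

```python
# Illustrative sketch only: the loss form and every name here are assumptions,
# not the speaker's stated method.

def relative_reward_loss(logp_new_a, logp_old_a, logp_new_b, logp_old_b,
                         reward_a, reward_b, eta=1.0):
    """Squared error between a scaled log-ratio gap and a reward gap.

    logp_new_* / logp_old_* are log-probabilities of two responses (a and b)
    to the same prompt under the updated and the previous policy.
    """
    # How much the updated policy shifted log-probability toward each response.
    shift_a = logp_new_a - logp_old_a
    shift_b = logp_new_b - logp_old_b
    # Regress the relative shift onto the relative reward.
    predicted_gap = (shift_a - shift_b) / eta
    target_gap = reward_a - reward_b
    return (predicted_gap - target_gap) ** 2


# An unchanged policy cannot explain a reward gap of 1.0.
loss_unchanged = relative_reward_loss(-1.5, -1.5, -1.5, -1.5, 1.0, 0.0)
# A policy whose relative shift matches the reward gap incurs no loss.
loss_matched = relative_reward_loss(-1.0, -1.5, -2.0, -1.5, 1.0, 0.0)
```

Under this assumed form, only the difference between the two rewards matters, so any per-prompt reward offset cancels out.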

Workshop Talk | Apr. 4, 2025

Towards sequence-to-sequence models without activation functions

Activation functions play a pivotal role in deep neural networks, enabling them to tackle complex tasks like image recognition. However, activation functions also introduce significant challenges for deep learning theory, network dynamics analysis, and properties such as interpretability and privacy. In this talk, we revisit the necessity of activation functions, especially in cases where high-order interactions among the input elements are used, such as in the attention mechanism. Specifically, we highlight how high-order interactions are sufficient for retaining the necessary expressivity. Yet, the question remains: Is this expressivity alone sufficient for effective learning? We highlight networks that achieve strong performance both in demanding static tasks, such as ImageNet recognition, and sequence-to-sequence tasks, such as arithmetic tasks and language modeling.
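
To make "high-order interactions" concrete, here is a small sketch of a block that multiplies two linear projections of the same input elementwise and applies no activation function. This is an illustrative assumption about what such a block could look like, not the architecture presented in the talk; `linear` and `multiplicative_block` are hypothetical names.

```python
# Illustrative sketch only: a second-order (multiplicative) block with no
# activation function. Names and shapes are assumptions for this example.

def linear(W, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def multiplicative_block(x, W1, W2):
    """Elementwise product of two linear projections of x.

    Each output is a degree-2 polynomial in the inputs, so the block is
    nonlinear even though no activation function is applied.
    """
    return [a * b for a, b in zip(linear(W1, x), linear(W2, x))]


W1 = [[1, 2], [0, 1]]
W2 = [[1, 0], [1, 1]]
x = [1, 2]
y = multiplicative_block(x, W1, W2)
# Doubling the input quadruples the output, which a purely linear map cannot do.
y_doubled = multiplicative_block([2 * v for v in x], W1, W2)
```

Attention has a similar flavor, since its scores come from products of two projections of the input, which is the kind of interaction the abstract refers to.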

Workshop Talk | Apr. 3, 2025

Predicting and optimizing the behavior of large ML models

In this talk, we study the problem of predicting (and optimizing) the counterfactual behavior of large-scale ML models. We start by focusing on “data counterfactuals,” where the goal is to estimate the effect of modifying a training dataset on the resulting machine learning outputs (and conversely, to design datasets that induce specific desired behavior). We introduce a method that almost perfectly estimates such counterfactuals, unlocking some new possibilities in the design and evaluation of ML models, including state-of-the-art data attribution, selection, and poisoning.
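
As a point of reference for what a "data counterfactual" is, the sketch below computes one by brute force: refit a model with one training point removed and record how a test prediction changes. This retrain-and-compare baseline is not the estimation method from the talk, which the abstract does not describe; the toy model (a one-parameter least-squares fit through the origin) and all function names are assumptions chosen to keep the example self-contained.

```python
# Illustrative sketch only: exact leave-one-out data counterfactuals by
# retraining a toy model. Not the estimator described in the talk.

def fit_slope(points):
    """Least-squares slope for y = slope * x (no intercept)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / sxx


def leave_one_out_effects(points, x_test):
    """Change in the prediction at x_test when each training point is dropped."""
    baseline = fit_slope(points) * x_test
    effects = []
    for i in range(len(points)):
        reduced = points[:i] + points[i + 1:]  # training set without point i
        effects.append(fit_slope(reduced) * x_test - baseline)
    return effects


train = [(1.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
effects = leave_one_out_effects(train, x_test=3.0)
# The third point lies above the line y = x, so dropping it lowers the prediction.
```

Retraining once per point is only feasible for tiny models, which is why estimating such counterfactuals without retraining is of interest at scale.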

Workshop Talk | Apr. 3, 2025

Transformers can learn compositional function

Abstract not available.

Home
The Simons Institute for the Theory of Computing is the world's leading venue for collaborative research in theoretical computer science.

Footer

  • Programs & Events
  • Participate
  • Workshops & Symposia
  • Contact Us
  • Calendar
  • Accessibility

Footer social media

  • Twitter
  • Facebook
  • Youtube
© 2013–2026 Simons Institute for the Theory of Computing. All Rights Reserved.