The State of Protein Structure Prediction and Friends

Workshop

AI≡Science: Strengthening the Bond Between the Sciences and Artificial Intelligence

Speaker(s)

Mohammed AlQuraishi (Columbia)

Location

Calvin Lab Auditorium

Date

Tuesday, June 11, 2024

Time

9 – 10 a.m. PT

Abstract

AlphaFold2 revolutionized structural biology by accurately predicting protein structures from sequence. Its implementation, however, (i) lacks the code and data required to train models for new tasks, such as predicting alternate protein conformations or antibody structures, (ii) is unoptimized for commercially available computing hardware, making large-scale prediction campaigns impractical, and (iii) remains poorly understood with respect to how training data and regimen influence accuracy. Here we report OpenFold, an optimized and trainable version of AlphaFold2. We train OpenFold from scratch and demonstrate that it fully reproduces AlphaFold2's accuracy. By analyzing OpenFold training, we find new relationships between data size/diversity and prediction accuracy and gain insights into how OpenFold learns to fold proteins during its training process.
