Training Large Language Models: Practices and Research Questions

Workshop

Special Year on Large Language Models and Transformers, Part 1 Boot Camp

Speaker(s)

Danqi Chen (Princeton University)

Location

Calvin Lab Auditorium

Date

Thursday, Sept. 5, 2024

Time

2 – 3:30 p.m. PT

Abstract

In this tutorial, I will provide a comprehensive walk-through of the pipeline for training large language models, covering both pre-training and post-training phases. My goal is to discuss the best practices known today at each stage of training, including data curation, training algorithms, and safety mitigations, drawing on open models and public research papers. The tutorial aims to serve as a starting point for discussions on the open research questions in training the next generation of large language models.

Attachment

LLM24-BC Slides - Danqi Chen.pdf
