Abstract

As we deploy large language models to an increasing range of tasks, we often encounter a severe mismatch between the likelihood-based training objective and our real goal---helping the user get what they want. This mismatch makes our systems less useful and exacerbates ethical and safety concerns. In this talk, I'll frame and motivate the general problem of aligning ML objectives, and describe some recent work at OpenAI which suggests that alignment can have very large practical benefits for modern language models.