Abstract

I'll discuss a scheme, which I developed while working at OpenAI, for inserting a statistical watermark into the outputs of LLMs. I'll place this in the context of other theoretical and empirical work on LLM watermarking over the past year, as well as other approaches to the AI attribution problem. I'll also touch on the challenges of deployment and the unsolved technical problem of designing a text watermarking method that resists translation, paraphrasing, and similar attacks.

Video Recording