Abstract

The advent of large-scale generative artificial intelligence has rapidly changed the landscape of machine learning, and even computer science as a whole. While this is often credited to something special about the transformer architecture, Stella will argue that the transformer has instead served as the vehicle for a paradigm shift in how machine learning processes are developed. She will then discuss how this paradigm shift has created new research questions, sharing results from several of her recent papers on topics ranging from memorization to mechanistic interpretability to evaluation science.

Video Recording