Large Language Models Meet Copyright Law
Does ingesting in-copyright works posted on the open internet as training data for large language models constitute copyright infringement? If courts decide that it does, they have the power to order the destruction of models trained on this data, although such an order is discretionary, not mandatory. For this nascent industry and for researchers, the stakes in resolving this issue could not be greater. Presented by Pamela Samuelson (Berkeley Law) as part of the Simons Institute’s workshop on Large Language Models and Transformers.