Abstract
We show how to optimally regulate prediction algorithms in a world where (a) high-stakes decisions such as lending, medical testing, or hiring are made by complex "black box" prediction functions, (b) there is an incentive conflict between the agent who designs the prediction function and a principal who oversees the use of the algorithm, and (c) the principal is limited in how much she can learn about the agent's black-box model. We show that limiting agents to prediction functions that are simple enough to be fully transparent is inefficient as long as the bias induced by misalignment between the principal's and the agent's preferences is small relative to the uncertainty about the true state of the world. Algorithmic audits can improve welfare, but the gains depend on the design of the audit tools. Tools that minimize overall information loss, the objective of many post hoc explainer tools, will generally be inefficient because they explain the average behavior of the prediction function rather than those aspects that are most indicative of a misaligned choice. Targeted tools that focus on the source of incentive misalignment (e.g., excess false positives or racial disparities) can provide first-best solutions. We provide empirical support for our theoretical findings using an application in consumer lending.