Alignment Problems in AI Governance


In an event last month comprising short talks and dialogue, Simons Institute Law and Society fellows Rui-Jie Yew and Greg Demirchyan explored key challenges of alignment in AI governance.

First, we currently lack a thorough understanding of AI models, making it difficult to identify the risks they pose. Moreover, existing auditing tools may not be reliable enough to offer assurances of model safety and alignment. Even when explanations appear compelling, establishing their faithfulness remains difficult, especially at scale. These technical challenges create regulatory difficulties for current governance proposals and can expose a gap between proposed interventions and what is technically feasible in pursuit of their regulatory objectives.

Second, a further challenge lies in the unintended effects of how AI systems might be designed, deployed, and framed to minimize regulatory costs. While methods in the safety toolkit, such as privacy-preserving technologies and AI evaluations, are framed as safety enhancing, they simultaneously shape the terms of regulatory oversight for AI systems and can be developed and deployed in ways misaligned with the goals of regulation.

Yet despite these challenges, it is critical to continue developing governance mechanisms. The speakers discussed approaches to governance that attempt to reduce these tensions by (1) designing regulatory systems that remain adaptable as our understanding of this transformative technology improves and (2) presenting steps toward robust oversight of AI systems.
