
Abstract
Size (or length) generalization is a key challenge in designing neural modules to perform algorithmic tasks: when can a neural model of bounded complexity generalize to problem instances of arbitrary size? Size generalization is a special case of out-of-distribution (OOD) generalization. In this talk, I will focus on approaches that achieve size generalization by "aligning" neural models with certain algorithmic structures. I will first present a theoretical result showing the benefit of algorithmic alignment: a combination of algorithmic alignment, sparsity regularization, and a carefully curated small training set enables provable size/length generalization when approximating the Bellman-Ford algorithm on arbitrary graphs. I will then present examples of designing practical and efficient neural models for a family of geometric optimization problems via algorithmic alignment. This talk is based on joint work with Riley Nerem, Samantha Chen, Sanjoy Dasgupta, and Anastasios Sidiropoulos.
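
To give a rough sense of what "algorithmic alignment" means here, the sketch below (illustrative only, not taken from the talk) compares one Bellman-Ford relaxation step with a message-passing update whose structure mirrors it: the learned part is restricted to the per-edge message, while the min-aggregation of the algorithm is kept. The function names and the stand-in "mlp" are hypothetical.

    # Illustrative sketch (assumed, not the speaker's architecture): one
    # Bellman-Ford relaxation step vs. an "aligned" message-passing step.
    import numpy as np

    def bellman_ford_step(dist, edges):
        """One relaxation: dist[v] <- min(dist[v], min over edges (u,v,w) of dist[u] + w)."""
        new_dist = dist.copy()
        for u, v, w in edges:
            new_dist[v] = min(new_dist[v], dist[u] + w)
        return new_dist

    def aligned_message_passing_step(dist, edges, mlp):
        """Neural analogue: replace 'dist[u] + w' with a learned message mlp([dist[u], w]),
        but keep the same min-aggregation the algorithm uses."""
        new_dist = dist.copy()
        for u, v, w in edges:
            msg = mlp(np.array([dist[u], w]))     # learned per-edge message
            new_dist[v] = min(new_dist[v], msg)   # aggregation shared with Bellman-Ford
        return new_dist

    # Toy check: if the learned message equals dist[u] + w, the two steps coincide.
    edges = [(0, 1, 2.0), (1, 2, 1.0), (0, 2, 5.0)]
    dist = np.array([0.0, np.inf, np.inf])
    identity_mlp = lambda x: x[0] + x[1]          # stands in for a trained network
    print(bellman_ford_step(dist, edges))
    print(aligned_message_passing_step(dist, edges, identity_mlp))

The design choice this illustrates is that the network only has to learn the local edge update, while the aggregation step that drives the algorithm's behavior is built in, which is the kind of structural match the talk refers to as alignment.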