Abstract

 I investigate whether the standard dimensionality reduction technique of PCA inadvertently produces data representations with different fidelity for two different populations. I show on several real-world data sets that PCA has higher reconstruction error on population A than on B (for example, lower- versus higher-educated individuals). This motivates our study of dimensionality reduction techniques which maintain similar fidelity for A and B. We define the notion of Fair PCA, and more generally the multi-criteria dimensionality reduction where we are given multiple objectives that need to be optimized simultaneously. Our main result is an exact polynomial-time algorithm for the two-criterion dimensionality reduction problem when the two criteria are increasing concave functions. Prominent applications of this result are polynomial-time algorithms for Fair PCA and Nash Social Welfare objectives. I conclude with experiments indicating the effectiveness of our algorithms based on extreme point solutions of semi-definite programs on several real-world datasets. Joint work with Tao Tantipongpipat, Mohit Singh, Jamie Morgenstern, and Santosh Vempala.