We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs do not rely on the dominant Fourier-based methods from computational learning theory. Instead, we derive our approximations using techniques related to the classical method of moments.
The main new application is a polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces (i.e., depth-2 neural networks) with respect to any log-concave distribution on R^n (for any constant accuracy parameter). This result was not known even for the case of PAC learning the intersection of two halfspaces.
Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to any distribution that obeys a sub-exponential tail bound (i.e., sub-exponential or sub-gaussian densities).
Joint work with Raghu Meka.