Abstract
We present a general problem that we call Functional Aggregate Queries (or FAQs), which as special cases includes frequently asked questions in CSPs, PGMs, Databases, Logic and Matrix Operations. The problem is to compute sums of products of functions over semi-rings. We present a single simple algorithm to solve this general problem that in addition to re-proving a bunch of known results (e.g. our algorithm specialized to computing DFT results in the FFT) also proves new results in counting CSPs with quantification and exact probabilistic inferences in probabilistic graphical model (PGMs).
Our algorithm has its origin in algorithms designed to compute the natural join query. The natural join query is a fundamental operation in relational database systems (RDBMs). It has been optimized for over 40 years in commercial RDBMs but surprisingly, we only recently have an optimal understanding of the worst-case complexity of computing the join query. This talk will present an information theoretic proof of an optimal worst-case bound on output size of any join query using Shearer's lemma (due to Atserias, Grohe and Marx, 2009) and our algorithmic proof of the same result from 2012. Based on joint works with Abo Khamis, Ngo, Nguyen, Porat and Re.