Abstract

Factorized databases are compressed yet lossless representations of relational data that allow for efficient processing in the compressed domain. They are relational algebra expressions built using the union operator, the Cartesian product operator, and data values. By exploiting the distributivity of product over union, they avoid the redundancy in the tabular representation of relational data. Factorized representations of join results can be computed directly from the input database, in time proportional to their sizes and the input database size. Since their introduction a decade ago, there has been great progress on the theory, systems, and applications of factorized databases. We explain the foundations of factorized databases and give an overview of some of their applications in static and dynamic relational algorithms, probabilistic databases, and in-database machine learning.
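
To make the role of distributivity concrete, the following is a minimal sketch in Python, not the representation used by any particular factorized database system. The class names `Value`, `Union`, and `Product`, the helper functions `size`, `count`, and `enumerate_tuples`, and the toy relation over attributes A and B are all assumptions made for illustration. It contrasts a flat union of tuples with an equivalent factorized expression obtained by factoring out the product.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Value:
    """A single data value for one attribute (hypothetical illustrative type)."""
    attr: str
    value: object


@dataclass(frozen=True)
class Union:
    """Union of sub-expressions over the same attributes."""
    children: tuple


@dataclass(frozen=True)
class Product:
    """Cartesian product of sub-expressions over disjoint attributes."""
    children: tuple


def size(expr):
    """Number of data values stored in the expression."""
    if isinstance(expr, Value):
        return 1
    return sum(size(child) for child in expr.children)


def count(expr):
    """Number of tuples represented, computed without enumerating them."""
    if isinstance(expr, Value):
        return 1
    if isinstance(expr, Union):
        return sum(count(child) for child in expr.children)
    result = 1
    for child in expr.children:
        result *= count(child)
    return result


def enumerate_tuples(expr):
    """Yield every represented tuple as a dict from attribute to value."""
    if isinstance(expr, Value):
        yield {expr.attr: expr.value}
    elif isinstance(expr, Union):
        for child in expr.children:
            yield from enumerate_tuples(child)
    else:
        parts = [list(enumerate_tuples(child)) for child in expr.children]
        for combo in product(*parts):
            merged = {}
            for part in combo:
                merged.update(part)
            yield merged


a1, a2 = Value("A", "a1"), Value("A", "a2")
b1, b2 = Value("B", "b1"), Value("B", "b2")

# Flat (tabular) form: a union of four two-value tuples.
flat = Union((
    Product((a1, b1)), Product((a1, b2)),
    Product((a2, b1)), Product((a2, b2)),
))

# Factorized form: (a1 U a2) x (b1 U b2), by distributivity of product over union.
factorized = Product((Union((a1, a2)), Union((b1, b2))))

print(size(flat), size(factorized))    # stored data values in each form
print(count(flat), count(factorized))  # tuples represented by each form
```

In this toy case both expressions represent the same four tuples, but the flat form stores eight data values and the factorized form stores four. The `count` function also hints at processing in the compressed domain: it computes the number of tuples from the structure of the expression without materializing them.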
