Abstract

When can we simulate a lexicographically-sorted array of answers to a conjunctive query with grouping and aggregation with near-optimal time guarantees? (that is, quasilinear preprocessing and logarithmic time per access.) For most common aggregate functions (e.g., min, max, count, sum), such a query can be phrased as an ordinary conjunctive query over a database annotated with a commutative semiring. As a consequence, we show that the known classification for conjunctive queries without aggregation continues to hold with aggregation as long as the aggregation is not part of the lexicographic order. On the other hand, we show infeasibility for the case of count-distinct that does not have an efficient representation as a commutative semiring. We then investigate the ability to include the aggregation (or annotation) in the lexicographic order. Among the hardness results, standing out as tractable is the case of a semiring with an idempotent addition, such as those of min and max. The algorithm for this case relies on the fact that the annotated database used to compute the aggregation has non-trivial annotations only in one relation. This talk is based on a paper that will be presented in ICDT 2024 (https://arxiv.org/pdf/2303.05327.pdf).

Attachment

Video Recording