Abstract

In this talk we'll start with a high-level overview of the GPU architecture and programming model, and its applicability to several core problems in bioinformatics. We will then present a novel data-parallel algorithm for de Bruijn graph construction on the GPU in the context of short read micro-assembly, which is a critical step of many variant discovery pipelines. Here we specifically focus on the popular Genome Analysis Toolkit (GATK) HaplotypeCaller implementation. Since typically the micro-assembly procedure is executed independently on many separate windows of the genome, a coarse-grained parallel implementation can be easily achieved by executing the original (highly) sequential algorithm on each window in parallel. However, this approach suffers from poor load balancing and is not very well suited for the GPU. On the other hand, our algorithm is a fine-grained parallel implementation of the de Bruijn graph construction across any given number of genome windows.
It relies on data-parallel primitives, such as (segmented) sort, scan, filter, and reduction by key to efficiently populate the graph data structure in the compressed sparse row (CSR) format.