Abstract
Parallel algorithms are crucial for processing large-scale datasets and solving complex problems efficiently.
As multi-core processors have become the norm, parallelism offers the potential to significantly accelerate computations in areas such as data science, social network analysis, and computational biology. However, achieving high performance and scalability in parallel computing remains challenging, especially on graphs with diverse structures. The inherent complexity of parallel graph algorithms demands innovative approaches to harness the full potential of modern hardware.
This thesis argues that shared-memory parallel algorithms and techniques can solve a wide array of fundamental graph problems and applications with provable efficiency, scaling to larger datasets and to more parallel resources. However, we observe that many existing parallel graph implementations perform worse than the best sequential algorithms on readily available parallel hardware, and frequently run out of memory when processing large graphs due to excessive space overhead.
To address these limitations, we identify and prioritize three properties critical to designing scalable and efficient parallel algorithms: synchronization efficiency, space efficiency, and work efficiency. Our research philosophy, algorithm-system co-design, guides the development of solutions that achieve these goals. The thesis investigates a range of problems, spanning fundamental graph algorithms and data science applications. We implement most of our proposed algorithms and rigorously evaluate them on large-scale real-world datasets, some with billions of edges. Our experimental results consistently show that our algorithms outperform existing solutions in practice and scale robustly as the number of processor cores grows.