Abstract

Cancer progression is an evolutionary process that is characterized by the accumulation of mutations and is responsible for tumor growth, clinical progression, and the development of drug resistance. We discuss how to reconstruct the evolutionary history of a tumor from single-cell sequencing data. This tumor phylogeny problem is challenging because of sequencing errors and the high rate of allelic drop-out in single-cell whole-exome sequencing experiments. We present a probabilistic model and a Markov chain Monte Carlo approach for learning tumor phylogenies from such data. We use the model to develop a statistical test of the infinite sites assumption, which is frequently made in studies of cancer evolution. We find that the infinite sites assumption is often violated by back mutations and sometimes also by parallel mutations.