Cancer is an evolutionary disease, characterized by the accumulation of somatic alterations to the genome that selectively make a cancer cell fitter to survive. Progression models for cancer, i.e., the sequences of mutations that lead to the emergence of the disease, are still poorly understood. The problem of reconstructing such models is not new: several methods for extracting progression models from cross-sectional samples have been developed since the late 1990s.
Over the past two and a half years, we have proposed two novel algorithms, CAPRESE (CAncer PRogression Extraction with Single Edges) and CAPRI (CAncer PRogression Inference), to reconstruct models of the accumulation of mutations that characterizes cancer evolution. To the best of our knowledge, existing techniques are based on either correlation or maximum likelihood. In contrast, we perform the reconstruction by exploiting the notion of probabilistic causation, in the spirit of Suppes’ causality theory. In the context of biological systems and cancer progression, causality can be interpreted as the "selective advantage" conferred by the occurrence of a mutation.
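To make the idea of probabilistic causation concrete, the following is a minimal sketch, not the CAPRESE or CAPRI algorithm itself. It checks two Suppes-style conditions on a binary samples-by-events matrix: temporal priority, approximated here (as an assumption for cross-sectional data) by the cause being more frequent than the effect, and probability raising, P(e | c) > P(e | not c). The function name, event labels, and toy data are hypothetical.

```python
from itertools import permutations


def prima_facie_pairs(samples, events):
    """Return ordered (cause, effect) pairs that satisfy both conditions.

    samples: list of rows, each a list of 0/1 flags (1 = alteration observed).
    events:  list of event labels, one per column.
    """
    n = len(samples)
    pairs = []
    for c, e in permutations(range(len(events)), 2):
        n_c = sum(row[c] for row in samples)               # samples with c
        n_e = sum(row[e] for row in samples)               # samples with e
        n_ce = sum(row[c] and row[e] for row in samples)   # samples with both
        # Conditional probabilities are undefined if c never or always occurs.
        if n_c == 0 or n_c == n:
            continue
        p_e_given_c = n_ce / n_c
        p_e_given_not_c = (n_e - n_ce) / (n - n_c)
        # Assumption: with no time stamps, the more frequent event is taken
        # to occur earlier (temporal priority).
        temporal_priority = n_c > n_e
        probability_raising = p_e_given_c > p_e_given_not_c
        if temporal_priority and probability_raising:
            pairs.append((events[c], events[e]))
    return pairs


# Hypothetical toy data: 6 samples, 3 alterations.
events = ["A", "B", "C"]
samples = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
]
result = prima_facie_pairs(samples, events)
print(result)
```

On this toy matrix the candidate pairs are (A, B), (A, C), and (B, C). Pairs that pass both checks are only candidates for selective-advantage relations; the sketch does not select among them or assess statistical significance.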
In this setting, we prove the correctness of our algorithms and characterize their performance. Finally, we discuss how our R/Bioconductor package TRanslational ONCOlogy (TRONCO) is being used on real cancer datasets, e.g., atypical Chronic Myeloid Leukemia (aCML) and Colorectal Cancer (CRC), and how it highlights potentially biologically significant patterns in the inferred progressions.