Abstract

Cancer is a disease driven from somatic mutations appearing in an individual’s genome. One of the main challenges in large cancer studies is to identify the handful of driver mutations responsible for cancer progression and development among the hundreds or thousands of mutations present in a tumor genome. Recent approaches have shown that to identify driver mutations it is important to consider mutations in the context of interaction networks. A previously proposed formulation requires to find connected subgraphs of a large protein-protein interaction network that are mutated in a large number of patients, with a subgraph being mutated in a patient if at least one of its genes is mutated in the patient.  We study ILP formulations for the exact solution of the combinatorial problem of finding subnetworks containing mutations in a large fraction of cancer patients. We show that a branch and cut algorithm can solve such problem even faster than previously proposed greedy and approximation algorithms. We tested our algorithm on cancer data and show that our approach is viable and allows for the identification of subnetworks containing previously reported cancer genes.