Abstract

Rapid advances in high-throughput sequencing (HTS) and mass spectrometry (MS) technologies have enabled the acquisition of genomic, transcriptomic and proteomic data from the same tissue sample. We recently developed a computational framework that, for the first time, can integratively analyze all three types of omics data to obtain a complete molecular profile of the same tissue in normal vs. disease conditions. The framework includes a computational method that identifies micro structural variants (microSVs) by jointly analyzing matched whole genome sequencing (WGS) and RNA-Seq data. Coupled with deFuse, our gene fusion detection method, it can provide an accurate profile of structurally aberrant transcripts, which are commonly observed in tumor samples. Given the genomic breakpoints, the framework can then identify all relevant peptides that span the breakpoint junctions and match them with unique proteomic signatures in the corresponding proteomics data sets.
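To make the last step concrete, the following is a minimal sketch of how junction-spanning peptides could be enumerated from an aberrant transcript. It is not the framework's implementation: the function names (`translate`, `junction_peptides`), the fixed reading frame, and the simple sliding-window enumeration (instead of, for example, an in-silico enzymatic digest) are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a DNA sequence in the given frame; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def junction_peptides(left, right, frame=0, min_len=6, max_len=12):
    """Return peptides whose coding bases come from both sides of a breakpoint.

    `left` and `right` are the (hypothetical) transcript sequences on either
    side of the junction. A peptide qualifies if it contains no stop codon and
    its codons cover at least one base of `left` and one base of `right`.
    """
    seq = (left + right).upper()
    protein = translate(seq, frame)
    bp = len(left)  # index of the first base contributed by `right`
    peptides = set()
    for start in range(len(protein)):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(protein):
                break
            nt_start = frame + 3 * start  # first base of the peptide
            nt_end = frame + 3 * end      # one past the last base
            candidate = protein[start:end]
            if nt_start < bp < nt_end and "*" not in candidate:
                peptides.add(candidate)
    return peptides


if __name__ == "__main__":
    # Toy junction: "ATGGCC" (M, A) joined to "AAAGGG" (K, G).
    print(sorted(junction_peptides("ATGGCC", "AAAGGG", min_len=2, max_len=3)))
```

In practice the candidate peptides would then be searched against the MS spectra; that matching step is outside the scope of this sketch.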
 
When used together with CITUP and CTP-single, our WGS-based clonal composition inference methods, our computational framework can help identify clone-specific expressed structural alterations in a given tumor sample.
We perform further systemic analysis of such expressed variants with HIT'nDRIVE, our combinatorial method for identifying (structurally) aberrant genes that can collectively influence possibly distant ``outlier'' genes, based on what we call the ``random-walk facility location'' (RWFL) problem on a protein or gene interaction network. These influential structurally altered genes have been shown to play prominent roles in tumor evolution as potential drivers of the observed cancer phenotype.
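
The RWFL formulation itself is not spelled out in this abstract. As a rough illustration of the random-walk ingredient only, the sketch below computes expected hitting times on a toy interaction network, assuming (for illustration) that the "influence" of one gene on another is measured by how quickly a random walk starting at the first reaches the second. The function name `hitting_times` and the toy graph are hypothetical, and the facility-location optimization over candidate driver genes is not shown.

```python
def hitting_times(adj, target):
    """Expected number of random-walk steps to reach `target` from each node.

    `adj` maps every node to a list of its neighbours (undirected, connected
    graph assumed). Solves h(u) = 1 + mean(h(v) for v in adj[u]) with
    h(target) = 0 by Gauss-Jordan elimination.
    """
    nodes = [n for n in adj if n != target]
    idx = {n: i for i, n in enumerate(nodes)}
    k = len(nodes)

    # Augmented matrix for (I - Q) h = 1, where Q is the transition matrix
    # restricted to the non-target nodes.
    A = [[0.0] * (k + 1) for _ in range(k)]
    for u in nodes:
        i = idx[u]
        A[i][i] += 1.0
        A[i][k] = 1.0
        deg = len(adj[u])
        for v in adj[u]:
            if v != target:
                A[i][idx[v]] -= 1.0 / deg

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(k):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]

    h = {n: A[idx[n]][k] for n in nodes}
    h[target] = 0.0
    return h


if __name__ == "__main__":
    # Toy path network: geneA - geneB - geneC, with geneC as the "outlier".
    network = {"geneA": ["geneB"], "geneB": ["geneA", "geneC"], "geneC": ["geneB"]}
    print(hitting_times(network, "geneC"))
```

Under this assumption, a smaller hitting time from an altered gene to an outlier gene would indicate stronger potential influence; selecting a small set of altered genes that jointly "cover" the outliers is the combinatorial part that this sketch leaves out.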