Recent whole-exome sequencing studies have identified recurrent somatic mutations in splicing factor genes across multiple cancer types, underscoring the need to characterize splicing alterations globally across human cancers. By integrating mRNA sequencing, whole-exome sequencing, and whole-genome sequencing data from The Cancer Genome Atlas and the International Cancer Genome Consortium, we are identifying RNA splicing alterations across ~10,000 cancer transcriptomes and investigating the somatic mutations that underlie them. To support this analysis, we have further developed JuncBASE, a computational pipeline that identifies and quantifies alternative splicing in RNA-Seq data, including unannotated splicing events.
In initial studies, we identified altered splicing events significantly associated with mutations in the splicing factors U2AF1 and SF3B1 and found that these mutations alter recognition of 3’ splice site sequences. Current work aims to associate transcriptome changes with somatic mutations at splice sites and in proximal intronic regions. Through this work, we have identified somatic mutations associated with expression of oncogenic isoforms of MET and ERBB2. Our findings highlight the importance of including novel splicing events in cancer transcriptome analyses, because somatic mutations can lead to expression of aberrant transcripts.