Abstract

DNA methylation is an effective indicator of epigenetic change and can therefore be used to predict disease onset. Early diagnosis of cancer is clinically essential for a successful cure. It is, however, extremely challenging because early-stage cancers often produce no symptoms, or only mild ones that are easily overlooked, so the most critical window for treatment is frequently missed. Detecting a tumor as early as possible is therefore fundamental to providing proper treatment at the right time. Here we propose a novel method to select the most predictive methylation biomarkers by analyzing DNA methylation patterns in three specimen types: tumor and normal cells from patients, and whole blood from healthy people, which serves as background noise for liquid biopsy.
For this purpose, we developed MethylMarker, which processes bisulfite sequencing reads; identifies differentially methylated regions across samples at single-CpG resolution; extends a single CpG to multiple CpGs to meet assay development criteria; screens out assay candidates that are likely to raise false positives by factoring in the methylation pattern of whole blood; and runs supervised and unsupervised machine learning algorithms to select the most predictive assay candidate CpG loci. The package also provides enhanced visualization, so that the recommended candidates can be confirmed graphically. As such, MethylMarker provides predictive candidate CpG regions that are diverse, sensitive, and robust for developing diagnostic and clinical assays. We applied MethylMarker to tumor and normal tissues from early-stage colorectal cancer patients and to whole-blood bisulfite sequencing data from healthy people. We identified tens of candidate DNA methylation loci, which successfully distinguish tumor from normal with ~95% accuracy.
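
The whole-blood screening step can be illustrated with a minimal sketch. It is not MethylMarker's implementation: the function name `select_candidates`, the thresholds `min_delta` and `max_blood`, the restriction to tumor hypermethylation, and the toy methylation fractions are all assumptions made for illustration. A CpG is kept only if its mean methylation is clearly higher in tumor than in normal tissue and remains low in whole blood, so that background signal from blood is less likely to trigger a false positive.

```python
from statistics import mean


def select_candidates(tumor, normal, blood, min_delta=0.3, max_blood=0.1):
    """Return CpG ids that are hypermethylated in tumor and low in whole blood.

    Each argument maps a CpG id to a list of methylation fractions (0..1),
    one per sample. Thresholds are illustrative assumptions only.
    """
    selected = []
    for cpg in sorted(tumor):
        # Skip loci that are not covered in all three specimen types.
        if cpg not in normal or cpg not in blood:
            continue
        delta = mean(tumor[cpg]) - mean(normal[cpg])
        # Keep loci with a large tumor/normal difference and a quiet blood background.
        if delta >= min_delta and mean(blood[cpg]) <= max_blood:
            selected.append(cpg)
    return selected


# Toy data (hypothetical values, not taken from the study).
tumor = {"chr1:1000": [0.8, 0.9], "chr1:2000": [0.7, 0.8], "chr2:500": [0.2, 0.3]}
normal = {"chr1:1000": [0.1, 0.2], "chr1:2000": [0.1, 0.1], "chr2:500": [0.2, 0.2]}
blood = {"chr1:1000": [0.05, 0.02], "chr1:2000": [0.6, 0.7], "chr2:500": [0.0, 0.1]}

candidates = select_candidates(tumor, normal, blood)
print(candidates)
```

In this toy example, the locus `chr1:2000` differs strongly between tumor and normal but is also highly methylated in blood, so it is screened out; `chr2:500` is dropped because tumor and normal barely differ.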