Abstract

Data-driven molecular classification of cancer patients for diagnosis, prognosis, or drug response prediction is often challenging because of the high dimensionality of omics data, which leads to suboptimal predictive performance and makes robust biomarkers difficult to identify. A possible strategy to overcome this issue is to replace the input omics data with simpler representations that are more amenable to statistical learning. In this talk I will discuss two recent attempts to represent high-dimensional omics profiles by simpler, rank-based representations: one based on full quantile normalization, where the target distribution is optimized to solve the learning problem, and one based on all pairwise comparisons, which leads to efficient learning with kernel methods. This is joint work with Marina Le Morvan and Yunlong Jiao.
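
As a rough illustration of the two rank-based representations mentioned above, the following Python sketch shows (1) quantile normalization of a single profile to a given target distribution and (2) a kernel built from the signs of all pairwise comparisons. The function names and toy inputs are hypothetical and not taken from the talk. The target distribution is fixed here rather than optimized for the learning problem, and the pairwise features are built explicitly, which is quadratic in the number of features and so does not reflect the efficient computation the abstract refers to.

```python
import numpy as np


def quantile_normalize(x, target):
    """Replace the k-th smallest value of x by the k-th smallest value of target.

    Assumes x and target have the same length; ties in x are broken by position.
    """
    x = np.asarray(x, dtype=float)
    target = np.sort(np.asarray(target, dtype=float))
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each entry of x
    return target[ranks]


def pairwise_comparison_features(x):
    """Return sign(x[i] - x[j]) for every pair i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)
    return np.sign(x[i] - x[j])


def kendall_kernel(x, y):
    """Normalized inner product of the pairwise-comparison features.

    Without ties this equals Kendall's tau correlation between x and y.
    """
    fx = pairwise_comparison_features(x)
    fy = pairwise_comparison_features(y)
    return float(fx @ fy) / len(fx)


# Toy profile with three features (illustrative values only).
profile = [5.0, 1.0, 3.0]
print(quantile_normalize(profile, target=[0.0, 0.5, 1.0]))
print(kendall_kernel(profile, [2.0, 0.1, 0.7]))
```

Both representations depend only on the ordering of the values within a profile, so any strictly increasing transformation of the raw measurements leaves them unchanged.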

Video Recording