Abstract

Big Data analytical techniques and AI have the potential to transform drug discovery, as they are reshaping other areas of science and technology, but we need to blend biology and chemistry in a format that is amenable to modern machine learning. In this talk, I will present the Chemical Checker (CC), a resource that provides processed, harmonized and integrated bioactivity data on small molecules. The CC divides data into five levels of increasing complexity, ranging from the chemical properties of compounds to their clinical outcomes. In between, it considers targets, off-targets, perturbed biological networks and several cell-based assays such as gene expression, growth inhibition and morphological profiles. In the CC, bioactivity data are expressed in a vector format, which naturally extends the notion of chemical similarity between compounds to similarities between bioactivity signatures of different kinds. We show how CC signatures can boost the performance of drug discovery tasks that typically capitalize on chemical descriptors, including compound library optimization, target identification and anticipation of failures in clinical trials. Moreover, we demonstrate and experimentally validate that CC signatures can be used to reverse and mimic biological signatures of disease models and genetic perturbations, tasks that are impossible using chemical information alone. Indeed, using bioactivity signatures, we have identified small molecules able to revert transcriptional signatures related to Alzheimer's disease in vitro and in vivo, as well as compounds against Snail1, a transcription factor with an essential role in the epithelial-to-mesenchymal transition. These results suggest that our approach might offer a new perspective for finding small molecules able to modulate the activity of undruggable proteins.
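To make the idea of comparing compounds through vector-format bioactivity signatures concrete, here is a minimal sketch. It assumes each compound is represented by a fixed-length numeric vector and uses cosine similarity as the comparison measure; the compound names, the toy vectors and the choice of cosine similarity are illustrative assumptions, not the Chemical Checker's actual data or method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, signatures):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in signatures.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 4-dimensional signatures (toy values, not real CC data).
signatures = {
    "compound_A": [0.9, 0.1, -0.3, 0.5],
    "compound_B": [0.8, 0.2, -0.2, 0.4],
    "compound_C": [-0.7, 0.6, 0.9, -0.1],
}

ranking = rank_by_similarity(signatures["compound_A"], signatures)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting, compounds at the top of the ranking would be candidates for mimicking the query signature, while those with strongly negative similarity would be candidates for reversing it. Which similarity measure is appropriate for a given signature type is a separate design question that this sketch does not address.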

Video Recording