Abstract

Each cell type in a solid tissue has a characteristic transcriptome and spatial arrangement, both of which are observable using modern spatial omics assays. Surprisingly, however, spatial information is frequently ignored when clustering cells to identify cell types and states. In fact, spatial location is typically considered only when solving the related, but distinct, problem of demarcating tissue domains (which may contain multiple cell types). We present BANKSY, an algorithm that unifies cell type clustering and domain segmentation by constructing a product space of cell and neighborhood transcriptomes, representing cell state and microenvironment, respectively. BANKSY's spatial kernel-based feature augmentation strategy improves performance on both tasks when tested on diverse FISH- and sequencing-based spatial omics datasets. Uniquely, BANKSY identified hitherto undetected niche-dependent cell states in mouse brain. We also show that quality control of spatial omics data can be formulated as a domain identification problem and solved using BANKSY. Lastly, BANKSY is orders of magnitude faster and more scalable than existing spatial clustering methods, and is thus capable of processing the large datasets generated by emerging spatial technologies. In summary, BANKSY represents an accurate, biologically motivated, scalable, and versatile framework for analyzing spatial omics data.
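The idea of a product space of cell and neighborhood transcriptomes can be illustrated with a minimal sketch. This is not the BANKSY implementation: the function name `augment_with_neighborhood`, the Gaussian kernel, the bandwidth choice, and the parameters `k` and `lam` (with its square-root weighting) are all assumptions made for illustration. The sketch concatenates each cell's own expression vector with a kernel-weighted average of its nearest spatial neighbors, so that any standard clustering method applied to the result sees both cell state and microenvironment.

```python
import numpy as np

def augment_with_neighborhood(expr, coords, k=4, lam=0.2):
    """Concatenate own expression with a kernel-weighted neighbor average.

    Illustrative only; all names and defaults here are hypothetical.
    expr   : (n_cells, n_genes) expression matrix
    coords : (n_cells, 2) spatial coordinates
    k      : number of nearest spatial neighbors to average (k < n_cells)
    lam    : mixing weight in [0, 1]; 0 ignores the neighborhood entirely
    """
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Brute-force pairwise Euclidean distances (fine for a small example).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor

    # Indices and distances of the k nearest neighbors of each cell.
    nbr_idx = np.argsort(dist, axis=1)[:, :k]
    nbr_dist = np.take_along_axis(dist, nbr_idx, axis=1)

    # Gaussian spatial kernel; bandwidth set to the median neighbor
    # distance (an assumed, arbitrary choice), normalized per cell.
    sigma = max(np.median(nbr_dist), 1e-12)
    weights = np.exp(-0.5 * (nbr_dist / sigma) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted mean expression over each cell's neighborhood.
    nbr_mean = np.einsum("nk,nkg->ng", weights, expr[nbr_idx])

    # Product space: [own expression | neighborhood expression].
    return np.hstack([np.sqrt(1 - lam) * expr, np.sqrt(lam) * nbr_mean])

# Usage on synthetic data: 50 cells, 20 genes.
rng = np.random.default_rng(0)
expr = rng.poisson(2.0, size=(50, 20)).astype(float)
coords = rng.uniform(0, 100, size=(50, 2))
features = augment_with_neighborhood(expr, coords, k=6, lam=0.2)
```

Under these assumptions, a small `lam` keeps the features dominated by each cell's own transcriptome, which suits cell type clustering, while a larger `lam` emphasizes the neighborhood and pushes the clustering toward tissue domains.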

Video Recording