4.7 Article

jSRC: a flexible and accurate joint learning algorithm for clustering of single-cell RNA-sequencing data

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 5, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbaa433

Keywords

single-cell RNA-seq; joint learning; sparse representation; dynamic cell clustering

Funding

  1. National Natural Science Foundation of China (NSFC) [61772394]

The joint sparse representation and clustering (jSRC) algorithm effectively addresses the clustering issues in single-cell RNA sequencing (scRNA-seq) data, providing more accurate results across different cell datasets. By imposing sparse representation on features, it enhances the interpretability of patterns and successfully identifies dynamic cell types associated with the progression of COVID-19.
Single-cell RNA-sequencing (scRNA-seq) profiles the transcriptome at the level of individual cells, which sheds light on the heterogeneity and dynamics of cell populations. Advances in biotechnology make it possible to generate scRNA-seq profiles for large numbers of cells, which requires effective and efficient clustering algorithms to identify cell types and informative genes. Although great effort has been devoted to clustering scRNA-seq data, the accuracy, scalability and interpretability of available algorithms remain unsatisfactory. In this study, we address these problems by developing a joint learning algorithm, joint sparse representation and clustering (jSRC), in which dimension reduction (DR) and clustering are integrated. Specifically, DR provides scalability and joint learning improves accuracy. To increase the interpretability of patterns, we assume that cells of the same type have similar expression patterns and impose sparse representation on features. We formulate clustering of scRNA-seq data as an optimization problem and derive update rules to optimize the jSRC objective. Fifteen scRNA-seq datasets from various tissues and organisms, ranging from 49 to 110 824 single cells, are used to validate the performance of jSRC. The experimental results demonstrate that jSRC significantly outperforms 12 state-of-the-art methods on various measures (an average improvement of 20.29%) with less running time. Furthermore, jSRC is efficient and robust across scRNA-seq datasets from various tissues. Finally, jSRC accurately identifies dynamic cell types associated with the progression of COVID-19. The proposed model and methods provide an effective strategy for analyzing scRNA-seq data (the software is written in MATLAB and is free for academic purposes; https://github.com/xkmaxidian/jSRC).
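
The abstract does not state the jSRC objective or its update rules, so the following is only a rough illustration of the ingredients it names (dimension reduction, sparse representation, clustering) and not the authors' method. It is a minimal NumPy sketch that runs the steps one after another rather than jointly: an uncentered truncated SVD, a non-negative sparse self-representation of cells fitted by proximal gradient steps, and spectral clustering of the resulting affinity. All function names, the parameters `lam`, `gamma`, `d`, the added ridge term, the non-negativity constraint, and the toy data are assumptions made for this example.

```python
import numpy as np

def reduce_dims(X, d):
    """Uncentered truncated SVD: cells x genes -> unit-norm cells x d."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :d] * S[:d]
    return Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)

def sparse_self_representation(Z, lam=0.1, gamma=1.0, n_iter=200):
    """Approximately minimise 0.5*||Z - C Z||_F^2 + 0.5*gamma*||C||_F^2 + lam*sum(C)
    subject to C >= 0 and diag(C) = 0 (assumed objective, not the paper's)."""
    n = Z.shape[0]
    G = Z @ Z.T                                   # cell-cell Gram matrix
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2 + gamma)
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = C @ G - G + gamma * C              # gradient of the smooth part
        C = np.maximum(0.0, C - step * (grad + lam))  # L1 shrinkage + projection
        np.fill_diagonal(C, 0.0)                  # a cell may not represent itself
    return C

def spectral_labels(C, k, n_iter=50):
    """Spectral clustering on the symmetrised coefficients, with a tiny k-means."""
    W = C + C.T
    d_inv = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    L = np.eye(len(W)) - d_inv[:, None] * W * d_inv[None, :]
    _, vecs = np.linalg.eigh(L)                   # eigenvalues in ascending order
    E = vecs[:, :k]
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    centers = [E[0]]                              # deterministic farthest-point init
    for _ in range(1, k):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

# Toy data (hypothetical): 40 cells x 30 genes, two cell types with disjoint markers.
rng = np.random.default_rng(0)
rates = np.full((40, 30), 0.05)
rates[:20, :15] = 10.0
rates[20:, 15:] = 10.0
X = np.log1p(rng.poisson(rates))

Z = reduce_dims(X, d=5)
C = sparse_self_representation(Z)
labels = spectral_labels(C, k=2)
```

In this sketch the L1 penalty suppresses coefficients between cells of different types, while the ridge term (an addition made here for numerical stability) keeps the fit well conditioned when many cells of one type are nearly identical. A joint formulation such as the one the abstract describes would instead optimise the reduction and the clustering in a single objective.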
