Article; Proceedings Paper

Autoencoder-based cluster ensembles for single-cell RNA-seq data analysis

BMC BIOINFORMATICS (2019)

Journal

BMC BIOINFORMATICS

Volume 20, Issue 1, Pages -

Publisher

BMC

DOI: 10.1186/s12859-019-3179-5

Keywords

Autoencoder; Cluster ensemble; Single cells; scRNA-seq; Single-cell transcriptome; Cell type identification

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Mathematical & Computational Biology

Funding

National Health and Medical Research Council (NHMRC) [1173469]
Australian Research Council [DP170100654, DE170100759]
National Health and Medical Research Council (NHMRC)/Career Development Fellowship [1105271]
Australian Government
Judith and David Coffey Life Lab Gift scholarship

Abstract

Background: Single-cell RNA-sequencing (scRNA-seq) is a transformative technology, allowing global transcriptomes of individual cells to be profiled with high accuracy. An essential task in scRNA-seq data analysis is the identification of cell types from complex samples or tissues profiled in an experiment. To this end, clustering has become a key computational technique for grouping cells based on their transcriptome profiles, enabling subsequent cell type identification from each cluster of cells. Due to the high feature-dimensionality of the transcriptome (i.e. the large number of measured genes in each cell) and because only a small fraction of genes are cell type-specific and therefore informative for generating cell type-specific clusters, clustering directly on the original feature/gene dimension may lead to uninformative clusters and hinder correct cell type identification.

Results: Here, we propose an autoencoder-based cluster ensemble framework in which we first take random subspace projections from the data, then compress each random projection to a low-dimensional space using an autoencoder artificial neural network, and finally apply ensemble clustering across all encoded datasets to generate clusters of cells. We employ four evaluation metrics to benchmark clustering performance and our experiments demonstrate that the proposed autoencoder-based cluster ensemble can lead to substantially improved cell type-specific clusters when applied with both the standard k-means clustering algorithm and a state-of-the-art kernel-based clustering algorithm (SIMLR) designed specifically for scRNA-seq data. Compared to directly using these clustering algorithms on the original datasets, the performance improvement in some cases is up to 100%, depending on the evaluation metric used.

Conclusions: Our results suggest that the proposed framework can facilitate more accurate cell type identification as well as other downstream analyses.
The code for creating the proposed autoencoder-based cluster ensemble framework is freely available from https://github.com/gedcom/scCCESS
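
The three steps named in the abstract (random subspace projection, autoencoder compression, ensemble clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' scCCESS implementation. The function names, the one-hidden-layer tanh autoencoder, the plain k-means routine, and the co-association consensus step are all assumptions chosen for brevity; the published framework may use a different architecture and a different consensus method.

```python
import numpy as np


def encode_random_subspace(X, n_genes, n_latent, rng, epochs=300, lr=0.005):
    """Sample a random gene subset, then compress it with a tiny autoencoder."""
    cols = rng.choice(X.shape[1], size=n_genes, replace=False)
    S = X[:, cols]
    S = (S - S.mean(axis=0)) / (S.std(axis=0) + 1e-8)  # standardise each gene
    n, d = S.shape
    W1 = rng.normal(0.0, 0.1, (d, n_latent))
    b1 = np.zeros(n_latent)
    W2 = rng.normal(0.0, 0.1, (n_latent, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(S @ W1 + b1)          # encoder
        R = H @ W2 + b2                   # linear decoder (reconstruction)
        G = 2.0 * (R - S) / n             # gradient of per-cell mean squared error
        GH = (G @ W2.T) * (1.0 - H ** 2)  # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (S.T @ GH)
        b1 -= lr * GH.sum(axis=0)
    return np.tanh(S @ W1 + b1)           # low-dimensional encoding of the cells


def kmeans(Z, k, rng, iters=50):
    """Plain Lloyd's k-means; empty clusters keep their previous centre."""
    centers = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)


def cluster_ensemble(X, k, n_runs=10, n_genes=None, n_latent=8, seed=0):
    """Cluster each encoded random subspace, then combine the partitions.

    Consensus step (an assumption of this sketch): build a co-association
    matrix, i.e. the fraction of runs in which two cells share a cluster,
    and run k-means on its rows.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    if n_genes is None:
        n_genes = min(d, max(n_latent, d // 2))
    coassoc = np.zeros((n, n))
    for _ in range(n_runs):
        Z = encode_random_subspace(X, n_genes, n_latent, rng)
        labels = kmeans(Z, k, rng)
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= n_runs
    return kmeans(coassoc, k, rng), coassoc


# Toy usage on synthetic data: 60 "cells", 40 "genes", 3 groups,
# with only the first 10 genes carrying group signal.
demo_rng = np.random.default_rng(1)
groups = np.repeat(np.arange(3), 20)
expr = demo_rng.normal(size=(60, 40))
expr[:, :10] += 4.0 * groups[:, None]
consensus_labels, coassoc = cluster_ensemble(expr, k=3, n_runs=4)
print(consensus_labels.shape, coassoc.shape)
```

Because each run sees a different random gene subset, individual partitions can disagree; the co-association matrix averages over those disagreements, which is the intuition behind using an ensemble rather than a single clustering of the full gene space.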
