Journal
BIOINFORMATICS
Volume 37, Issue 16, Pages 2374-2381
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab116
Funding
- National Institute of Mental Health (NIMH) [R01MH123184, R37MH057881]
This study proposes ESCO, a new single-cell RNA sequencing data simulator that focuses on gene co-expression, and uses it to evaluate how well imputation methods recover gene co-expression networks (GCN). The results show that imputation generally helps GCN recovery when the data are not too sparse, whereas simple data aggregation methods are more suitable when the fraction of zero counts is excessive.
Motivation: Gene-gene co-expression networks (GCN) are of biological interest for the useful information they provide for understanding gene-gene interactions. The advent of single cell RNA-sequencing allows us to examine more subtle gene co-expression occurring within a cell type. Many imputation and denoising methods have been developed to deal with the technical challenges observed in single cell data; meanwhile, several simulators have been developed for benchmarking and assessing these methods. Most of these simulators, however, either do not incorporate gene co-expression or generate co-expression in an inconvenient manner.
Results: Therefore, with the focus on gene co-expression, we propose a new simulator, ESCO, which adopts the idea of the copula to impose gene co-expression, while preserving the highlights of available simulators, which perform well for simulation of gene expression marginally. Using ESCO, we assess the performance of imputation methods on GCN recovery and find that imputation generally helps GCN recovery when the data are not too sparse, and the ensemble imputation method works best among leading methods. In contrast, imputation fails to help in the presence of an excessive fraction of zero counts, where simple data aggregating methods are a better choice. These findings are further verified with mouse and human brain cell data.
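The copula idea mentioned in the abstract can be illustrated with a small sketch. The code below is not ESCO's implementation; it is a minimal NumPy illustration, under stated assumptions, of how a Gaussian copula can impose a chosen gene-gene correlation structure while leaving each gene's marginal count distribution untouched. The function name `simulate_coexpressed_counts`, the negative-binomial marginals, and the parameter values are all hypothetical choices made for this example. Instead of applying inverse CDFs directly, it uses a rank-reordering approximation: independent marginal draws are reordered so that their ranks match those of correlated latent Gaussians.

```python
import numpy as np

def simulate_coexpressed_counts(corr, n_cells, mean=5.0, dispersion=2.0, seed=0):
    """Return a (cells x genes) count matrix with negative-binomial marginals
    whose rank dependence follows a Gaussian copula with correlation `corr`.

    Hypothetical helper for illustration only; not part of ESCO.
    """
    rng = np.random.default_rng(seed)
    n_genes = corr.shape[0]

    # Step 1: latent correlated Gaussians carry the dependence structure.
    z = rng.multivariate_normal(np.zeros(n_genes), corr, size=n_cells)

    # Step 2: independent marginal draws per gene (assumed negative binomial,
    # parameterized so that the expected count equals `mean`).
    p = dispersion / (dispersion + mean)
    counts = rng.negative_binomial(dispersion, p, size=(n_cells, n_genes))

    # Step 3: reorder each gene's counts so their ranks match the ranks of z.
    # The marginals are unchanged; only the pairing across genes changes.
    ranks = z.argsort(axis=0).argsort(axis=0)
    sorted_counts = np.sort(counts, axis=0)
    return np.take_along_axis(sorted_counts, ranks, axis=0)

# Genes 0 and 1 are co-expressed; gene 2 is independent of both.
corr = np.array([[1.0, 0.8, 0.0],
                 [0.8, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
counts = simulate_coexpressed_counts(corr, n_cells=2000)
print(np.corrcoef(counts, rowvar=False).round(2))
```

The observed correlation between count columns is typically somewhat weaker than the latent Gaussian correlation, because discretization and skewed marginals attenuate Pearson correlation. Separating the marginals from the dependence structure in this way is what lets a copula-based simulator reuse existing marginal models while adding co-expression on top.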