Article

How Many Genes Are Expressed in a Transcriptome? Estimation and Results for RNA-Seq

Journal

PLOS ONE
Volume 10, Issue 6

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0130262

Funding

  1. Consejo Nacional de Ciencia y Tecnologia [165778, 34686]

RNA-seq experiments estimate the number of genes expressed in a transcriptome as well as their relative frequencies. However, an undetermined number of genes can remain undetected due to their low expression relative to the sample size (sequence depth). Estimation of the true number of genes expressed in a transcriptome is essential in order to determine which genes are exclusively expressed in specific tissues or under particular conditions. A reliable estimate of the true number of expressed genes is also required to accurately measure transcriptome changes and to predict the sequencing depth needed to increase the proportion of detected genes. This problem is analogous to ecological sampling problems such as estimating the number of species at a given site. Here we present a non-parametric estimator for the number of undetected genes as well as for the extra sample size needed to detect a given proportion of the undetected genes. Our estimators are superior to ones already published by having smaller standard errors and biases. We applied our method to a set of 32 publicly available RNA-seq experiments, including the evaluation of 311 individually sequenced libraries. We found that in the majority of the cases more than one thousand genes are undetected, and that on average approximately 6% of the expressed genes per accession remain undetected. This figure increases to approximately 10% if individual sequencing libraries are analyzed. Our method is also applicable to metagenomic experiments. Using our method, the number of undetected genes as well as the sample size needed to detect them can be calculated, leading to more accurate and complete gene expression studies.
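The abstract's analogy to species-richness estimation can be made concrete with a small sketch. The code below uses the classic bias-corrected Chao1 estimator from ecology, which is not the estimator proposed in the paper; it is swapped in here only to illustrate how a non-parametric estimate of undetected genes can be derived from the number of genes seen exactly once and exactly twice. The function name `chao1_undetected` and the read counts are hypothetical.

```python
from collections import Counter


def chao1_undetected(counts):
    """Bias-corrected Chao1 estimate of the number of undetected genes.

    counts: iterable of per-gene read counts (zeros are ignored).
    NOTE: this is the standard Chao1 estimator, used as an illustration;
    it is NOT the estimator introduced in the paper.
    """
    freq = Counter(c for c in counts if c > 0)
    f1 = freq.get(1, 0)  # genes observed exactly once (singletons)
    f2 = freq.get(2, 0)  # genes observed exactly twice (doubletons)
    # Many singletons relative to doubletons suggest many unseen genes.
    return f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical per-gene read counts for one small library.
read_counts = [120, 45, 1, 1, 1, 1, 2, 2, 0, 7, 1, 3]

detected = sum(1 for c in read_counts if c > 0)
undetected = chao1_undetected(read_counts)
total_estimate = detected + undetected

print(f"detected genes:        {detected}")
print(f"estimated undetected:  {undetected:.2f}")
print(f"estimated total:       {total_estimate:.2f}")
print(f"fraction undetected:   {undetected / total_estimate:.1%}")
```

In this toy input, five genes are seen once and two are seen twice, so the sketch estimates roughly three additional undetected genes. The fraction it reports is a property of the made-up counts and should not be compared with the 6% and 10% figures in the abstract.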

