4.6 Article

Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes

期刊

PLOS ONE
卷 4, 期 9, 页码 -

出版社

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0006978

关键词

-

向作者/读者索取更多资源

The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log(10)-transformed protein-coding gene number (Y') versus log(10)-transformed genome size (X', genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200+22.678X', whereas non-eukaryotes a linear model, Y' = 0.045+0.977X', both with high significance (p<0.001, R-2>.0.91). Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3 x 10(6) kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245 x 10(6) kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据