Article

Association analysis of the general environmental conditions and prokaryotes' gene distributions in various functional groups

Journal

GENOMICS
Volume 96, Issue 1, Pages 27-38

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ygeno.2010.03.007

Keywords

gene distribution; environmental conditions; statistical association analysis; COG; KEGG; Orthology

The activities of prokaryotes are pivotal in shaping the environment and are, at the same time, greatly influenced by it. Using the genomic data and environmental descriptions of the complete prokaryotic genomes in NCBI's Microbial Genome Project database, and applying statistical methods, we have systematically identified those gene groups whose presence/frequency patterns differ among organisms living under different environmental conditions. Here environmental conditions are characterized in four dimensions (salinity, oxygen requirement, habitat and temperature) and are based on the controlled vocabularies that NCBI's Microbial Genome Project database uses to specify organism information; gene groups are defined as Clusters of Orthologous Groups (COG) and KEGG Orthology (KO) groups. The identified COG and KO groups are considered potentially correlated with certain environmental conditions, and are then mapped to the COG general categories and KEGG pathways to determine which parts of the functional machinery of prokaryotic cells are correlated with the environment. The observations derived from the analysis of the COG and KO groups that are potentially correlated with the oxygen requirement and habitat conditions are in general consistent with existing studies on the properties of organisms living under different conditions of these two environmental factors. To further assess the identified correlations, we have also examined whether the environmental conditions can be predicted from the gene distributions in the selected COG and KO groups. The misclassification rates of the prediction experiments are much smaller than those obtained by random guessing, indicating that correlations exist between organisms' environmental conditions and gene distributions in certain functional groups. However, the rather moderate misclassification rates (the 25th and 75th percentiles of the misclassification rates of all prediction experiments are 16.79% and 24.06%, respectively) also indicate that these correlations are not strong enough for one to decisively define the other. (C) 2010 Elsevier Inc. All rights reserved.
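
The abstract does not name the specific statistical tests or the random-guessing baseline that were used. As an illustration only, the sketch below shows one common way such an analysis could be set up: a Pearson chi-square statistic on a 2x2 table of gene-group presence/absence against a binary environmental condition, and the expected error of a guesser that draws labels in proportion to class frequencies. The function names, the choice of test, and the choice of baseline are assumptions made for this example, not the authors' method.

```python
from collections import Counter


def chi_square_2x2(present_a, absent_a, present_b, absent_b):
    """Pearson chi-square statistic for a 2x2 presence/absence table.

    Rows are two environmental conditions (a, b); columns are the number of
    genomes in which a gene group is present or absent. Hypothetical helper:
    the source does not state which association test was applied.
    Assumes every row and column total is non-zero.
    """
    table = [[present_a, absent_a], [present_b, absent_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of condition and presence.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def random_guess_error(labels):
    """Expected misclassification rate of a guesser that picks each label
    with probability equal to its observed class frequency (one possible
    definition of a random-guessing baseline, assumed here)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())
```

With made-up counts, `chi_square_2x2(30, 10, 10, 30)` measures how unevenly a gene group is distributed across two conditions, and `random_guess_error` gives a reference level against which a classifier's misclassification rate can be compared. A larger statistic, or a classifier error well below the baseline, would point toward an association in the sense the abstract describes.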
