Article

Multifractal and correlation analyses of protein sequences from complete genomes

Journal

PHYSICAL REVIEW E
Volume 68, Issue 2, Pages -

Publisher

AMER PHYSICAL SOC
DOI: 10.1103/PhysRevE.68.021913

A measure representation of protein sequences, similar to the measure representation of DNA sequences proposed in our previous paper [Yu et al., Phys. Rev. E 64, 031903 (2001)], and another induced measure are introduced. Multifractal analysis is then performed on these two kinds of measures for a large number of protein sequences derived from the corresponding complete genomes. From the values of the D_q (generalized dimension) spectra and the related C_q (analogous specific heat) curves, it is concluded that these protein sequences are not completely random. For substrings of length K=5, the D_q spectra of all organisms studied are multifractal-like and sufficiently smooth for the C_q curves to be meaningful. The C_q curves of all bacteria resemble a classical phase transition at a critical point, whereas the analogous phase transitions of the higher organisms studied have the shape of a double-peaked specific heat function. The multifractal property alone, however, is not sufficient for the classification problem. When the measure representations of protein sequences from complete genomes are treated as time series, a method based on correlation analysis, applied after removing some memory from the time series, is proposed to construct a phylogenetic tree. This construction is shown to be reasonably satisfactory.
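
The abstract does not spell out how the generalized dimensions and the analogous specific heat are computed. The following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes the standard box-counting definitions: a partition sum Z_eps(q) taken over boxes of nonzero measure, tau(q) estimated as the slope of ln Z_eps(q) against ln eps, D_q = tau(q)/(q-1), and C_q approximated by 2*tau(q) - tau(q+1) - tau(q-1). The amino-acid ordering, the box sizes eps = 20^(-j), and all function names are illustrative choices.

```python
import math

# Assumed alphabetical ordering of the 20 amino-acid letters (illustrative).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kstring_measure(seq, k):
    """Frequencies of all 20**k K-strings, indexed by their base-20 value."""
    index = {a: i for i, a in enumerate(AMINO_ACIDS)}
    counts = [0] * (20 ** k)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in index for c in word):  # skip words with unknown letters
            pos = 0
            for c in word:
                pos = pos * 20 + index[c]
            counts[pos] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid K-strings")
    return [c / total for c in counts]


def _box_measures(mu, k, j):
    """Coarse-grain mu into 20**j boxes of width eps = 20**(-j)."""
    size = 20 ** (k - j)
    return [sum(mu[b:b + size]) for b in range(0, len(mu), size)]


def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def tau(mu, k, q):
    """Estimate tau(q) as the slope of ln Z_eps(q) against ln eps (needs k >= 2)."""
    xs, ys = [], []
    for j in range(1, k + 1):
        boxes = [m for m in _box_measures(mu, k, j) if m > 0]
        xs.append(-j * math.log(20))
        ys.append(math.log(sum(m ** q for m in boxes)))
    return _slope(xs, ys)


def generalized_dimension(mu, k, q):
    """D_q = tau(q)/(q-1); for q = 1 use the slope of sum(m ln m) against ln eps."""
    if abs(q - 1.0) < 1e-12:
        xs, ys = [], []
        for j in range(1, k + 1):
            boxes = [m for m in _box_measures(mu, k, j) if m > 0]
            xs.append(-j * math.log(20))
            ys.append(sum(m * math.log(m) for m in boxes))
        return _slope(xs, ys)
    return tau(mu, k, q) / (q - 1.0)


def specific_heat(mu, k, q):
    """Finite-difference approximation of C_q = -d^2 tau / dq^2."""
    return 2 * tau(mu, k, q) - tau(mu, k, q + 1) - tau(mu, k, q - 1)
```

For a uniform measure, which is what a completely random sequence approaches, this sketch gives D_q = 1 for every q and C_q = 0, so departures from those values are what signal non-random structure. Note that K=5 requires a list of 20^5 = 3,200,000 entries, which is feasible but makes a NumPy array the more practical container.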
