4.7 Article

Distinguishing Proteins From Arbitrary Amino Acid Sequences

Journal

SCIENTIFIC REPORTS
Volume 5, Issue -, Pages -

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/srep07972

Keywords

-

Funding

  1. USA Natural Science Foundation [DMS-1120824]
  2. National Natural Sciences Foundation of China [31271408]
  3. Tsinghua University start up fund
  4. Tsinghua University independent research project grant

What kinds of amino acid sequences could possibly be protein sequences? Across all the existing databases that we could find, known proteins make up only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Moreover, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate that our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe.
