Article

Identification of protein coding regions in RNA transcripts

NUCLEIC ACIDS RESEARCH (2015)

Journal

NUCLEIC ACIDS RESEARCH

Volume 43, Issue 12, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/nar/gkv227

Category

Biochemistry & Molecular Biology

Funding

National Institutes of Health (NIH) [HG000783]
Oxford University Press

Abstract

Massively parallel sequencing of RNA transcripts by next-generation technology (RNA-Seq) generates critically important data for eukaryotic gene discovery. Gene finding in transcripts can be done by statistical (alignment-free) as well as by alignment-based methods. We describe a new tool, GeneMarkS-T, for ab initio identification of protein-coding regions in RNA transcripts. The algorithm parameters are estimated by unsupervised training, which makes manual curation of training sets unnecessary. We demonstrate that (i) the unsupervised training is robust with respect to the presence of transcript assembly errors and (ii) the accuracy of GeneMarkS-T in identifying protein-coding regions and, particularly, in predicting translation initiation sites in modelled as well as in assembled transcripts compares favourably to other existing methods.
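
To make the task concrete, here is a minimal sketch of the simplest possible alignment-free baseline: reporting the longest ATG-to-stop open reading frame on the forward strand of a transcript. This is not the GeneMarkS-T algorithm, which the abstract describes as relying on statistical models with unsupervised parameter estimation; the function name `longest_orf` and the example sequence are hypothetical and serve only to illustrate what "identifying a protein-coding region in a transcript" means as input and output.

```python
# Naive longest-ORF baseline (illustrative assumption, NOT GeneMarkS-T).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(transcript):
    """Return (start, end) of the longest ATG-to-stop ORF, or None.

    Coordinates are 0-based, end-exclusive, on the forward strand only.
    The stop codon is included in the reported interval.
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    best = None
    for frame in range(3):
        start = None  # position of the first ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                start = None  # resume the search after this stop codon
    return best


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # hypothetical toy transcript
    print(longest_orf(example))
```

A baseline of this kind always picks the first in-frame ATG and ignores sequence composition, so it cannot distinguish a true translation initiation site from an upstream in-frame ATG, and it cannot handle transcripts truncated by assembly errors. Those are the cases where, according to the abstract, a trained statistical model is expected to help.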
