Article

Direct mapping and alignment of protein sequences onto genomic sequence

Journal

BIOINFORMATICS
Volume 24, Issue 21, Pages 2438-2444

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btn460

Funding

  1. Ministry of Education, Culture, Sports, Science and Technology of Japan [18017017, 20017018]
  2. Grants-in-Aid for Scientific Research [18017017, 20017018] Funding Source: KAKEN

Motivation: Finding protein-coding genes in a newly determined genomic sequence is the first step toward understanding the content written in the genome. Sequences of transcripts of homologous genes, if available, can considerably improve the accuracy of predicting genes and their structures, compared with prediction without such knowledge. As protein sequences are generally better conserved than nucleotide sequences, remote homologs can be used as templates, extending the applicability of evidence-based gene recognition methods. However, no tool seems to have been developed so far to simultaneously map and align a number of protein sequences onto a mammalian-sized genomic sequence.

Results: We have extended our computer program Spaln to accept protein sequences, as well as cDNA sequences, as queries. When the query and the target sequences are reasonably similar, e.g. between mammalian orthologs, Spaln runs one to two orders of magnitude faster than conventional approaches that rely on a Blast search followed by dynamic-programming-based spliced alignment. Exon-level and gene-level accuracies of Spaln are significantly higher than those obtained by the best available methods of the same type, particularly when the query and the target are distantly related.
