Article

Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies

Journal

NUCLEIC ACIDS RESEARCH
Volume 31, Issue 19, Pages 5654-5666

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkg770

Funding

  1. NLM NIH HHS [R01-LM06845-04, R01 LM006845] Funding Source: Medline

The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full-length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ~27 000 annotated protein-coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.
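
The abstract does not spell out the assembly procedure, so the following is only a minimal illustrative sketch of the general idea of merging compatible, overlapping spliced alignments. It is not the PASA algorithm. The compatibility rule (identical introns within the shared genomic span), the order-dependent greedy merge, and all names (`introns`, `span`, `compatible`, `merge`, `assemble`) are assumptions made for illustration.

```python
from typing import List, Tuple

Exon = Tuple[int, int]      # inclusive genomic coordinates (start, end)
Alignment = List[Exon]      # exons sorted by start; one strand assumed

def introns(aln: Alignment) -> List[Tuple[int, int]]:
    """Gaps between consecutive exons of one spliced alignment."""
    return [(a[1] + 1, b[0] - 1) for a, b in zip(aln, aln[1:])]

def span(aln: Alignment) -> Tuple[int, int]:
    """Genomic start of the first exon and end of the last exon."""
    return aln[0][0], aln[-1][1]

def compatible(a: Alignment, b: Alignment) -> bool:
    """Assumed rule: the spans overlap and every intron touching the
    shared span is present in both alignments."""
    lo = max(span(a)[0], span(b)[0])
    hi = min(span(a)[1], span(b)[1])
    if lo > hi:
        return False  # no genomic overlap
    def clipped(aln: Alignment):
        return [i for i in introns(aln) if i[1] >= lo and i[0] <= hi]
    return clipped(a) == clipped(b)

def merge(a: Alignment, b: Alignment) -> Alignment:
    """Union of the exons of two compatible alignments."""
    exons = sorted(a + b)
    merged = [exons[0]]
    for s, e in exons[1:]:
        ps, pe = merged[-1]
        if s <= pe:                      # overlapping exons fuse into one
            merged[-1] = (ps, max(pe, e))
        else:
            merged.append((s, e))
    return merged

def assemble(alignments: List[Alignment]) -> List[Alignment]:
    """Greedy sketch: extend every compatible assembly with each
    alignment, or start a new assembly if none is compatible."""
    assemblies: List[Alignment] = []
    for aln in sorted(alignments, key=span):
        hits = [i for i, asm in enumerate(assemblies) if compatible(asm, aln)]
        for i in hits:
            assemblies[i] = merge(assemblies[i], aln)
        if not hits:
            assemblies.append(list(aln))
    return assemblies

# Hypothetical coordinates: est3 uses a different donor site than est1.
est1 = [(100, 200), (300, 400)]
est2 = [(350, 400), (500, 600)]
est3 = [(100, 250), (300, 400)]
result = assemble([est1, est2, est3])
```

In this toy input, `est2` is compatible with both upstream structures, so the sketch yields two assemblies that differ only in the first intron, which is the kind of splicing variation the abstract refers to. Because the merge is greedy and depends on input order, it does not guarantee maximal assemblies in general.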
