Article

Gene prediction and verification in a compact genome with numerous small introns

Journal

GENOME RESEARCH
Volume 14, Issue 11, Pages 2330-2335

Publisher

COLD SPRING HARBOR LAB PRESS, PUBLICATIONS DEPT
DOI: 10.1101/gr.2816704

Funding

  1. NHGRI NIH HHS [F33 HG002635, F33 HG002653, K22 HG000045, T32 HG000045] Funding Source: Medline
  2. NIAID NIH HHS [R01 AI051209, R01 AI050184, R01 AI049173] Funding Source: Medline
  3. NIGMS NIH HHS [R01 GM066303] Funding Source: Medline

The genomes of clusters of related eukaryotes are now being sequenced at an increasing rate, creating a need for accurate, low-cost annotation of exon-intron structures. In this paper, we demonstrate that reverse transcription-polymerase chain reaction (RT-PCR) and direct sequencing based on predicted gene structures satisfy this need, at least for single-celled eukaryotes. The TWINSCAN gene prediction algorithm was adapted for the fungal pathogen Cryptococcus neoformans by using a precise model of intron lengths in combination with ungapped alignments between the genome sequences of the two closely related Cryptococcus varieties. This approach resulted in approximately 60% of known genes being predicted exactly right at every coding base and splice site. When previously unannotated TWINSCAN predictions were tested by RT-PCR and direct sequencing, 75% of targets spanning two predicted introns were amplified and produced high-quality sequence. When targets spanning the complete predicted open reading frame were tested, 72% of them amplified and produced high-quality sequence. We conclude that sequencing a small number of expressed sequence tags (ESTs) to provide training data, running TWINSCAN on an entire genome, and then performing RT-PCR and direct sequencing on all of its predictions would be a cost-effective method for obtaining an experimentally verified genome annotation.
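
As an illustration of the exact-match criterion mentioned in the abstract (a gene counted as correct only if every coding base and splice site agrees), the following is a minimal Python sketch and not the authors' evaluation code. It assumes each gene is represented as a list of coding exon intervals on a single sequence and strand; the function name `exact_gene_sensitivity` and the toy coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one coding exon as (start, end), inclusive.
Exon = Tuple[int, int]


def exact_gene_sensitivity(
    annotated: Dict[str, List[Exon]],
    predicted: List[List[Exon]],
) -> float:
    """Fraction of annotated genes reproduced exactly by some prediction.

    A prediction matches only if its full set of coding exon boundaries
    is identical to the annotation, which implies that every coding base
    and every splice site agrees. Strand and sequence name are ignored
    here for brevity (an assumption of this sketch).
    """
    if not annotated:
        return 0.0
    # Normalise each predicted gene to a sorted tuple for set lookup.
    predicted_structures = {tuple(sorted(gene)) for gene in predicted}
    hits = sum(
        1
        for exons in annotated.values()
        if tuple(sorted(exons)) in predicted_structures
    )
    return hits / len(annotated)


# Toy data (invented for illustration only).
annotated_genes = {
    "geneA": [(100, 200), (260, 400)],
    "geneB": [(1000, 1100)],
    "geneC": [(2000, 2100), (2150, 2300)],
}
predicted_genes = [
    [(100, 200), (260, 400)],    # exact match to geneA
    [(1000, 1100)],              # exact match to geneB
    [(2000, 2100), (2155, 2300)],  # one splice site off: not a match
]

sensitivity = exact_gene_sensitivity(annotated_genes, predicted_genes)
print(f"Exact-gene sensitivity: {sensitivity:.2f}")
```

In this toy case two of the three annotated genes are matched exactly; the third is rejected because a single acceptor site is shifted by five bases, which shows how strict the criterion is.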
