Article, Proceedings Paper

PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions

Journal

BIOINFORMATICS
Volume 27, Issue 13, Pages I275-I282

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btr209

Funding

  1. NHGRI NIH HHS [U54 HG004555-01, U54 HG004555] Funding Source: Medline
  2. Directorate for Biological Sciences
  3. Division of Biological Infrastructure [0644282] Funding Source: National Science Foundation

Motivation: As high-throughput transcriptome sequencing provides evidence for novel transcripts in many species, there is a renewed need for accurate methods to classify small genomic regions as protein coding or non-coding. We present PhyloCSF, a novel comparative genomics method that analyzes a multispecies nucleotide sequence alignment to determine whether it is likely to represent a conserved protein-coding region, based on a formal statistical comparison of phylogenetic codon models.

Results: We show that PhyloCSF's classification performance in 12-species Drosophila genome alignments exceeds that of all other methods we compared in a previous study. We anticipate that this method will be widely applicable as the transcriptomes of many additional species, tissues and subcellular compartments are sequenced, particularly in the context of ENCODE and modENCODE, and as interest grows in long non-coding RNAs, often initially recognized by their lack of protein-coding potential rather than conserved RNA secondary structures.
