Journal
INTERVIROLOGY
Volume 53, Issue 5, Pages 310-320
Publisher
KARGER
DOI: 10.1159/000312916
Keywords
Genome evolution; Giant virus; Marseillevirus; Mimivirus; Nucleocytoplasmic large DNA virus; ORFan
Funding
- Centre National de la Recherche Scientifique (CNRS)
Objective: A substantial proportion of coding sequences in genomes, notably viral genomes, match no sequence in the databases and are therefore designated ORFans. Nucleocytoplasmic large DNA viruses (NCLDVs) harbor large numbers of ORFs, many of which are ORFans. We therefore set out to characterize the nature of ORFans in the NCLDVs. Methods: A genome-wide study was carried out to estimate the proportion of ORFans in NCLDV genomes and to compare their general features with those of non-ORFans. Results: ORFans accounted for between 2.8 and 75.2% of the ORF content, depending on the virus lineage. We propose classifying ORFans into four categories according to whether they match metagenomic sequences and to their prevalence at different taxonomic ranks. Our results indicate that NCLDV ORFans have features broadly similar to those of non-ORFans, except that they are shorter. Conclusions: We propose an ORFan classification scheme to help decipher their origin and evolution. Most ORFans were probably labeled as such because of gaps in current knowledge of sequence space. ORFans may be true functional genes, likely with the same expression potential as non-ORFan genes. Some of them may also correspond to new genes formed de novo through diverse mechanisms of gene evolution. Copyright (C) 2010 S. Karger AG, Basel