Article

APPRIS principal isoforms and MANE Select transcripts define reference splice variants

Journal

BIOINFORMATICS
Volume 38, Pages ii89-ii94

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btac473

Funding

  1. ECCB2022
  2. National Human Genome Research Institute of the National Institutes of Health [U24HG007234]
  3. Ministry of Science, Innovation and Universities [PGC2018-097019-B-I00]
  4. Carlos III Institute of Health-Fondo de Investigacion Sanitaria [IPT17/0019]
  5. 'la Caixa' Foundation [HR17-00247]

Selecting the splice variant that best represents a coding gene is crucial for experimental analyses and mapping clinically relevant variants. This study compares different methods and finds that APPRIS principal isoforms and MANE Select transcripts are the best choices for selecting the main splice variant.
Motivation: Selecting the splice variant that best represents a coding gene is a crucial first step in many experimental analyses, and vital for mapping clinically relevant variants. This study compares the longest isoforms, MANE Select transcripts, APPRIS principal isoforms, and expression data, and aims to determine which method is best for selecting biologically important reference splice variants for large-scale analyses.

Results: Proteomics analyses and human genetic variation data suggest that most coding genes have a single main protein isoform. We show that APPRIS principal isoforms and MANE Select transcripts best describe these main cellular isoforms, and find that using the longest splice variant as the representative is a poor strategy. Exons unique to the longest splice isoforms are not under selective pressure, and so are unlikely to be functionally relevant. Expression data are also a poor means of selecting the main splice variant. APPRIS principal and MANE Select exons are under purifying selection, while exons specific to alternative transcripts are not. There are MANE and APPRIS representatives for almost 95% of genes, and where they agree they are particularly effective, coinciding with the main proteomics isoform for over 98.2% of genes.
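
As an illustration of how the recommendation in the abstract might be applied in a large-scale analysis, the following is a minimal Python sketch that picks one reference transcript per gene from MANE Select and APPRIS principal annotations instead of from transcript length. It is not the authors' pipeline. The `Transcript` record, its field names, the reason labels, and the example identifiers are all hypothetical. The fallback order when only one annotation is available (MANE Select first, then APPRIS principal) is an assumption, as is the reduction of the APPRIS annotation to a single boolean flag.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical per-transcript record; field names are illustrative."""
    transcript_id: str
    cds_length: int
    is_mane_select: bool = False
    is_appris_principal: bool = False


def pick_reference(transcripts: List[Transcript]) -> Tuple[Optional[Transcript], str]:
    """Choose a reference transcript for one gene.

    Preference order (an assumption, not taken from the paper):
      1. a transcript that is both MANE Select and APPRIS principal
      2. the MANE Select transcript
      3. an APPRIS principal transcript
    Length is deliberately never used as a fallback, because the abstract
    reports that the longest variant is a poor representative.
    """
    # Sort by identifier so ties are resolved deterministically.
    ordered = sorted(transcripts, key=lambda t: t.transcript_id)
    mane = [t for t in ordered if t.is_mane_select]
    appris = [t for t in ordered if t.is_appris_principal]
    agreed = [t for t in mane if t.is_appris_principal]

    if agreed:
        return agreed[0], "mane_and_appris"
    if mane:
        return mane[0], "mane_only"
    if appris:
        return appris[0], "appris_only"
    return None, "unresolved"


# Hypothetical gene with three annotated transcripts.
example_gene = [
    Transcript("tx_long", cds_length=3000),
    Transcript("tx_main", cds_length=1500,
               is_mane_select=True, is_appris_principal=True),
    Transcript("tx_alt", cds_length=900),
]

chosen, reason = pick_reference(example_gene)
longest = max(example_gene, key=lambda t: t.cds_length)
print(chosen.transcript_id, reason, "| longest would be:", longest.transcript_id)
```

In this toy gene the two annotations agree on `tx_main`, while a longest-isoform rule would pick `tx_long`. Genes that come back as `unresolved` are left for manual review rather than resolved by length.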
