Article

Detecting alternatively spliced transcript isoforms from single-molecule long-read sequences without a reference genome

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 17, Issue 6, Pages 1243-1256

Publisher

WILEY
DOI: 10.1111/1755-0998.12670

Keywords

alternative splicing; Amborella trichopoda; pipeline; single-molecule long-read sequencing; transcriptomics

Funding

  1. NSF [IOS-0922742]
  2. University of Florida Department of Biology
  3. University of Florida Genetics Institute
  4. China Scholarship Council (CSC)
  5. Directorate for Biological Sciences
  6. Division of Integrative Organismal Systems [0922742], funding source: National Science Foundation

Abstract

Alternative splicing (AS) is a major source of transcript and proteome diversity, but examining AS in species without well-annotated reference genomes remains difficult. Research on both human and mouse has demonstrated the advantages of using Iso-Seq data for isoform-level transcriptome analysis, including the study of AS and gene fusion. We applied Iso-Seq to investigate AS in Amborella trichopoda, a phylogenetically pivotal species that is sister to all other living angiosperms. Our data show that, compared with RNA-Seq data, the Iso-Seq platform provides better recovery of large transcripts, identification of new gene loci and correction of gene models. Reference-based AS detection with Iso-Seq data identifies AS within a higher fraction of multi-exonic genes than observed for published RNA-Seq analysis (45.8% vs. 37.5%). These data demonstrate that the Iso-Seq approach is useful for detecting AS events. Using the Iso-Seq-defined transcript collection in Amborella as a reference, we further describe a pipeline for detecting AS isoforms from PacBio Iso-Seq data without using a reference sequence (de novo). Results using this pipeline show a 66%-76% overall success rate in identifying AS events. This de novo AS detection pipeline provides a method to accurately characterize and identify bona fide alternatively spliced transcripts in any nonmodel system that lacks a reference genome sequence. Hence, our pipeline has broad potential applications and could benefit the wider biology community.
