Article

2passtools: two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing

Journal

GENOME BIOLOGY
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s13059-021-02296-0

Keywords

Splicing; Long-read sequencing; Spliced alignment; RNA-seq; Gene expression; Transcriptome assembly; Machine learning; Nanopore sequencing

Funding

  1. University of Dundee Global Challenges Research Fund
  2. H2020 Marie Sklodowska-Curie Actions [799300]
  3. BBSRC [BB/M010066/1, BB/J00247X/1, BB/M004155/1]

This study introduces a method to improve the accuracy of intron detection in long-read RNA sequencing. It uses alignment metrics and machine-learning-derived sequence information to filter out spurious splice junctions, then uses the remaining junctions to guide realignment in a two-pass approach. The method improves the accuracy of spliced alignment and transcriptome assembly for species both with and without high-quality annotations.

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
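
The filter-then-realign idea in the abstract can be illustrated with a minimal Python sketch. This is not the 2passtools implementation: the `Junction` record, the `min_reads` threshold, and the fixed list of canonical splice-site motifs are all assumptions made for illustration, and a simple rule stands in for the alignment metrics and machine-learning models the abstract mentions. The sketch keeps only first-pass junctions that pass the rule and formats them as BED-style lines that a second alignment pass could take as guide junctions.

```python
from dataclasses import dataclass

# Assumed set of canonical donor/acceptor dinucleotide pairs (illustrative only).
CANONICAL_MOTIFS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


@dataclass(frozen=True)
class Junction:
    """Hypothetical record for one splice junction seen in first-pass alignments."""
    chrom: str
    start: int        # 0-based position of the first intron base
    end: int          # 0-based, exclusive end of the intron
    strand: str
    read_count: int   # number of first-pass reads supporting the junction
    donor: str        # first two intron bases, in transcript orientation
    acceptor: str     # last two intron bases, in transcript orientation


def keep_junction(j: Junction, min_reads: int = 2) -> bool:
    """Rule-based stand-in for a learned filter: canonical motif plus read support."""
    canonical = (j.donor.upper(), j.acceptor.upper()) in CANONICAL_MOTIFS
    return canonical and j.read_count >= min_reads


def filter_junctions(junctions, min_reads: int = 2):
    """Return the junctions that pass the filter, sorted by position."""
    kept = [j for j in junctions if keep_junction(j, min_reads)]
    return sorted(kept, key=lambda j: (j.chrom, j.start, j.end))


def to_bed_lines(junctions):
    """Format junctions as six-column BED-style lines for use as second-pass guides."""
    return [
        f"{j.chrom}\t{j.start}\t{j.end}\t.\t{j.read_count}\t{j.strand}"
        for j in junctions
    ]


# Toy first-pass junctions (made-up coordinates, not real data).
first_pass = [
    Junction("chr1", 100, 200, "+", 5, "GT", "AG"),  # well supported, canonical
    Junction("chr1", 103, 200, "+", 1, "GT", "AG"),  # single read: likely spurious
    Junction("chr1", 300, 420, "-", 7, "CC", "TT"),  # non-canonical motif
]

guide_junctions = filter_junctions(first_pass)
guide_bed = to_bed_lines(guide_junctions)
```

In this toy input only the first junction survives the filter. In a real two-pass workflow the retained junctions would be passed to the aligner for the second pass, and the filtering decision would come from the trained models rather than a fixed rule.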
