Journal
GENOME BIOLOGY
Volume 22, Issue 1
Publisher
BMC
DOI: 10.1186/s13059-021-02296-0
Keywords
Splicing; Long-read sequencing; Spliced alignment; RNA-seq; Gene expression; Transcriptome assembly; Machine learning; Nanopore sequencing
Funding
- University of Dundee Global Challenges Research Fund
- H2020 Marie Sklodowska-Curie Actions [799300]
- BBSRC [BB/M010066/1, BB/J00247X/1, BB/M004155/1]
This study introduces a method to improve the accuracy of intron identification in long-read RNA sequencing data. It uses alignment metrics and machine-learning-derived sequence information to filter out spurious splice junctions, then uses the remaining junctions to guide realignment in a two-pass approach. The method improves the accuracy of spliced alignment and transcriptome assembly for species with and without high-quality annotations.
Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
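The filtering step between the two alignment passes can be illustrated with a toy sketch. This is not the 2passtools implementation or its API: the `Junction` type, the `min_reads` threshold, and the canonical GT/AG motif check (used here as a simple stand-in for the machine-learning-derived sequence information) are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Junction(NamedTuple):
    """Hypothetical intron record: 0-based start, end-exclusive."""
    chrom: str
    start: int
    end: int
    strand: str


def count_junctions(read_junctions):
    """Count how many first-pass reads support each junction."""
    counts = Counter()
    for juncs in read_junctions:
        counts.update(set(juncs))  # each read counts once per junction
    return counts


def has_canonical_motif(junc, genome):
    """Check for GT..AG on '+' (CT..AC on '-', the reverse complement)."""
    seq = genome[junc.chrom]
    left = seq[junc.start:junc.start + 2]
    right = seq[junc.end - 2:junc.end]
    if junc.strand == "+":
        return left == "GT" and right == "AG"
    return left == "CT" and right == "AC"


def filter_junctions(counts, genome, min_reads=2):
    """Keep junctions with enough read support and a canonical motif.

    Assumption: a real filter would combine several alignment metrics
    with a trained sequence model rather than two hard rules.
    """
    return {
        j for j, n in counts.items()
        if n >= min_reads and has_canonical_motif(j, genome)
    }


# Toy genome: exon AAAA, intron GTCCCCAG, exon TTTT.
genome = {"chr1": "AAAAGTCCCCAGTTTT"}
real = Junction("chr1", 4, 12, "+")
spurious = Junction("chr1", 3, 12, "+")  # shifted donor, as an alignment error might produce

# First pass: junctions observed in each aligned read.
reads = [[real], [real], [real], [spurious]]
kept = filter_junctions(count_junctions(reads), genome)
```

In a two-pass workflow, the `kept` set would then be supplied to the aligner as guide junctions for the second pass, so that reads which originally supported a spurious junction can be realigned to a nearby well-supported one.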