Article

Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly

Journal

GENOME BIOLOGY
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s13059-020-02244-4

Ratatosk is a method that corrects long reads using short-read data, reducing the long-read error rate 6-fold on average and significantly improving the accuracy of SNP and indel calls. An assembly of Ratatosk-corrected reads from one individual had a higher contig N50 and fewer misassemblies than an assembly of PacBio HiFi reads.
A major challenge with long-read sequencing data is its high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel calling accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than a PacBio HiFi reads assembly.
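
For readers unfamiliar with the contig N50 statistic quoted in the abstract, the following is a minimal sketch of how it is commonly computed: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `contig_n50` and the toy lengths are illustrative assumptions and are not taken from the paper or from the Ratatosk software.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration only; not part of Ratatosk.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length


# Toy contig lengths (made up for this example).
toy_lengths = [80, 70, 50, 40, 30, 20]
n50 = contig_n50(toy_lengths)
print(n50)
```

A higher N50 means that half of the assembled bases sit in longer contigs, which is why the statistic is used as a measure of assembly contiguity.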
