4.6 Article

Detection and inference of interspersed duplicated insertions from paired-end reads

Journal

DIGITAL SIGNAL PROCESSING
Volume 111

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.dsp.2020.102959

Keywords

Interspersed duplicated insertions; Next-generation sequencing; Paired-end reads; Insertion contents; Dynamic process

Funding

  1. Natural Science Foundation of China [61571341]

Interspersed duplicated insertion (idINS) is a common type of genomic insertion that plays a significant role in genomic instability and cancer genesis. The novel algorithm DIPins accurately detects and infers idINS contents from paired-end reads, even when the variation exceeds the insert size. DIPins shows advantages over existing methods and has potential for accurate characterization of idINSs in the human genome.
Interspersed duplicated insertion (idINS) is a common type of genomic insertion and plays an important role in genomic instability in cancer genesis. Nevertheless, detecting this type of insertion is challenging, since reads originating from idINS regions in the donor sample are likely to map perfectly to other regions of the reference. Most existing approaches use paired-end mapping to detect idINSs, but characterizing idINSs longer than the mean insert size remains difficult because sequencing reads are short. There is therefore still a need for practical algorithms that detect and infer idINSs regardless of their length. Here, we present a new algorithm, called DIPins, which can accurately detect idINSs and infer their contents from paired-end reads. DIPins can detect breakpoint positions and infer idINS contents even when the length of the variation exceeds the paired-end insert size. Its main principle is twofold: it extracts multiple signatures from split reads and integrates them to determine idINS positions, and it adopts a dynamic process that constructs idINS contents by iteratively generating unobserved split reads from the restricted area around the idINS breakpoint. We test the performance of DIPins on both simulated and real data. The results demonstrate its advantages over other methods and its potential for accurate characterization of idINSs in the human genome. © 2021 Elsevier Inc. All rights reserved.
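
The abstract does not give the details of the dynamic process, so the following is only a toy illustration of iterative content construction in general, not the DIPins procedure. It assumes a seed sequence (for example, the clipped part of a split read at a breakpoint) and uses a generic greedy overlap extension as a stand-in. The function name `extend_from_seed` and the parameters `min_overlap` and `max_rounds` are hypothetical.

```python
def extend_from_seed(seed, reads, min_overlap=4, max_rounds=100):
    """Greedily extend `seed` to the right with reads overlapping its suffix.

    Generic greedy overlap extension, used here as an illustrative stand-in;
    it is NOT the DIPins algorithm. All names are hypothetical.
    """
    contig = seed
    for _ in range(max_rounds):
        best = None  # (overlap length, extension) of the best read this round
        for read in reads:
            # Leave at least one base of the read to serve as new sequence.
            max_k = min(len(read) - 1, len(contig))
            # Try the longest overlap first and stop at the first match.
            for k in range(max_k, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    if best is None or k > best[0]:
                        best = (k, read[k:])
                    break
        if best is None:
            break  # no read overlaps the current end; stop extending
        contig += best[1]
    return contig


if __name__ == "__main__":
    # Toy reads tiling a made-up inserted sequence.
    reads = ["GTTGCAAG", "GCAAGGCT", "AGGCTA"]
    print(extend_from_seed("ACGTTG", reads))
```

In this toy setting each round appends the unmatched tail of the read with the longest suffix overlap, so the sequence grows step by step beyond the length of any single read. A real caller would also need to handle sequencing errors, repeats, and both strands.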
