Article

Accurate in silico confirmation of rare copy number variant calls from exome sequencing data using transfer learning

Journal

NUCLEIC ACIDS RESEARCH
Volume 50, Issue 21

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkac788

Funding

  1. National Institutes of Health [R01GM120609, R03HL147197]
  2. Simons Foundation [SIMONS606450]

Exome sequencing is widely used in genetic studies and clinical diagnosis, but the data are noisy and no existing method achieves high precision and high recall simultaneously. To address this, the authors developed CNV-espresso, a transfer learning method that encodes candidate CNVs as images and uses pretrained convolutional neural networks to classify copy number states. CNV-espresso can effectively replace manual inspection of CNVs in large-scale exome sequencing studies.
Exome sequencing is widely used in genetic studies of human diseases and in clinical genetic diagnosis. Accurate detection of copy number variants (CNVs) is important for making full use of exome sequencing data. However, exome data are noisy, and none of the existing methods alone can achieve both high precision and high recall. A common practice is to apply heuristic filters and then manually inspect the read depth of putative CNVs. This approach does not scale to large studies. To address this issue, we developed a transfer learning method, CNV-espresso, for in silico confirmation of rare CNVs from exome sequencing data. CNV-espresso encodes candidate CNVs from exome data as images and uses pretrained convolutional neural network models to classify copy number states. We trained CNV-espresso on an offspring-parents trio exome sequencing dataset, with inherited CNVs as positives and CNVs with Mendelian errors as negatives. We evaluated performance using additional samples that have both exome and whole-genome sequencing (WGS) data. Taking the CNVs detected from WGS data as a proxy for ground truth, CNV-espresso significantly improves precision while keeping recall almost intact, especially for CNVs that span a small number of exons. CNV-espresso can effectively replace manual inspection of CNVs in large-scale exome sequencing studies.
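
The evaluation described above, in which exome CNV calls are scored against WGS calls taken as a proxy for ground truth, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The `CNV` record, the function names, and the matching rule (same sample, same CNV type, at least 50% reciprocal overlap) are assumptions made for illustration; the abstract does not state how exome and WGS calls were matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; coordinates are half-open [start, end)."""
    sample: str
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" or "DUP"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Shared length divided by the longer call's length (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def matches(a: CNV, b: CNV, min_overlap: float = 0.5) -> bool:
    """Assumed matching rule: same sample, same type, enough overlap."""
    return (
        a.sample == b.sample
        and a.kind == b.kind
        and reciprocal_overlap(a, b) >= min_overlap
    )


def precision_recall(exome_calls, wgs_calls, min_overlap: float = 0.5):
    """Precision and recall of exome calls, with WGS calls as proxy truth."""
    confirmed = sum(
        any(matches(e, w, min_overlap) for w in wgs_calls) for e in exome_calls
    )
    recovered = sum(
        any(matches(w, e, min_overlap) for e in exome_calls) for w in wgs_calls
    )
    precision = confirmed / len(exome_calls) if exome_calls else 0.0
    recall = recovered / len(wgs_calls) if wgs_calls else 0.0
    return precision, recall
```

Under these assumptions, a confirmation step that discards exome calls with no WGS support raises precision, while recall stays unchanged as long as the calls that do match WGS are kept.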
