Article

Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing

Journal

BIOINFORMATICS
Volume 30, Issue 6, Pages 815-822

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btt647

Funding

  1. Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) [221S0002, 22129008]
  2. MEXT
  3. Global COE program
  4. Grants-in-Aid for Scientific Research [25461270, 22129001, 221S0002, 22129008, 22129002] Funding Source: KAKEN

Motivation: Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2-6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much longer than 100 bp, the typical length of short reads.

Results: We propose ab initio procedures for sensing and locating long STRs promptly by using the frequency distribution of all STRs and paired-end read information. We validated the reproducibility of this method using biological replicates and used it to locate an STR associated with a brain disease (SCA31). Subsequently, we sequenced this STR site in 11 SCA31 samples using SMRT™ sequencing (Pacific Biosciences), determined 2.3-3.1 kb sequences at nucleotide resolution and revealed that (TGGAA)- and (TAAAATAGAA)-repeat expansions determined the instability of the repeat expansions associated with SCA31. Our method could also identify common STRs, (AAAG)- and (AAAAG)-repeat expansions, which are remarkably expanded at four positions in an SCA31 sample. This is the first proposed method for rapidly finding disease-associated long STRs in personal genomes using hybrid sequencing of short and long reads.
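The idea of building a frequency distribution of STRs from short reads can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the regular expression, and the 0.9 coverage threshold are all assumptions made for illustration, and the paired-end step used for locating repeats is not covered. The sketch counts reads that consist almost entirely of one tandem repeat with a 2-6 nt unit, grouping rotations of the same unit (for example TGGAA and GGAAT) under one canonical motif.

```python
import re
from collections import Counter

# Assumption: a repeat is a 2-6 nt unit followed by at least two more copies.
# The lazy quantifier makes the regex try the shortest unit first.
_TANDEM = re.compile(r"([ACGT]{2,6}?)\1{2,}")


def canonical_motif(unit):
    """Return the lexicographically smallest rotation of a repeat unit."""
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def dominant_str(read, min_fraction=0.9):
    """Return the canonical motif if one tandem repeat covers most of the read.

    min_fraction is a hypothetical threshold, not a value from the paper.
    """
    best = None
    for match in _TANDEM.finditer(read):
        if best is None or len(match.group(0)) > len(best.group(0)):
            best = match
    if best is None or len(best.group(0)) < min_fraction * len(read):
        return None
    unit = best.group(1)
    if len(set(unit)) == 1:
        # Homopolymer runs (unit of 1 nt) are outside the 2-6 nt definition.
        return None
    return canonical_motif(unit)


def str_frequency(reads):
    """Count reads dominated by each canonical STR motif."""
    counts = Counter()
    for read in reads:
        motif = dominant_str(read)
        if motif is not None:
            counts[motif] += 1
    return counts


if __name__ == "__main__":
    reads = [
        "TGGAA" * 20,             # 100 nt read filled with (TGGAA)n
        "GGAAT" * 20,             # same repeat, different phase
        "AAAG" * 25,              # 100 nt read filled with (AAAG)n
        "ACGTTGCATCGGATCCAGTA",   # non-repetitive read
    ]
    for motif, n in str_frequency(reads).most_common():
        print(motif, n)
```

Under these assumptions, a motif whose count is unusually high in one sample relative to others would be a candidate for a long expansion. A real pipeline would also need to handle sequencing errors, reverse-complement strands, and the mate-pair information the abstract mentions.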
