Article

Systematic identification of intron retention associated variants from massive publicly available transcriptome sequencing data

Journal

NATURE COMMUNICATIONS
Volume 13, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41467-022-32887-9

Funding

  1. KAKENHI [18H03327, 21H03549, 21H04828]
  2. Japan Agency for Medical Research and Development [20km0405207h9905, 20kk0205014h0005, 20ek0109485h0001, 21ck0106641h0001, 21jm0210085h0002]
  3. National Cancer Center Research and Development Funds [2020-A-7, 2021-A-3, 2020-A-2]
  4. Japan Society for the Promotion of Science (JSPS) Home-Returning Researcher Development Research [19K24691]

This study developed a method to detect genomic variants causing splicing changes using transcriptome data alone. By applying this method to a large dataset, the researchers identified a significant number of genomic variants associated with intron retention. Additionally, by exploring the positional relationships with known disease variants, they extracted potential disease-associated variants.
Many disease-associated genomic variants disrupt gene function through abnormal splicing. With the advancement of genomic medicine, identifying disease-associated, splicing-associated variants has become more important than ever. Most bioinformatics approaches to detecting splicing-associated variants require both genomic and transcriptomic data. However, few datasets provide both. In this study, we develop a methodology to detect genomic variants that cause splicing changes (more specifically, intron retention) using transcriptome sequencing data alone. After evaluating its sensitivity and precision, we apply it to 230,988 transcriptome sequencing datasets from a publicly available repository and identify 27,049 intron retention associated variants (IRAVs). In addition, by exploring positional relationships with variants registered in existing disease databases, we extract 3,000 putative disease-associated IRAVs, which range from cancer drivers to variants linked with autosomal recessive disorders. This in silico screening framework demonstrates the possibility of near-automatically acquiring medical knowledge by making the most of massively accumulated, publicly available sequencing data. Collections of IRAVs identified in this study are available through IRAVDB (https://iravdb.io/).
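
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how sensitivity and precision are conventionally computed for a set of variant calls against a truth set. It is not the authors' evaluation code; the function name, the variant key layout (chromosome, position, reference allele, alternate allele), and the example variants are all hypothetical.

```python
def sensitivity_and_precision(called, truth):
    """Return (sensitivity, precision) of called variants against a truth set.

    sensitivity = true positives / all true variants
    precision   = true positives / all called variants
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    precision = true_positives / len(called) if called else float("nan")
    return sensitivity, precision


# Hypothetical variant keys: (chromosome, position, ref allele, alt allele).
truth = {
    ("chr1", 100, "G", "A"),
    ("chr2", 200, "C", "T"),
    ("chr3", 300, "T", "C"),
    ("chr4", 400, "A", "G"),
}
called = {
    ("chr1", 100, "G", "A"),
    ("chr2", 200, "C", "T"),
    ("chr5", 500, "G", "C"),  # a false positive: not in the truth set
}

sens, prec = sensitivity_and_precision(called, truth)
print(f"sensitivity={sens:.3f} precision={prec:.3f}")
```

In this toy case two of the four true variants are recovered and two of the three calls are correct, so sensitivity measures how much of the truth set is found while precision measures how trustworthy each call is.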
