4.6 Article

Intronic CNVs and gene expression variation in human populations

Journal

PLOS GENETICS
Volume 15, Issue 1, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pgen.1007902

Keywords

-

Funding

  1. Spanish Ministry of Economy, Industry and Competitiveness (MEIC) [BFU2015-71241-R]
  2. Wellcome Trust [206103/Z/17/Z]
  3. European Regional Development Fund (ERDF)


Introns can be extraordinarily large and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implications. By combining the most extensive maps of copy number variants (CNVs) in human populations, we have found that intronic losses are the most frequent CNVs in human protein-coding genes, with 12,986 intronic deletions affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes). This intronic length variation results in dozens of genes showing extreme population variability in size, with 40 genes with 10 or more different sizes and up to 150 allelic sizes. Intronic losses are frequent in evolutionarily ancient genes that are highly conserved at the protein sequence level. This result contrasts with losses overlapping exons, which are observed less often than expected by chance and almost exclusively affect primate-specific genes. An integrated analysis of CNVs and RNA-seq data showed that intronic loss can be associated with significant differences in gene expression levels in the population (CNV-eQTLs). These intronic CNV-eQTL regions are enriched for intronic enhancers and can be associated with expression differences of other genes showing long-distance intron-promoter 3D interactions. Our data suggest that intronic structural variation of protein-coding genes makes an important contribution to the variability of gene expression and splicing in human populations.

Author summary

Most human genes have introns that have to be removed after a gene is transcribed from DNA to RNA because they do not encode information to translate RNA into proteins. As mutations in introns do not affect protein sequences, they are usually ignored when looking for normal or pathogenic genomic variation. However, introns comprise about half of the human non-coding genome and they can have important regulatory roles.
We show that deletions of intronic regions appear more frequent than previously expected in the healthy population, with a significant proportion of genes with evolutionarily ancient and essential functions carrying them. This finding was very surprising, as ancient genes tend to have high conservation of their coding sequence. However, we show that deletions of their non-coding intronic sequence can produce considerable changes in their locus length. We found that a significant number of these intronic deletions are associated with under- or over-expression of the affected genes or of distant genes interacting with them in 3D. Our data suggest that the frequent gene length variation in protein-coding genes resulting from intronic CNVs might influence their regulation in different individuals.
