4.7 Article

Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data

Journal

BIOINFORMATICS
Volume 25, Issue 24, Pages 3207-3212

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btp579

Funding

  1. Howard Hughes Medical Institute Funding Source: Medline
  2. NIGMS NIH HHS [T 532 GM007197-34, GM077959, R01 GM077959, T32 GM007197] Funding Source: Medline
  3. NIMH NIH HHS [R01 MH084703-01, R01 MH084703] Funding Source: Medline

Abstract

Motivation: Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here, we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE).

Results: We generated 16 million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When we mapped these reads to the human genome, we found that, at heterozygous SNPs, there was a significant bias toward higher mapping rates of the allele in the reference sequence, compared with the alternative allele. Masking known SNP positions in the genome sequence eliminated the reference bias but, surprisingly, did not lead to more reliable results overall. We find that even after masking, approximately 5-10% of SNPs still have an inherent bias toward more effective mapping of one allele. Filtering out inherently biased SNPs removes 40% of the top signals of ASE. The remaining SNPs showing ASE are enriched in genes previously known to harbor cis-regulatory variation or known to show uniparental imprinting. Our results have implications for a variety of applications involving detection of alternate alleles from short-read sequence data.
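To make the consequence of mapping bias concrete, the following is a minimal sketch of a per-SNP test for allelic imbalance. It is not the authors' pipeline: the exact two-sided binomial test, the function names, the read counts, and the `expected_ref_fraction` parameter are all assumptions introduced for illustration. The idea is that a SNP tested against a naive 50:50 null can look like ASE when the imbalance is partly or wholly explained by one allele mapping more effectively than the other.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null success rate p.
    """
    observed = binom_pmf(k, n, p)
    tol = 1e-12  # guards against floating-point ties
    total = 0.0
    for i in range(n + 1):
        prob = binom_pmf(i, n, p)
        if prob <= observed * (1 + tol):
            total += prob
    return min(1.0, total)


def ase_test(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Test one heterozygous SNP for allelic imbalance.

    expected_ref_fraction is the fraction of reads expected to carry the
    reference allele when there is no ASE. 0.5 assumes unbiased mapping;
    a different value (hypothetical here) would encode a known mapping bias.
    """
    n = ref_reads + alt_reads
    return binom_two_sided_p(ref_reads, n, expected_ref_fraction)


# Hypothetical counts at one heterozygous SNP: 30 reference reads, 15 alternate.
p_naive = ase_test(30, 15)  # null assumes both alleles map equally well
p_adjusted = ase_test(30, 15, expected_ref_fraction=0.6)  # assumed bias toward reference

print(f"naive null:    p = {p_naive:.4f}")
print(f"adjusted null: p = {p_adjusted:.4f}")
```

With these illustrative counts, the p-value under the adjusted null is larger than under the naive null, which is the pattern one would expect if part of an apparent ASE signal were a mapping artifact. In practice the expected fraction would have to be estimated per SNP, and the abstract's finding that masking alone does not remove all bias suggests that a single genome-wide correction would not suffice.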
