Journal
NATURE COMMUNICATIONS
Volume 5, Issue -, Pages -
Publisher
NATURE PORTFOLIO
DOI: 10.1038/ncomms5698
Keywords
-
Funding
- US National Institutes of Health [1R01MH090941-01, R01MH101814]
- Ministerio de Educacion y Ciencia (Spain) [BIO2011-26205, CSD2007-00050]
- European Research Council [ERC_294653]
Identification of genetic variants affecting splicing in RNA sequencing population studies is still in its infancy. Splicing phenotype is more complex than gene expression and ought to be treated as a multivariate phenotype to be recapitulated completely. Here we represent the splicing pattern of a gene as the distribution of the relative abundances of a gene's alternative transcript isoforms. We develop a statistical framework that uses a distance-based approach to compute the variability of splicing ratios across observations, and a non-parametric analogue to multivariate analysis of variance. We implement this approach in the R package sQTLseekeR and use it to analyze RNA-Seq data from the Geuvadis project in 465 individuals. We identify hundreds of single nucleotide polymorphisms (SNPs) as splicing QTLs (sQTLs), including some falling in genome-wide association study SNPs. By developing the appropriate metrics, we show that sQTLseekeR compares favorably with existing methods that rely on univariate approaches, predicting variants that behave as expected from mutations affecting splicing.
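The distance-based test described in the abstract can be illustrated with a minimal sketch. It is not the sQTLseekeR API or the authors' implementation: the function names are hypothetical, the Hellinger distance is an assumed choice of metric, and the permutation p-value is an assumed stand-in for whatever significance procedure the package actually uses. Each individual is represented by a vector of isoform relative abundances that sums to 1, and individuals are grouped by genotype at one SNP.

```python
import math
import random


def hellinger(p, q):
    """Hellinger distance between two splicing-ratio vectors (assumed metric)."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0
    )


def pseudo_f(dist, groups):
    """Distance-based pseudo-F: between-group vs. within-group variability."""
    n = len(groups)
    labels = sorted(set(groups))
    k = len(labels)
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per genotype group.
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:]
        ) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_test(ratios, genotypes, n_perm=999, seed=0):
    """Return (pseudo-F, permutation p-value) for one gene/SNP pair."""
    n = len(ratios)
    dist = [[hellinger(ratios[i], ratios[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, genotypes)
    rng = random.Random(seed)
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype/splicing association
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (made up for illustration): 8 individuals, 3 isoforms, 2 genotypes.
ratios = [
    [0.80, 0.10, 0.10], [0.70, 0.20, 0.10], [0.75, 0.15, 0.10], [0.78, 0.12, 0.10],
    [0.20, 0.70, 0.10], [0.30, 0.60, 0.10], [0.25, 0.65, 0.10], [0.22, 0.68, 0.10],
]
genotypes = [0, 0, 0, 0, 1, 1, 1, 1]

f_stat, p_value = permutation_test(ratios, genotypes)
print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

Treating the whole ratio vector as one multivariate observation means a switch between two isoforms is tested once, rather than as separate univariate tests per isoform. The sketch assumes every genotype group shows some within-group variation; a real implementation would need to guard against a zero within-group sum of squares.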