Journal
BIOINFORMATICS
Volume 35, Issue 15, Pages 2610-2617
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty992
Funding
- Genome Canada/CIHR/Genome BC [174DE]
- Natural Sciences and Engineering Research Council of Canada [RGPIN355532-10]
- National Institutes of Health (USA) [1R01GM084875]
- China Scholarship Council [201206110038]
- BC Children's Hospital Foundation
- Genome Canada Bioinformatics and Computational Biology [255ONT]
- Canadian Institutes of Health Research (CIHR): Bioinformatics and Computational Biology [BOP-149430]
Motivation
Deciphering the functional roles of cis-regulatory variants is a critical challenge in genome analysis and interpretation. It has been hypothesized that altered transcription factor (TF) binding events are a central mechanism by which cis-regulatory variants impact gene expression levels. However, we lack a computational framework to understand and quantify such mechanistic contributions.
Results
We present TF2Exp, a gene-based framework to predict the impact of altered TF-binding events on gene expression levels. Using data from lymphoblastoid cell lines, TF2Exp models were applied successfully to predict the expression levels of 3196 genes. Alterations within DNase I hypersensitive, CTCF-bound and tissue-specific TF-bound regions were the greatest contributing features to the models. TF2Exp models performed as well as models based on common variants, both in cross-validation and external validation. Combining TF alteration and common variant features can further improve model performance. Unlike variant-based models, TF2Exp models have the unique advantage to evaluate the functional impact of variants in linkage disequilibrium and uncommon variants. We find that adding TF-binding events altered only by uncommon variants could increase the number of predictable genes (R² > 0.05). Taken together, TF2Exp represents a key step towards interpreting the functional roles of cis-regulatory variants in the human genome.
Availability and implementation
The code and model training results are publicly available at https://github.com/wqshi/TF2Exp.
Supplementary information
Supplementary data are available at Bioinformatics online.
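The gene-level evaluation mentioned in the Results (a gene counts as predictable when its cross-validated R² exceeds 0.05) can be illustrated with a minimal sketch. This is not the TF2Exp implementation: the abstract does not name the regression model, so a closed-form ridge regression is used here as a stand-in, R² is taken to be the coefficient of determination of pooled out-of-fold predictions (an assumption), and the function names, the synthetic feature matrix and the fold count are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression on centered data; returns (weights, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def cross_validated_r2(X, y, alpha=1.0, n_folds=5, seed=0):
    """R^2 of pooled out-of-fold predictions for one gene."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_pred = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False  # hold out this fold
        w, b = ridge_fit(X[train_mask], y[train_mask], alpha)
        y_pred[test_idx] = X[test_idx] @ w + b
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def is_predictable(r2, threshold=0.05):
    """Threshold from the abstract: a gene is predictable if R^2 > 0.05."""
    return r2 > threshold

# Synthetic example (hypothetical data): rows are individuals, columns are
# per-gene features such as altered TF-binding scores.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[:3] = [1.0, -0.5, 0.8]
y_signal = X @ w_true + rng.normal(scale=0.5, size=200)  # expression driven by 3 features
y_null = rng.normal(size=200)                            # expression unrelated to features

r2_signal = cross_validated_r2(X, y_signal)
r2_null = cross_validated_r2(X, y_null)
print(is_predictable(r2_signal), round(r2_signal, 3), round(r2_null, 3))
```

In this sketch the model is fit separately per gene, which mirrors the "gene-based" framing of the abstract; any real analysis would substitute the actual feature construction and model described in the paper and its repository.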