4.5 Review

Using Bayesian networks to discover relations between genes, environment, and disease

Journal

BIODATA MINING
Volume 6

Publisher

BMC
DOI: 10.1186/1756-0381-6-6

Keywords

Structural learning; Belief networks; Genetic epidemiology; Bioinformatics; Complex traits; Arsenic; SNP

Funding

  1. National Center for Research Resources [5P20RR024474-02]
  2. National Institute of General Medical Sciences from the National Institutes of Health [8 P20 GM103534-02]
  3. NATIONAL CANCER INSTITUTE [R01CA057494, K07CA102327, R03CA121382, R03CA099500] Funding Source: NIH RePORTER
  4. NATIONAL CENTER FOR RESEARCH RESOURCES [P20RR024475] Funding Source: NIH RePORTER
  5. NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES [P20GM103534] Funding Source: NIH RePORTER

We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire, USA. For didactic purposes, we intentionally keep this example simple. When the algorithms are applied to complete data records, we find only minor differences in their performance and results. Subsequent incorporation of partial records through application of the EM algorithm gives us greater power to detect relations. Allowing for network structures that depart from a strict causal interpretation also enhances our ability to discover complex associations, including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and the generation of prognostic models, some conceptual and computational challenges remain. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.
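
As a rough illustration of how a BN translates a graph into probabilistic dependencies, the sketch below encodes a three-node network in which a genetic variant G and an exposure E are both parents of a disease node D, so the joint distribution factorizes as P(G) P(E) P(D | G, E). The variable names, the graph, and every probability are hypothetical values chosen for illustration; they are not taken from the paper or from the New Hampshire study.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative numbers only,
# not estimates from any study).
P_G = {0: 0.7, 1: 0.3}          # P(G): carrier status of a variant
P_E = {0: 0.8, 1: 0.2}          # P(E): environmental exposure
P_D1_GIVEN_GE = {               # P(D=1 | G, E): disease risk per parent state
    (0, 0): 0.01,
    (0, 1): 0.03,
    (1, 0): 0.02,
    (1, 1): 0.15,               # larger than additive: a gene-environment interaction
}


def joint(g, e, d):
    """P(G=g, E=e, D=d) via the BN factorization P(G) P(E) P(D | G, E)."""
    p_d1 = P_D1_GIVEN_GE[(g, e)]
    return P_G[g] * P_E[e] * (p_d1 if d == 1 else 1.0 - p_d1)


def posterior_g_given_d(d=1):
    """P(G=1 | D=d), summing the joint over the unobserved exposure E."""
    numerator = sum(joint(1, e, d) for e in (0, 1))
    denominator = sum(joint(g, e, d) for g, e in product((0, 1), repeat=2))
    return numerator / denominator


# The factorized joint should sum to 1 over all eight configurations.
total = sum(joint(g, e, d) for g, e, d in product((0, 1), repeat=3))

if __name__ == "__main__":
    print(f"sum of joint = {total:.6f}")
    print(f"P(G=1 | D=1) = {posterior_g_given_d(1):.4f}")
```

Inference by enumeration is used here only because the network is tiny; structure learning and the EM-based handling of partial records mentioned in the abstract are outside the scope of this sketch.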

Reviews

Primary Rating

4.5
Not enough ratings
