Article

High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 103, Issue 484, Pages 1438-1456

Publisher

AMER STATISTICAL ASSOC
DOI: 10.1198/016214508000000869

Keywords

Biological pathways; Breast cancer genomics; Decomposing gene expression patterns; Dirichlet process factor model; Evolutionary stochastic search; Factor regression; Gene expression analysis; Gene expression profiling; Gene networks; Non-Gaussian multivariate analysis; Sparse factor models; Sparsity priors

Funding

  1. National Science Foundation [DMS-0102227, DMS-0342172] Funding Source: Medline
  2. NCI NIH HHS [U54 CA112952-01, U54 CA112952] Funding Source: Medline
  3. NHLBI NIH HHS [P01 HL073042-029002, P01 HL073042, P01 HL073042-039002] Funding Source: Medline

Abstract

We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived factors as representing biological subpathway structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology.
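
To make the idea of a sparse latent factor model concrete, the following is a minimal illustrative sketch, not the authors' software or exact model. It simulates expression data in which each gene loads on only a few latent factors, using a point-mass-at-zero mixture for the loadings as a stand-in for a sparsity prior. The function name, dimensions, inclusion probability, and noise scale are all assumptions chosen for illustration, and the Gaussian factors ignore the non-Gaussian/nonparametric components mentioned in the abstract.

```python
import numpy as np

def simulate_sparse_factor_data(n_genes=200, n_samples=50, n_factors=3,
                                inclusion_prob=0.1, seed=0):
    """Simulate expression = loadings @ factors + noise with sparse loadings.

    All names and default values are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    # Point-mass mixture: each loading is exactly zero with
    # probability 1 - inclusion_prob, otherwise drawn from N(0, 1).
    nonzero = rng.random((n_genes, n_factors)) < inclusion_prob
    values = rng.normal(0.0, 1.0, size=(n_genes, n_factors))
    loadings = np.where(nonzero, values, 0.0)
    # Gaussian latent factors (one column per sample) and idiosyncratic noise.
    factors = rng.normal(0.0, 1.0, size=(n_factors, n_samples))
    noise = rng.normal(0.0, 0.5, size=(n_genes, n_samples))
    expression = loadings @ factors + noise
    return expression, loadings, factors

expression, loadings, factors = simulate_sparse_factor_data()
# Fraction of loadings that are exactly zero.
sparsity = float(np.mean(loadings == 0.0))
print(expression.shape, round(sparsity, 2))
```

In this sketch most genes are unrelated to most factors, which is the structural assumption that lets each factor be read as a small gene set; fitting such a model to real data would require the posterior simulation and model search methods the abstract refers to.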
