Article

Essential Regression: A generalizable framework for inferring causal latent factors from multi-omic datasets

Journal

PATTERNS
Volume 3, Issue 5

Publisher

CELL PRESS
DOI: 10.1016/j.patter.2022.100473

Funding

  1. NIH [DP2AI164325, U01HL137159, R01HL140963, R01HL159805, R01HL157879, U01AI141990, F31LM013966]
  2. NSF [DMS-1712709, DMS-2015195]
  3. DoD [W81XWH2110864]
  4. UPMC ITTC fund
  5. Yerkes Pilot Research Pilot Program (Yerkes NPRC Base Grant) [P51OD011132]

High-dimensional cellular and molecular profiling of biological samples highlights the need for analytical approaches that can integrate multi-omic datasets to generate prioritized causal inferences. Current methods are limited by the high dimensionality of the combined datasets, the differences in their data distributions, and the difficulty of integrating them to infer causal relationships. Here, we present Essential Regression (ER), a novel latent-factor-regression-based interpretable machine-learning approach that addresses these problems by identifying latent factors and their likely cause-effect relationships with system-wide outcomes/properties of interest. ER can integrate many multi-omic datasets without structural or distributional assumptions regarding the data, and it outperforms a range of state-of-the-art methods in terms of prediction. ER can also be coupled with probabilistic graphical modeling, thereby strengthening the causal inferences. The utility of ER is demonstrated using multi-omic systems immunology datasets to generate and validate novel cellular and molecular inferences in a wide range of contexts, including immunosenescence and immune dysregulation.
