Article

Targeted-BEHRT: Deep Learning for Observational Causal Inference on Longitudinal Electronic Health Records

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2022.3183864

Keywords

Estimation; Data models; Deep learning; Cancer; Predictive models; Task analysis; Benchmark testing; Causal inference; deep learning; electronic health records (EHRs); machine learning

Funding

  1. British Heart Foundation (BHF) [FS/PhD/21/29110, PG/18/65/33872]
  2. UK Research and Innovation (UKRI) through the Global Challenges Research Fund (GCRF) [ES/P0110551/1]
  3. Oxford National Institute of Health Research (NIHR) Biomedical Research Centre
  4. Oxford Martin School (OMS), University of Oxford

Observational causal inference plays an important role in decision-making when randomized clinical trials are not feasible. This study explores the use of a transformer-based model coupled with doubly robust estimation for causal modeling in electronic health records. The model provides accurate estimates of risk ratio and shows consistency with results derived from randomized clinical trials.
Observational causal inference is useful for decision-making in medicine when randomized clinical trials (RCTs) are infeasible or nongeneralizable. However, traditional approaches do not always deliver unconfounded causal conclusions in practice. The rise of doubly robust nonparametric tools, coupled with the growth of deep learning for capturing rich representations of multimodal data, offers a unique opportunity to develop and test such models for causal inference on comprehensive electronic health records (EHRs). In this article, we investigate causal modeling of an RCT-established causal association: the effect of antihypertensive drug classes on incident cancer risk. We develop a transformer-based model, the targeted bidirectional EHR transformer (T-BEHRT), coupled with doubly robust estimation, to estimate the average risk ratio (RR). We compare our model to benchmark statistical and deep learning models for causal inference in multiple experiments on semi-synthetic derivations of our dataset with various types and intensities of confounding. To further test the reliability of our approach, we evaluate our model in limited-data settings. We find that our model provides more accurate estimates of RR (lowest sum absolute error (SAE) from ground truth) than the benchmark estimations. Finally, our model provides an estimate of the class-wise antihypertensive effect on cancer risk that is consistent with results derived from RCTs.
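
As a rough illustration of what "doubly robust estimation" of an average RR can look like, the sketch below uses the standard augmented inverse-probability-weighting (AIPW) estimator. This is an assumption chosen for illustration and is not necessarily the estimator used in the article. The function names (`aipw_risk_ratio`, `sum_absolute_error`), the argument names, and the clipping threshold are all hypothetical; in practice the propensity and outcome predictions would come from a fitted model such as T-BEHRT.

```python
import numpy as np


def aipw_risk_ratio(t, y, propensity, mu0, mu1, clip=0.01):
    """Doubly robust (AIPW) estimate of the average risk ratio E[Y(1)] / E[Y(0)].

    t          : binary treatment indicator per patient (1 = treated)
    y          : binary observed outcome per patient
    propensity : predicted P(T = 1 | X) per patient
    mu0, mu1   : predicted P(Y = 1 | X, T = 0) and P(Y = 1 | X, T = 1)
    clip       : hypothetical bound keeping propensities away from 0 and 1
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    e = np.clip(np.asarray(propensity, dtype=float), clip, 1.0 - clip)
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)

    # Outcome-model prediction plus an inverse-probability-weighted residual.
    # The estimate stays consistent if either the propensity model or the
    # outcome model is correctly specified.
    risk_treated = np.mean(mu1 + t * (y - mu1) / e)
    risk_control = np.mean(mu0 + (1.0 - t) * (y - mu0) / (1.0 - e))
    return float(risk_treated / risk_control)


def sum_absolute_error(estimates, truths):
    """Sum of absolute errors between estimated and ground-truth RRs."""
    estimates = np.asarray(estimates, dtype=float)
    truths = np.asarray(truths, dtype=float)
    return float(np.sum(np.abs(estimates - truths)))


# Toy data (made up): the outcome model is deliberately useless (all zeros),
# but the propensity of 0.5 is correct, so the weighted residual term still
# recovers the observed risks (1.0 among treated, 0.5 among controls).
rr = aipw_risk_ratio(
    t=[1, 1, 0, 0],
    y=[1, 1, 0, 1],
    propensity=[0.5, 0.5, 0.5, 0.5],
    mu0=[0.0, 0.0, 0.0, 0.0],
    mu1=[0.0, 0.0, 0.0, 0.0],
)
print(rr)
```

The SAE helper mirrors the evaluation idea in the abstract: on semi-synthetic data where the true RR is known for each experiment, the per-experiment absolute errors are summed, and a lower total indicates a more accurate estimator.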
