3.8 Article

Embedding data provenance into the Learning Health System to facilitate reproducible research

Journal

LEARNING HEALTH SYSTEMS
Volume 1, Issue 2, Pages -

Publisher

WILEY
DOI: 10.1002/lrh2.10019

Keywords

data provenance; health informatics; reproducibility

Funding

  1. United Kingdom's Engineering and Physical Sciences Research Council [EP/N027426/1]
  2. Engineering and Physical Sciences Research Council [EP/N027426/1] Funding Source: researchfish
  3. EPSRC [EP/N027426/1] Funding Source: UKRI

Ask authors/readers for more resources

Introduction The learning health system (LHS) community has taken up the challenge of bringing the complex relationship between clinical research and practice into this brave new world. At the heart of the LHS vision is the notion of routine capture, transformation, and dissemination of data and knowledge, with various use cases, such as clinical studies, quality improvement initiatives, and decision support, constructed on top of specific routes that the data is taking through the system. In order to stop this increased data volume and analytical complexity from obfuscating the research process, it is essential to establish trust in the system through implementing reproducibility and auditability throughout the workflow. Methods Data provenance technologies can automatically capture the trace of the research task and resulting data, thereby facilitating reproducible research. While some computational domains, such as bioinformatics, have embraced the technology through provenance-enabled execution middlewares, disciplines based on distributed, heterogeneous software, such as medical research, are only starting on the road to adoption, motivated by the institutional pressures to improve transparency and reproducibility. Results Guided by the experiences of the TRANSFoRm project, we present the opportunities that data provenance offers to the LHS community. We illustrate how provenance can facilitate documenting 21 CFR Part 11 compliance for Food and Drug Administration submissions and provide auditability for decisions made by the decision support tools and discuss the transformational effect of routine provenance capture on data privacy, study reporting, and publishing medical research. Conclusions If the scaling up of the LHS is to succeed, we have to embed mechanisms to verify trust in the system inside our research instruments. In the research world increasingly reliant on electronic tools, provenance gives us a lingua franca to achieve traceability, which we have shown to be essential to building these mechanisms. To realize the vision of making computable provenance a feasible approach to implementing reproducibility in the LHS, we have to provide viable mechanisms for adoption. These include defining meaningful provenance models for problem domains and also introducing provenance support to existing tools in a minimally invasive manner.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

3.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available