Article

The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research

Journal

JMIR MEDICAL INFORMATICS
Volume 11

Publisher

JMIR PUBLICATIONS, INC
DOI: 10.2196/48030

Keywords

data analysis pipeline; federated model sharing; real-world data; evidence-based decision-making; end-to-end pipeline; multiple sclerosis; data analysis; pipeline; data science; federated; neurology; brain; spine; spinal nervous system; neuroscience; data sharing; rare; low prevalence

This study presents a comprehensive data analysis pipeline driven by multiple stakeholders, which accommodates three prevalent data-sharing streams and has been successfully implemented in the global data sharing initiatives for multiple sclerosis and COVID-19. The pipeline facilitates data gathering from various sources and integrates them into a unified dataset for subsequent statistical analysis and secure data examination.
Background: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence.
Objective: This study aims to present a comprehensive, research question-agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing.
Methods: A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline's effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative.
Results: The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19.
Conclusions: The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making. It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries.
