Article

Foundry: a message-oriented, horizontally scalable ETL system for scientific data integration and enhancement

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/database/bay130

Funding

  1. bioCADDIE project via a National Institutes of Health Big Data to Knowledge award [U24AI117966]
  2. Community Inventory of EarthCube Resources for Geosciences Interoperability (CINERGI) project via National Science Foundation [ICER/GEO 1343816, 1639764]
  3. National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Information Network (dkNET) via NIH's NIDDK [U24DK097771]
  4. Neuroscience Information Framework via NIH's National Institute on Drug Abuse award [U24DA039832]
  5. ReproNim via NIH's National Institute of Biomedical Imaging and Bioengineering award [P41EB019936]
  6. University of California, San Diego, Center for Research in Biological Systems
  7. ICER
  8. National Science Foundation, Directorate for Geosciences [1639764]

Data generated by scientific research enables further advances in science through reanalysis and through pooling of data for novel analyses. Biomedical research now gives researchers access to more data than ever before, yet finding the data that matches a researcher's requirements remains a major challenge, and one that will only grow as more data is produced and shared. In this paper, we introduce Foundry, a horizontally scalable, distributed extract-transform-load (ETL) system for aggregating, transforming and enhancing scientific data to support data discovery and retrieval. We also introduce a data transformation language that lets biomedical curators transform and combine data and metadata from heterogeneous sources. We illustrate the system's applicability to scientific data in the biomedical and earth science domains.
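
To make the message-oriented idea concrete, the following is a minimal sketch of an ETL pipeline in which each stage reads messages from an inbox queue and writes results to an outbox queue, so a stage can be scaled by adding workers. It is not Foundry's implementation or API: the function names (extract, transform, run_pipeline), the record fields and the in-process queues are all assumptions made for illustration, standing in for a real message broker and real data sources.

```python
import queue
import threading

STOP = object()  # sentinel message telling a worker to shut down


def extract(record):
    # Hypothetical extract step: keep only the fields of interest.
    return {"id": record["id"], "title": record["raw_title"]}


def transform(doc):
    # Hypothetical transform step: normalize the title text.
    return {**doc, "title": doc["title"].strip().lower()}


def worker(step, inbox, outbox):
    # Consume messages until the sentinel arrives, publishing each result.
    while True:
        msg = inbox.get()
        if msg is STOP:
            break
        outbox.put(step(msg))


def run_pipeline(records, workers_per_stage=2):
    q_raw, q_extracted, q_done = queue.Queue(), queue.Queue(), queue.Queue()
    for record in records:
        q_raw.put(record)

    # For simplicity the stages run one after another; each stage is
    # served by several workers that share the same inbox queue.
    stages = [(extract, q_raw, q_extracted), (transform, q_extracted, q_done)]
    for step, inbox, outbox in stages:
        threads = [
            threading.Thread(target=worker, args=(step, inbox, outbox))
            for _ in range(workers_per_stage)
        ]
        for t in threads:
            t.start()
        for _ in threads:
            inbox.put(STOP)  # one sentinel per worker, queued after the data
        for t in threads:
            t.join()

    # "Load" step: index the finished documents by id.
    loaded = {}
    while not q_done.empty():
        doc = q_done.get()
        loaded[doc["id"]] = doc
    return loaded


if __name__ == "__main__":
    sample = [
        {"id": 1, "raw_title": "  Mouse Brain Atlas "},
        {"id": 2, "raw_title": "Seismic DATA"},
    ]
    print(run_pipeline(sample))
```

Because workers only communicate through queues, raising workers_per_stage adds capacity without changing the stage logic, which is the property that horizontal scaling relies on. In a distributed deployment the queues would be provided by a message broker rather than by in-process objects.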
