Article

BASIN-3D: A brokering framework to integrate diverse environmental data

Journal

COMPUTERS & GEOSCIENCES
Volume 159

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.cageo.2021.105024

Keywords

Data integration; Multiscale diverse data; Synthesis; Environmental data

Funding

  1. Watershed Function Scientific Focus Area
  2. iNAIADS DOE Early Career Project
  3. Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) - U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research [DE-AC02-05CH11231]
  4. National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility [DE-AC02-05CH11231]

BASIN-3D is a data integration framework designed to dynamically retrieve and transform heterogeneous data from different sources into a common format. It lets users adopt a standardized approach to data retrieval without customizing for the data type or source, and it supports both web-based tools and data analytics.
Diverse observational and simulation datasets are needed to understand and predict complex ecosystem behavior over seasonal to decadal and century time scales. Integrating these datasets is a major barrier to advancing environmental science, particularly because the structure and formats of data differ among sources. Here, we describe BASIN-3D (Broker for Assimilation, Synthesis and Integration of eNvironmental Diverse, Distributed Datasets), a data integration framework designed to dynamically retrieve and transform heterogeneous data from different sources into a common format to provide an integrated view. BASIN-3D enables users to adopt a standardized approach for data retrieval and avoid customizations for the data type or source. We demonstrate the value of BASIN-3D with two use cases that require integration of data from regional to watershed spatial scales. The first application uses the BASIN-3D Python library to integrate time-series hydrological and meteorological data, providing standardized inputs to analytical and machine learning codes that predict the impacts of hydrological disturbances on large river corridors of the United States. The second application uses the BASIN-3D Django framework to integrate diverse time-series data in a mountainous watershed in East River, Colorado, United States, enabling scientific researchers to explore and download data through an interactive web portal. Thus, BASIN-3D can support data integration for both web-based tools and data analytics using Python scripting and extensions like Jupyter notebooks. The framework is expected to be transferable to and useful for many other field and modeling studies.
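
The brokering idea in the abstract (retrieve from each source on demand, then map every record onto one common format) can be illustrated with a minimal sketch. This is not the BASIN-3D API: the `Observation` schema, the `Broker` class, the two adapter functions, the source names, and the sample records are all hypothetical and exist only to show the pattern.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Observation:
    """Common record format shared by every source (hypothetical schema)."""
    source: str
    site_id: str
    variable: str
    timestamp: datetime
    value: float
    unit: str


def adapt_agency_a(raw: Iterable[dict]) -> List[Observation]:
    """Hypothetical source A: epoch seconds, water temperature in Fahrenheit."""
    return [
        Observation(
            source="agency_a",
            site_id=str(r["station"]),
            variable="water_temperature",
            timestamp=datetime.fromtimestamp(r["epoch"], tz=timezone.utc),
            value=round((r["temp_f"] - 32.0) * 5.0 / 9.0, 2),  # convert to Celsius
            unit="degC",
        )
        for r in raw
    ]


def adapt_agency_b(raw: Iterable[tuple]) -> List[Observation]:
    """Hypothetical source B: (site, ISO 8601 time, Celsius value as text)."""
    return [
        Observation(
            source="agency_b",
            site_id=site,
            variable="water_temperature",
            timestamp=datetime.fromisoformat(ts),
            value=float(val),
            unit="degC",
        )
        for site, ts, val in raw
    ]


class Broker:
    """Holds one (fetch, adapt) pair per source and merges their output."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Tuple[Callable, Callable]] = {}

    def register(self, name: str, fetch: Callable, adapt: Callable) -> None:
        self._plugins[name] = (fetch, adapt)

    def query(self, variable: str) -> List[Observation]:
        # Data is fetched at query time rather than stored by the broker.
        results: List[Observation] = []
        for fetch, adapt in self._plugins.values():
            results.extend(o for o in adapt(fetch()) if o.variable == variable)
        return sorted(results, key=lambda o: o.timestamp)


# Made-up sample records standing in for remote services.
broker = Broker()
broker.register(
    "agency_a",
    lambda: [
        {"station": 101, "epoch": 1609459200, "temp_f": 50.0},
        {"station": 101, "epoch": 1609462800, "temp_f": 41.0},
    ],
    adapt_agency_a,
)
broker.register(
    "agency_b",
    lambda: [("B-7", "2021-01-01T00:30:00+00:00", "9.5")],
    adapt_agency_b,
)

merged = broker.query("water_temperature")
for obs in merged:
    print(obs.timestamp.isoformat(), obs.source, obs.value, obs.unit)
```

Calling code sees only `Observation` records, so supporting another source under this sketch means writing one more adapter rather than changing the analysis or portal code.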
