Article

The necessary optimization of the data lifecycle: Marine geosciences in the big data era

Journal

FRONTIERS IN EARTH SCIENCE
Volume 10

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/feart.2022.1089112

Keywords

big data; data lifecycle; database; data acquisition; data curation; data-driven; data integration

In marine geosciences, research vessels are used for data acquisition to study specific phenomena or areas of interest. Despite a plateau in ship time and active research vessels, data production in marine geosciences continues to increase. Legacy data repositories contain a large amount of data, but they are rarely curated for accessibility, resulting in inefficient use and exclusion of high-quality data. This paper discusses improvements in data acquisition, curation, and integration to align marine geosciences with the big data paradigm and addresses challenges and solutions in utilizing both new and legacy data.
In the marine geosciences, observations are typically acquired using research vessels to understand a given phenomenon or area of interest. Despite the plateauing of ship time and of the number of active research vessels in the last decade, the rate of marine geoscience data production has continued to increase. Simultaneously, there exist large quantities of legacy data aggregated within data repositories; however, these data are rarely curated to be both discoverable and machine-readable (i.e., accessible). This results in inefficient use, or even omission, of high-quality data that are both increasingly important to utilize and impractical to recollect. The proliferation of newly acquired data and the increasing importance of legacy data have been met with only incremental evolution in the methods of data integration. This paper describes some improvements at each stage of the data lifecycle (acquisition, curation, and integration) that could better align the marine geosciences with the big data paradigm. We have encountered several major issues in coordinating these efforts, which we outline here: 1) geologic anomalies are the primary focus of data acquisition, which makes it difficult to understand the dominant (i.e., baseline) marine geology; 2) marine geoscience data are rarely curated to be accessible; and 3) the aforementioned issues preclude the use of efficient integration tools that can make optimal use of data. In this paper, we discuss the challenges associated with these issues and solutions for overcoming them in future decades of marine geoscience. The successful execution of these interconnected steps will optimize the lifecycle of marine geoscience data in the big data era.
