Article

Poor data stewardship will hinder global genetic diversity surveillance

Publisher

NATL ACAD SCIENCES
DOI: 10.1073/pnas.2107934118

Keywords

genomic; metadata; conservation; biodiversity; management

Funding

  1. Diversity of the Indo-Pacific Network RCN [NSF-OCE-1764316, NSF-DEB-1457848]

Genomic data are being generated and archived rapidly, but a lack of spatiotemporal metadata poses challenges for genetic diversity monitoring. Only a small fraction of genomic datasets contain geographic coordinates and collection years, highlighting the need for streamlined data processes and updated policies to address the growing metadata gap.
Genomic data are being produced and archived at a prodigious rate, and current studies could become historical baselines for future global genetic diversity analyses and monitoring programs. However, when we evaluated the potential utility of genomic data from wild and domesticated eukaryote species in the world's largest genomic data repository, we found that most archived genomic datasets (86%) lacked the spatiotemporal metadata necessary for genetic biodiversity surveillance. Labor-intensive scouring of a subset of published papers yielded geospatial coordinates and collection years for only 33% (39% if place names were considered) of these genomic datasets. Streamlined data input processes, updated metadata deposition policies, and enhanced scientific community awareness are urgently needed to preserve these irreplaceable records of today's genetic biodiversity and to plug the growing metadata gap.
