4.7 Review

The National Eutrophication Survey: lake characteristics and historical nutrient concentrations

Journal

EARTH SYSTEM SCIENCE DATA
Volume 10, Issue 1, Pages 81-86

Publisher

COPERNICUS GESELLSCHAFT MBH
DOI: 10.5194/essd-10-81-2018

Funding

  1. Mozilla Foundation
  2. Leona M. and Harry B. Helmsley Charitable Trust
  3. Michigan State University Program in Ecology and Evolutionary Biology
  4. BEACON Center for the Study of Evolution in Action
  5. Kellogg Biological Station Long-Term Ecological Research site (NSF-DEB) [1027253]
  6. National Science Foundation [ICER-1517823]
  7. National Science Foundation, Directorate for Biological Sciences, Division of Environmental Biology [1027253]

Historical ecological surveys serve as a baseline and provide context for contemporary research, yet many of these records are not preserved in a way that ensures their long-term usability. The National Eutrophication Survey (NES) database is currently only available as scans of the original reports (PDF files) with no embedded character information. This limits its searchability, machine readability, and the ability of current and future scientists to systematically evaluate its contents. The NES data were collected by the US Environmental Protection Agency between 1972 and 1975 as part of an effort to investigate eutrophication in freshwater lakes and reservoirs. Although several studies have manually transcribed small portions of the database in support of specific studies, there have been no systematic attempts to transcribe and preserve the database in its entirety. Here we use a combination of automated optical character recognition and manual quality assurance procedures to make these data available for analysis. The performance of the optical character recognition protocol was found to be linked to variation in the quality (clarity) of the original documents. For each of the four archival scanned reports, our quality assurance protocol found an error rate between 5.9 and 17 %. The goal of our approach was to strike a balance between efficiency and data quality by combining entry of data by hand with digital transcription technologies. The finished database contains information on the physical characteristics, hydrology, and water quality of about 800 lakes in the contiguous US (Stachelek et al., 2017, https://doi.org/10.5063/F1639MVD). Ultimately, this database could be combined with more recent studies to generate meta-analyses of water quality trends and spatial variation across the continental US.
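
As a rough illustration of how a per-report error rate like the one quoted above might be computed, the sketch below compares optical character recognition output with hand-verified values field by field and reports the fraction that disagree. This is only an assumed definition of "error rate"; the paper's actual quality assurance protocol may count errors differently. The function name `ocr_error_rate` and the sample values are hypothetical and are not taken from the NES reports.

```python
def ocr_error_rate(ocr_values, verified_values):
    """Return the fraction of fields where OCR output differs from the verified value.

    Assumed definition for illustration only; not the paper's documented protocol.
    """
    if len(ocr_values) != len(verified_values):
        raise ValueError("inputs must have the same length")
    if not ocr_values:
        raise ValueError("inputs must be non-empty")
    # Count fields whose text differs after trimming surrounding whitespace.
    mismatches = sum(
        1
        for ocr, truth in zip(ocr_values, verified_values)
        if ocr.strip() != truth.strip()
    )
    return mismatches / len(ocr_values)


# Hypothetical field values showing typical OCR confusions (O for 0, l for 1).
ocr = ["12.4", "0.O31", "7.8", "l05"]
verified = ["12.4", "0.031", "7.8", "105"]
rate = ocr_error_rate(ocr, verified)
print(f"{rate:.1%}")
```

Comparing whole fields rather than individual characters keeps the measure simple: a value with one wrong digit is as unusable for analysis as one with several.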
