Journal
LIMNOLOGY AND OCEANOGRAPHY-METHODS
Volume 18, Issue 12, Pages 739-753Publisher
WILEY
DOI: 10.1002/lom3.10399
Keywords
-
Categories
Funding
- National Science Foundation BIGDATA Initiative [NSF IIS 15-46351]
- Simons Foundation
Ask authors/readers for more resources
Modern in situ digital imaging systems collect vast numbers of images of marine organisms and suspended particles. Automated methods to classify objects in these images - largely supervised machine learning techniques - are now used to deal with this onslaught of biological data. Though such techniques can minimize the human cost of analyzing the data, they also have important limitations. In training automated classifiers, we implicitly program them with an inflexible understanding of the environment they are observing. When the relationship between the classifier and the population changes, the computer's performance degrades, potentially decreasing the accuracy of the estimate of community composition. This limitation of automated classifiers is known as dataset shift. Here, we describe techniques for addressing dataset shift. We then apply them to the output of a binary deep neural network searching for diatom chains in data generated by the Scripps Plankton Camera System (SPCS) on the Scripps Pier. In particular, we describe a supervised quantification approach to adjust a classifier's output using a small number of human corrected images to estimate the system error in a time frame of interest. This method yielded an 80% improvement in mean absolute error over the raw classifier output on a set of 41 independent samples from the SPCS. The technique can be extended to adjust the output of multi-category classifiers and other in situ observing systems.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available