Semi- and fully supervised quantification techniques to improve population estimates from machine classifiers

LIMNOLOGY AND OCEANOGRAPHY-METHODS (2020)

Journal

LIMNOLOGY AND OCEANOGRAPHY-METHODS

Volume 18, Issue 12, Pages 739-753

Publisher

WILEY

DOI: 10.1002/lom3.10399

Funding

National Science Foundation BIGDATA Initiative [NSF IIS 15-46351]
Simons Foundation

Abstract

Modern in situ digital imaging systems collect vast numbers of images of marine organisms and suspended particles. Automated methods to classify objects in these images, largely supervised machine learning techniques, are now used to deal with this onslaught of biological data. Though such techniques can minimize the human cost of analyzing the data, they also have important limitations. In training automated classifiers, we implicitly program them with an inflexible understanding of the environment they are observing. When the relationship between the classifier and the population changes, the computer's performance degrades, potentially decreasing the accuracy of the estimate of community composition. This limitation of automated classifiers is known as dataset shift. Here, we describe techniques for addressing dataset shift. We then apply them to the output of a binary deep neural network searching for diatom chains in data generated by the Scripps Plankton Camera System (SPCS) on the Scripps Pier. In particular, we describe a supervised quantification approach to adjust a classifier's output using a small number of human-corrected images to estimate the system error in a time frame of interest. This method yielded an 80% improvement in mean absolute error over the raw classifier output on a set of 41 independent samples from the SPCS. The technique can be extended to adjust the output of multi-category classifiers and other in situ observing systems.
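
The general idea of correcting a binary classifier's raw count with error rates measured on a small human-corrected subsample can be illustrated with the standard "adjusted count" correction from the quantification literature. The sketch below is illustrative only and is not taken from the paper: the function names `error_rates` and `adjusted_count`, the toy labels, and the choice of this particular correction are all assumptions, and the authors' actual procedure may differ.

```python
def error_rates(human_labels, machine_labels):
    """Estimate true- and false-positive rates from a human-corrected subsample.

    Both arguments are equal-length sequences of 0/1 labels (hypothetical
    format: 1 = diatom chain, 0 = anything else).
    """
    pairs = list(zip(human_labels, machine_labels))
    positives = sum(1 for h, _ in pairs if h)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("subsample must contain both classes")
    true_pos = sum(1 for h, m in pairs if h and m)
    false_pos = sum(1 for h, m in pairs if not h and m)
    return true_pos / positives, false_pos / negatives


def adjusted_count(raw_positive_fraction, tpr, fpr):
    """Correct the classifier's raw positive fraction for its error rates.

    Solves raw = tpr * p + fpr * (1 - p) for the true prevalence p,
    then clips the result to the valid range [0, 1].
    """
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the correction to be defined")
    estimate = (raw_positive_fraction - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))


# Toy subsample (made-up values): human-corrected labels vs. classifier output.
human = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
machine = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = error_rates(human, machine)

# Apply the correction to the raw positive fraction from the period of interest.
raw_fraction = sum(machine) / len(machine)
corrected = adjusted_count(raw_fraction, tpr, fpr)
```

In this sketch the error rates are estimated from labels drawn from the same time frame as the samples being corrected, which is what lets the correction track changes in the underlying population rather than relying on rates measured at training time.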

