Article

Semi- and fully supervised quantification techniques to improve population estimates from machine classifiers

Journal

LIMNOLOGY AND OCEANOGRAPHY-METHODS
Volume 18, Issue 12, Pages 739-753

Publisher

WILEY
DOI: 10.1002/lom3.10399

Funding

  1. National Science Foundation BIGDATA Initiative [NSF IIS 15-46351]
  2. Simons Foundation

Abstract

Modern in situ digital imaging systems collect vast numbers of images of marine organisms and suspended particles. Automated methods to classify objects in these images, largely supervised machine learning techniques, are now used to deal with this onslaught of biological data. Though such techniques can minimize the human cost of analyzing the data, they also have important limitations. In training automated classifiers, we implicitly program them with an inflexible understanding of the environment they are observing. When the relationship between the classifier and the population changes, the classifier's performance degrades, potentially decreasing the accuracy of the estimate of community composition. This limitation of automated classifiers is known as dataset shift. Here, we describe techniques for addressing dataset shift. We then apply them to the output of a binary deep neural network searching for diatom chains in data generated by the Scripps Plankton Camera System (SPCS) on the Scripps Pier. In particular, we describe a supervised quantification approach that adjusts a classifier's output, using a small number of human-corrected images to estimate the system error in a time frame of interest. This method yielded an 80% improvement in mean absolute error over the raw classifier output on a set of 41 independent samples from the SPCS. The technique can be extended to adjust the output of multi-category classifiers and other in situ observing systems.
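The idea of correcting a classifier's counts with a small human-corrected sample can be illustrated with a minimal Python sketch. It uses the generic "adjusted count" correction from the quantification literature, which is an assumption here: the abstract does not specify the authors' exact estimator. The function names, the toy labels, and the observed rate of 0.24 are all hypothetical. The human-corrected sample is assumed to come from the same time frame as the counts being adjusted, so that the estimated error rates reflect the current population.

```python
def estimate_error_rates(predicted, truth):
    """Estimate true- and false-positive rates from a small
    human-corrected sample (1 = target class, 0 = everything else)."""
    pairs = list(zip(predicted, truth))
    positives = sum(1 for _, t in pairs if t)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("corrected sample must contain both classes")
    true_pos = sum(1 for p, t in pairs if p and t)
    false_pos = sum(1 for p, t in pairs if p and not t)
    return true_pos / positives, false_pos / negatives


def adjusted_prevalence(observed_rate, tpr, fpr):
    """Adjusted-count correction (an assumed, generic estimator).

    The observed positive rate mixes true and false detections:
        observed = tpr * p + fpr * (1 - p)
    Solving for the true prevalence p gives the expression below.
    """
    if tpr <= fpr:
        raise ValueError("classifier must beat chance (tpr > fpr)")
    corrected = (observed_rate - fpr) / (tpr - fpr)
    # Sampling noise can push the estimate outside [0, 1]; clip it.
    return min(1.0, max(0.0, corrected))


if __name__ == "__main__":
    # Hypothetical human-corrected sample of 15 images.
    predicted = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    tpr, fpr = estimate_error_rates(predicted, truth)

    # Hypothetical raw output: 24% of images flagged as the target class.
    print(adjusted_prevalence(0.24, tpr, fpr))
```

With so few corrected images the error-rate estimates are themselves noisy, so in practice the size of the human-corrected sample trades off against the reliability of the adjustment.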
