Efficient Generation of Training Libraries for Image Classification Models from Photos of Herbarium Specimens

Publisher

UNIV CHICAGO PRESS
DOI: 10.1086/724950

Keywords

computer vision; image classification; Waitzia; Asteraceae; herbarium; identification

By annotating specimen images and cropping annotations using open-source software and a custom script, a training library for image classification can be generated quickly. The approach demonstrated in this research allows taxonomists to use digitized herbarium specimens to produce training libraries within hours. It is expected that computer vision will increasingly become a part of taxonomic practice.
Premise of research. Computer vision has the potential to become a transformative identification tool in biodiversity research and collections management, allowing high-throughput identification and removing the need for nonexpert end users to understand technical terminology. A major bottleneck for taxonomists is the generation of sufficient numbers of training images. Contemporary large-scale imaging projects of herbaria provide an increasing number of specimen photos, but whole-sheet images are not directly suitable for training image classification models targeted at individual taxonomically informative characters.

Methodology. Here, we illustrate a time- and labor-efficient approach for generating training libraries for image classification from photos of herbarium sheets. It involves the annotation of specimen images with bounding boxes using open-source software and automated cropping of annotations with a custom script to produce the training library. We demonstrate the approach on the flower heads of a genus of Asteraceae comprising eight taxa, six species and two nontypus varieties.

Pivotal results. After generating 816 training images from 33 specimen photos with a time investment of only approximately 90 min, we trained an image classification model that achieved 98.2% precision and recall.

Conclusions. The demonstrated approach allows taxonomists to use digitized herbarium specimens to produce training libraries for image classification models within hours. We expect that computer vision will increasingly become a part of taxonomic practice.
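
The annotate-then-crop step described under Methodology could be sketched roughly as follows. This is a minimal illustration, not the authors' script: the `Annotation` record, the pixel-based box format, the `plan_crops` helper, and the one-folder-per-taxon output layout are all assumptions made for the example. The sketch only computes where each crop would come from and where it would be saved; the pixel cropping itself is left to an imaging library.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Annotation:
    """One bounding box drawn on a specimen photo (hypothetical format)."""
    image: str   # file name of the whole-sheet photo
    label: str   # taxon label assigned to the box
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int   # box width, in pixels
    height: int  # box height, in pixels


def crop_box(ann, image_width, image_height):
    """Return (left, upper, right, lower), clamped to the image bounds."""
    left = max(0, ann.x)
    upper = max(0, ann.y)
    right = min(image_width, ann.x + ann.width)
    lower = min(image_height, ann.y + ann.height)
    if right <= left or lower <= upper:
        raise ValueError(f"annotation lies outside the image: {ann}")
    return (left, upper, right, lower)


def plan_crops(annotations, image_sizes, out_dir="training_library"):
    """Pair each annotation with a crop box and an output path.

    Crops are grouped into one folder per label, a layout assumed here
    because many image classification tools accept it.
    """
    counters = {}
    plan = []
    for ann in annotations:
        width, height = image_sizes[ann.image]
        box = crop_box(ann, width, height)
        index = counters.get(ann.label, 0) + 1
        counters[ann.label] = index
        folder = ann.label.replace(" ", "_")
        target = PurePosixPath(out_dir) / folder / f"{folder}_{index:04d}.jpg"
        plan.append((ann.image, box, str(target)))
    return plan
```

Each `(source image, box, target path)` triple in the returned plan can then be handed to an imaging library to write the cropped file. Clamping the box to the image bounds keeps annotations drawn slightly past the sheet edge from producing invalid crops.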