Species misidentification affects biodiversity metrics: Dealing with this issue using the new R package naturaList

Journal

ECOLOGICAL INFORMATICS
Volume 69

Publisher

ELSEVIER
DOI: 10.1016/j.ecoinf.2022.101625

Keywords

Occurrence records; Biodiversity; Species misidentification; Taxonomy; Ecological patterns

Funding

  1. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES) [001, 307527/2018-2]
  2. CNPq Productivity Fellowship [proc. 465610/2014-5]
  3. National Institutes for Science and Technology (INCT) in Ecology, Evolution
  4. MCTIC/CNPq [proc. 201810267000023]
  5. FAPEG [465610/2014-5]

Biodiversity databases have contributed to advances in ecology and evolution, but species misidentifications are common in occurrence datasets. This study evaluates the impact of misidentifications on capturing ecological patterns and presents an R package called naturaList for classifying species occurrence data based on identification reliability. It also highlights the importance of automated tools in ensuring reliability in species identification and improving the quality of large-scale studies.
Biodiversity databases are increasingly available and have accelerated advances in many disciplines within ecology and evolution. However, the quality of the evidence generated depends critically on the quality of the input data, and species misidentifications are present in virtually every occurrence dataset. Because automated tools are lacking, assessing the quality of species identification in big datasets is time-consuming, which often leads researchers to assume that all species are reliably identified. In this study, we address this issue by evaluating how species misidentification can impact our ability to capture ecological patterns, and by presenting an R package, naturaList, designed to classify species occurrence data according to identification reliability. naturaList classifies species occurrences into up to six confidence levels, the highest of which is assigned to records identified by specialists. We obtained the list of specialists in two ways: from the species occurrence dataset itself, based on the identifier names within it, and from an independent list obtained by contacting experts. Further, we evaluate how filtering out occurrence records not identified by specialists affects estimates of species niche and diversity patterns. We used the tribe Myrteae (Myrtaceae) as a study model, a species-rich group in Central and South America with challenging taxonomy. We found a significant change in species niche in 13% of species when using only occurrences identified by specialists. We also found changes in patterns of alpha diversity in four genera and changes in beta diversity in all genera analyzed. These results show that uncertainty in species identification in occurrence datasets affects conclusions about macroecological patterns by generating bias or noise in estimates of niche, alpha diversity, and beta diversity. Therefore, to guarantee reliability in species identification in big datasets, we recommend the use of automated tools such as the naturaList package, especially when analyzing variation in species composition. This study also represents a step toward increasing the quality of large-scale studies that rely on species occurrence data.
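
The classify-then-filter idea in the abstract can be illustrated with a minimal Python sketch. It is not the naturaList API (naturaList is an R package and uses up to six confidence levels); this sketch assumes only three simplified labels, and every name in it (`Occurrence`, `classify`, `richness_by_site`, the identifier names, and the toy records) is hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    species: str
    identified_by: str  # who identified the specimen; "" if not recorded
    site: str


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so spelling variants compare equal."""
    return " ".join(name.lower().split())


def classify(records, specialists):
    """Label each record by identification reliability (simplified to 3 levels)."""
    known = {normalize(s) for s in specialists}
    labels = []
    for rec in records:
        who = normalize(rec.identified_by)
        if not who:
            labels.append("unidentified")
        elif who in known:
            labels.append("specialist")
        else:
            labels.append("non_specialist")
    return labels


def richness_by_site(records):
    """Count distinct species per site (a simple alpha diversity measure)."""
    sites = {}
    for rec in records:
        sites.setdefault(rec.site, set()).add(rec.species)
    return {site: len(species) for site, species in sites.items()}


# Toy records, invented for illustration; not data from the study.
records = [
    Occurrence("Eugenia uniflora", "A. Silva", "site_1"),
    Occurrence("Myrcia splendens", "a.  silva", "site_1"),
    Occurrence("Eugenia florida", "J. Doe", "site_1"),
    Occurrence("Myrcia splendens", "", "site_2"),
]
specialists = ["A. Silva"]  # hypothetical specialist list

labels = classify(records, specialists)
reliable = [r for r, label in zip(records, labels) if label == "specialist"]

full_richness = richness_by_site(records)
filtered_richness = richness_by_site(reliable)
```

Comparing `full_richness` with `filtered_richness` shows, on toy data, how restricting the dataset to specialist-identified records can change a per-site diversity estimate, which is the kind of effect the study measures at a much larger scale.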
