Journal
ELECTRONICS
Volume 10, Issue 3, Pages: -
Publisher
MDPI
DOI: 10.3390/electronics10030279
Keywords
object-detection metrics; precision; recall; evaluation; automatic assessment; bounding boxes
Funding
- Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) [001]
- Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
- Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ)
- Google Latin America Research Awards (LARA) 2020
This study provides an overview of evaluation methods used in object detection competitions, examines the influence of different annotation formats on evaluation results, and offers an open-source toolkit supporting various annotation formats and performance metrics for researchers to evaluate their detection algorithms. Furthermore, it introduces a new metric for evaluating object detection in videos based on spatio-temporal overlap.
Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts has increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) it provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most commonly used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms on most well-known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos that is based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.
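The two technical ideas in the abstract can be sketched in a few lines of Python: an annotation-format conversion (COCO-style `(x, y, width, height)` boxes versus corner-coordinate boxes), the standard per-frame intersection over union, and a spatio-temporal overlap between ground-truth and detected "tubes" of boxes across video frames. This is a minimal illustration only; the function names are hypothetical and the paper's exact metric definition may differ in detail.

```python
def coco_to_xyxy(box):
    """COCO stores boxes as (x, y, width, height); convert to corner form (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    """Area of overlap between two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iou(a, b):
    """Standard single-frame intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def spatio_temporal_iou(gt_tube, det_tube):
    """Overlap of two tubes (dicts mapping frame index -> box): summed
    per-frame intersection divided by summed per-frame union, over every
    frame where either tube is present. Frames where only one tube exists
    contribute to the union but not the intersection, penalizing both
    missed frames and spurious detections."""
    inter_sum = union_sum = 0.0
    for f in set(gt_tube) | set(det_tube):
        g, d = gt_tube.get(f), det_tube.get(f)
        if g is None:
            union_sum += area(d)   # detection with no ground truth in this frame
        elif d is None:
            union_sum += area(g)   # ground truth missed in this frame
        else:
            i = intersection(g, d)
            inter_sum += i
            union_sum += area(g) + area(d) - i
    return inter_sum / union_sum if union_sum > 0 else 0.0
```

For example, a detection that matches the ground truth perfectly in one of two annotated frames but is absent in the other scores a spatio-temporal IoU of 0.5, whereas a purely per-frame IoU on the matched frame alone would report 1.0.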