4.6 Article

Semi-supervised deep learning and low-cost cameras for the semantic segmentation of natural images in viticulture

Journal

PRECISION AGRICULTURE
Volume 23, Issue 6, Pages 2001-2026

Publisher

SPRINGER
DOI: 10.1007/s11119-022-09929-9

Keywords

Semantic segmentation; Semi-supervised learning; Grape bunches; Natural images; Agricultural robot sensing

Funding

  1. Ministerio de Ciencia e Innovacion
  2. FPI grant from Community of La Rioja
  3. Agricultural Interoperability and Analysis System (ATLAS)
  4. European Union [857125]
  5. Multimodal Sensing for Individual Plant Phenotyping in agriculture robotics (ANTONIO)
  6. ICT-AGRI-FOOD COFUND [41946]
  7. E-CROPS -Tecnologie per l'Agricoltura Digitale Sostenibile, PON Ricerca e Innovazione [ARS01_01136]
  8. H2020 Societal Challenges Programme [857125]

This study proposes a deep learning-based method for the semantic segmentation of natural images acquired by low-cost cameras and improves segmentation accuracy through three semi-supervised learning methods. Different network architectures achieved the highest accuracy on different classes. The effects of manual annotation on accuracy and on time requirements are also discussed.
Automatic yield monitoring and in-field robotic harvesting with low-cost cameras require object detection and segmentation solutions that cope with the poor quality of natural images and the lack of exactly labeled datasets of consistent size. This work proposes applying deep learning to the semantic segmentation of natural images acquired by a low-cost RGB-D camera in a commercial vineyard. Several deep architectures were trained and compared on 85 labeled images. Three semi-supervised learning methods (PseudoLabeling, Distillation and Model Distillation) were proposed to take advantage of 320 non-annotated images. In these experiments, the DeepLabV3+ architecture with a ResNeXt50 backbone, trained on the set of labeled images, achieved the best overall accuracy of 84.78%. In contrast, the MAnet architecture combined with the EfficientNet-B3 backbone reached the highest accuracy for the bunch class (85.69%). Applying the semi-supervised learning methods boosted segmentation accuracy by between 5.62 and 6.01%, on average. Further discussion shows the effects of fine-grained manual image annotation on the accuracy of the proposed methods and compares time requirements.
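As a rough illustration of the pseudo-labeling idea mentioned in the abstract, the sketch below turns a model's per-pixel class probabilities on an unlabeled image into training labels, and scores a prediction by pixel accuracy. It is a generic sketch, not the authors' procedure: the confidence threshold, the `IGNORE` value, and the function names `pseudo_label` and `pixel_accuracy` are all assumptions made for this example, and the abstract does not say whether low-confidence pixels were filtered at all.

```python
from typing import List, Sequence

# Hypothetical marker for pixels that are too uncertain to train on.
IGNORE = 255


def pseudo_label(probs: Sequence[Sequence[float]], threshold: float = 0.9) -> List[int]:
    """Convert per-pixel class probabilities into pseudo-labels.

    probs[i] holds the predicted class probabilities for pixel i.
    A pixel keeps its most likely class only if that probability is at
    least `threshold` (an assumed value); otherwise it is marked IGNORE.
    """
    labels = []
    for pixel_probs in probs:
        best_class = max(range(len(pixel_probs)), key=lambda c: pixel_probs[c])
        confident = pixel_probs[best_class] >= threshold
        labels.append(best_class if confident else IGNORE)
    return labels


def pixel_accuracy(pred: Sequence[int], target: Sequence[int], ignore: int = IGNORE) -> float:
    """Fraction of non-ignored target pixels whose predicted class matches."""
    pairs = [(p, t) for p, t in zip(pred, target) if t != ignore]
    if not pairs:
        return 0.0
    return sum(p == t for p, t in pairs) / len(pairs)
```

In a semi-supervised loop of this kind, a model trained on the labeled images would produce `probs` for each non-annotated image, and the resulting pseudo-labels would be added to the training set for a further round of training, with `IGNORE` pixels excluded from the loss.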
