Article

Research on Automatic Classification and Detection of Mutton Multi-Parts Based on Swin-Transformer

Journal

FOODS
Volume 12, Issue 8

Publisher

MDPI
DOI: 10.3390/foods12081642

Keywords

mutton processing; computer vision; deep learning; classification; detection; livestock meat

This paper proposes a mutton multi-part classification and detection method based on the Swin-Transformer. Image augmentation is used to increase the sample size and mitigate the long-tailed, imbalanced distribution of the dataset. The best model is selected by comparing three Swin-Transformer variants trained with transfer learning. The proposed method performs well in terms of accuracy, robustness, generalization, and anti-occlusion ability, outperforms five commonly used object detection methods, and meets real-time processing requirements.
To achieve real-time classification and detection of mutton multi-part, this paper proposes a mutton multi-part classification and detection method based on the Swin-Transformer. First, image augmentation techniques are adopted to increase the sample size of the sheep thoracic vertebrae and scapulae, addressing the long-tailed distribution and class imbalance of the dataset. Then, the performance of three structural variants of the Swin-Transformer (Swin-T, Swin-B, and Swin-S) is compared through transfer learning, and the optimal model is obtained. On this basis, the robustness, generalization, and anti-occlusion abilities of the model are tested and analyzed using the significant multiscale features of the lumbar vertebrae and thoracic vertebrae, by simulating different lighting environments and occlusion scenarios, respectively. Furthermore, the model is compared with five methods commonly used in object detection tasks, namely Sparse R-CNN, YOLOv5, RetinaNet, CenterNet, and HRNet, and its real-time performance is tested at the following pixel resolutions: 576 × 576, 672 × 672, and 768 × 768. The results show that the proposed method achieves a mean average precision (mAP) of 0.943, while the mAP values for the robustness, generalization, and anti-occlusion tests are 0.913, 0.857, and 0.845, respectively. Moreover, the model outperforms the five aforementioned methods, with mAP values that are higher by 0.009, 0.027, 0.041, 0.050, and 0.113, respectively. The average processing time of a single image with this model is 0.25 s, which meets the production line requirements. In summary, this study presents an efficient and intelligent mutton multi-part classification and detection method, which can provide technical support for the automatic sorting of mutton as well as for the processing of other livestock meat.
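
The comparison figures above are stated as margins rather than as absolute scores. The short sketch below works through the arithmetic: it subtracts each stated margin from the proposed model's mAP of 0.943 to get the mAP implied for each baseline, and converts the 0.25 s per-image time into a throughput. The implied baseline values are derived here and are not quoted in the abstract, so they should be read as approximate (the published margins may be rounded). The function and variable names are illustrative only.

```python
# Figures quoted in the abstract.
PROPOSED_MAP = 0.943
SECONDS_PER_IMAGE = 0.25

# Margin by which the proposed model's mAP exceeds each baseline,
# in the order the abstract lists the five methods.
MARGINS = {
    "Sparse R-CNN": 0.009,
    "YOLOv5": 0.027,
    "RetinaNet": 0.041,
    "CenterNet": 0.050,
    "HRNet": 0.113,
}


def implied_baseline_maps(proposed, margins):
    """Return each baseline's implied mAP, rounded to three decimals.

    These are derived values (proposed mAP minus the stated margin),
    not numbers reported in the abstract.
    """
    return {name: round(proposed - gap, 3) for name, gap in margins.items()}


def images_per_second(seconds_per_image):
    """Convert an average per-image processing time into throughput."""
    return 1.0 / seconds_per_image


baselines = implied_baseline_maps(PROPOSED_MAP, MARGINS)
throughput = images_per_second(SECONDS_PER_IMAGE)

for name, value in baselines.items():
    print(f"{name}: implied mAP {value:.3f}")
print(f"Throughput: {throughput:.1f} images/s")
```

Under these figures, the closest baseline trails the proposed model by less than one point of mAP, and a 0.25 s average corresponds to about four images per second.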
