Article

YOLO-MSA: A Multiscale Stereoscopic Attention Network for Empty-Dish Recycling Robots

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIM.2023.3315355

Keywords

Attention mechanisms; empty-dish recycling robots; multiscale feature learning; object detection


As the global population ages and the labor force shrinks, using artificial intelligence to raise labor productivity has become a popular topic. This article proposes a multiscale stereoscopic attention network, YOLO-MSA, to detect postprandial dishes for empty-dish recycling robots; extensive experiments demonstrate its effectiveness and robustness.
As the global population ages and the labor force shrinks, using artificial intelligence (AI) technology to promote labor productivity growth has become a hot topic. The emergence of empty-dish recycling robots has helped alleviate the impact of declining labor productivity. This article proposes a multiscale stereoscopic attention (MSA) network, YOLO-MSA, to detect postprandial dishes for empty-dish recycling robots. First, the standard convolution is replaced with a Res2Net module, which improves the multiscale expressiveness of the network at a finer-grained level. Second, we combine Res2Net modules with different dilation rates and a novel stereoscopic attention mechanism into an MSA module, which is used for coarse-grained multiscale expression. Third, for multiscale feature learning during dimensionality reduction, we propose dimension reduction spatial pyramid pooling (DRSPP) to fuse feature maps of different scales. Extensive experiments demonstrate the effectiveness of the proposed MSA module for multiscale feature learning. Furthermore, YOLO-MSA achieves 98.47% mean Average Precision (mAP) on Dish-21, a dataset of postprandial dishes, which is much higher than that of other state-of-the-art (SOTA) models, and reaches an inference speed of 33.93 frames per second (FPS), which meets the real-time detection needs of the empty-dish recycling robot. Test results on other public datasets show that the proposed YOLO-MSA has better generalization ability. In summary, YOLO-MSA exhibits satisfactory multiscale feature expression ability, demonstrates effectiveness and robustness in postprandial dish detection, and has far-reaching significance for the development of empty-dish recycling robots.
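The abstract reports its headline result as mean Average Precision (mAP). As a rough illustration of how such a figure is typically computed, the sketch below uses all-point interpolated AP per class and averages across classes. It is not the authors' evaluation code: the abstract does not state the IoU threshold or interpolation scheme, so the function names, the class labels, and the assumption that detections are already matched to ground truth are all hypothetical.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, whether it matched a ground-truth box).
# Matching (e.g. by an IoU threshold) is assumed to have been done already.
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_ground_truth: int) -> float:
    """All-point interpolated AP for a single class (illustrative sketch)."""
    if num_ground_truth == 0 or not detections:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision with the maximum precision at any higher recall,
    # giving the monotone envelope of the precision-recall curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the envelope over each recall increment.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Unweighted mean of per-class AP values."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical toy data with two made-up classes, for illustration only.
toy_results = {
    "plate": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "bowl": ([(0.95, True)], 1),
}
toy_map = mean_average_precision(toy_results)
```

With the toy data above, the "plate" class has one false positive ranked between two true positives, which lowers its AP below 1, while "bowl" is detected perfectly. A real evaluation would first match predicted boxes to annotations before calling these functions.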

