4.7 Article

MVANet: Multi-Task Guided Multi-View Attention Network for Chinese Food Recognition

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 3551-3561

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2020.3028478

Keywords

Task analysis; Semantics; Feature extraction; Image recognition; Deep learning; Shape; Fuses; Food recognition; convolutional neural network; multi-task learning; multi-view attention

Funding

  1. China National Science Foundation [61273363, 61976092]
  2. Guangdong Province Key Area R&D Plan Project [2020B1111120001]
  3. Natural Science Foundation of Guangdong [2018A030313356]
  4. Guangzhou Science and Technology Planning Project [201604020179, 201803010088]


Food recognition is critical in healthcare applications, but current methods face challenges due to diverse appearances and non-uniform composition of ingredients. The proposed Multi-View Attention Network incorporates multiple semantic features for comprehensive representation. Experiments show significant improvement in performance and reduced parameter size.
Food recognition plays a critical role in various health-care applications. However, it poses many challenges to current approaches because food dishes vary widely in appearance and the composition of ingredients is non-uniform even among foods in the same category. First, current methods focus primarily on the appearance of foods without considering their semantic information, so they easily attend to the wrong areas of food images. Second, these methods lack dynamic weighting of multiple semantic features in the modeling process. This paper therefore proposes a novel Multi-View Attention Network within the multi-task learning framework that incorporates multiple semantic features into the food recognition task from both ingredient recognition and recipe modeling. It also uses a multi-view attention mechanism to automatically adjust the weights of the different semantic features and enables the different tasks to interact with each other, so as to obtain a more comprehensive feature representation. Experiments conducted on both the ChineseFoodNet and VIREO Food-172 benchmark databases validate the proposed method, showing a clear improvement in performance together with a smaller parameter size.
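
The idea of dynamically weighting several semantic features can be illustrated with a minimal sketch. This is not the authors' architecture: the function names `softmax` and `fuse_views`, the dot-product scoring, and the toy vectors are all assumptions made for illustration. It only shows how attention weights computed from the features themselves can decide how much each "view" (for example, a dish, ingredient, or recipe feature) contributes to a fused representation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_views(views, query):
    """Fuse several feature vectors ("views") with attention weights.

    views: list of equal-length feature vectors (hypothetical stand-ins
           for per-task semantic features).
    query: vector of the same length used to score each view
           (scaled dot-product scoring is an assumption, not the paper's method).
    Returns (fused_vector, weights).
    """
    dim = len(query)
    # Score each view by its scaled dot product with the query.
    scores = [
        sum(v * q for v, q in zip(view, query)) / math.sqrt(dim)
        for view in views
    ]
    # Normalize scores into weights that sum to 1.
    weights = softmax(scores)
    # Weighted sum of the views, computed per dimension.
    fused = [
        sum(w * view[j] for w, view in zip(weights, views))
        for j in range(dim)
    ]
    return fused, weights

# Toy example with three 2-dimensional views.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
fused, weights = fuse_views(views, query)
```

In this toy example the first and third views align with the query equally and so receive the same weight, while the second view, which is orthogonal to the query, receives a smaller one.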

