4.8 Article

Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2019.2927476

Keywords

Cross-modal; deep learning; cooking recipes; food images

Funding

  1. CSAIL-QCRI collaboration projects and the framework of projects - Spanish Ministerio de Economia y Competitividad [TEC2013-43935R, TEC2016-75976-R]
  2. European Regional Development Fund

This paper introduces Recipe1M+, a large-scale corpus of cooking recipes and food images, and demonstrates how training neural networks on this data can improve image-recipe retrieval. Regularization through the addition of a high-level classification objective not only enhances retrieval performance but also enables semantic vector arithmetic.
In this paper, we introduce Recipe1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipe1M+ affords the ability to train high-capacity models on aligned, multimodal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M+ dataset and food and cooking in general. Code, data and models are publicly available.
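
To make the two uses of a joint embedding mentioned above more concrete, the following is a minimal sketch of (a) ranking recipes against an image embedding by cosine similarity and (b) semantic vector arithmetic. It is not the authors' model or code: the 3-dimensional vectors, the recipe names, and the helpers `normalize`, `cosine`, and `retrieve` are all hypothetical and chosen only for illustration. Real embeddings would come from the trained network.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def retrieve(query, candidates):
    """Return candidate names ranked by similarity to the query, best first."""
    return sorted(candidates,
                  key=lambda name: cosine(query, candidates[name]),
                  reverse=True)

# Toy vectors standing in for learned recipe embeddings (made up, not real data).
recipe_embeddings = {
    "chicken pizza": [0.9, 0.1, 0.8],
    "chicken salad": [0.1, 0.9, 0.8],
    "pizza":         [0.9, 0.1, 0.0],
    "salad":         [0.1, 0.9, 0.0],
}

# (a) Image-to-recipe retrieval: a hypothetical embedding of a food photo
# is compared against every recipe embedding in the shared space.
image_embedding = [0.85, 0.15, 0.75]
retrieval_ranking = retrieve(image_embedding, recipe_embeddings)

# (b) Vector arithmetic: "chicken pizza" - "pizza" + "salad".
arithmetic_query = [
    cp - p + s
    for cp, p, s in zip(recipe_embeddings["chicken pizza"],
                        recipe_embeddings["pizza"],
                        recipe_embeddings["salad"])
]
arithmetic_ranking = retrieve(arithmetic_query, recipe_embeddings)

print(retrieval_ranking[0])   # nearest recipe to the image embedding
print(arithmetic_ranking[0])  # nearest recipe to the arithmetic result
```

With these toy vectors the image query lands closest to "chicken pizza" and the arithmetic query closest to "chicken salad". That outcome is built into the hand-picked numbers and says nothing about the behavior of the published embeddings.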
