4.8 Article

Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2019.2927476

Keywords

Cross-modal; deep learning; cooking recipes; food images

Funding

  1. CSAIL-QCRI collaboration projects and the framework of projects - Spanish Ministerio de Economia y Competitividad [TEC2013-43935R, TEC2016-75976-R]
  2. European Regional Development Fund

Ask authors/readers for more resources

This paper introduces Recipe 1M+, a large-scale corpus of cooking recipes and food images, and demonstrates how training neural networks on this data can improve image-recipe retrieval tasks. Regularization through the addition of a high-level classification objective not only enhances retrieval performance but also enables semantic vector arithmetic.
In this paper, we introduce Recipe 1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipes 1M+ affords the ability to train high-capacity models on aligned, multimodal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipes 1M+ dataset and food and cooking in general. Code, data and models are publicly available.(1)

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.8
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available