Article

A Deep Multimodal Learning Approach to Perceive Basic Needs of Humans From Instagram Profile

Journal

IEEE Transactions on Affective Computing
Volume 14, Issue 2, Pages 944-956

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TAFFC.2021.3090809

Keywords

Social media; multi-modal learning; multi-label classifier; choice theory; deep learning; bag of content


Nowadays, a significant part of our time is spent sharing multimodal data on social media sites such as Instagram, Facebook, and Twitter. The particular way in which users present themselves on social media can provide useful insights into their behaviours, personalities, perspectives, motives, and needs. This article proposes to use multimodal data collected from Instagram accounts to predict the five basic prototypical needs described in Glasser's choice theory (i.e., Survival, Power, Freedom, Belonging, and Fun). We automate the identification of unconsciously perceived needs from Instagram profiles using both visual and textual content. The proposed approach aggregates the visual and textual features extracted with deep learning and constructs a homogeneous representation of each profile through the proposed Bag-of-Content. Finally, we perform multi-label classification on the fusion of both modalities. We validate our proposal on a large database, consensually annotated by two expert psychologists, comprising more than 30,000 images, captions, and comments. Experiments show promising accuracy and complementary information between visual and textual cues.
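
To make the described pipeline concrete, the sketch below is a rough, assumed analogue and not the authors' code: random vectors stand in for the deep visual and textual embeddings, a k-means codebook plays the role of the Bag-of-Content quantization, fusion is plain histogram concatenation, and a one-vs-rest logistic regression serves as the multi-label classifier over the five needs. The function name bag_of_content, the codebook size, and the classifier choice are all illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of a Bag-of-Content-style
# pipeline: per-post deep features are quantized against a learned codebook,
# each profile becomes a normalized histogram, visual and textual histograms
# are fused by concatenation, and a multi-label classifier predicts the needs.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

rng = np.random.default_rng(0)
NEEDS = ["Survival", "Power", "Freedom", "Belonging", "Fun"]
N_PROFILES, POSTS_PER_PROFILE, DIM, CODEBOOK = 40, 30, 128, 16

# Stand-in per-post embeddings; a real pipeline would use a CNN for images
# and a text encoder for captions and comments.
visual = [rng.normal(size=(POSTS_PER_PROFILE, DIM)) for _ in range(N_PROFILES)]
textual = [rng.normal(size=(POSTS_PER_PROFILE, DIM)) for _ in range(N_PROFILES)]

def bag_of_content(per_profile_feats, k=CODEBOOK):
    """Quantize post-level features into a fixed-length per-profile histogram."""
    codebook = KMeans(n_clusters=k, n_init=10, random_state=0).fit(
        np.vstack(per_profile_feats))
    hists = []
    for feats in per_profile_feats:
        counts = np.bincount(codebook.predict(feats), minlength=k)
        hists.append(counts / counts.sum())  # normalize by post count
    return np.array(hists)

# Fuse modalities by concatenating the two histograms (one simple fusion choice).
X = np.hstack([bag_of_content(visual), bag_of_content(textual)])

# Toy multi-label annotations; in the paper, labels come from the consensus
# of two expert psychologists.
labels = [list(rng.choice(NEEDS, size=rng.integers(1, 4), replace=False))
          for _ in range(N_PROFILES)]
Y = MultiLabelBinarizer(classes=NEEDS).fit_transform(labels)

# One-vs-rest logistic regression as a stand-in multi-label classifier.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(clf.predict(X[:3]))  # binary indicator rows over the five needs
```

Normalizing each histogram by the profile's post count keeps profiles with different activity levels comparable, which is why a fixed-length, homogeneous per-profile representation is needed before the two modalities can be fused.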
