Article

A multimodal generative and fusion framework for recognizing faculty homepages

Journal

INFORMATION SCIENCES
Volume 525, Pages 205-220

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2020.03.005

Keywords

Homepages; Multimodal generative adversarial network; Gated fusion network

Funding

  1. National Natural Science Foundation of China (NSFC) [71671141, 71873108]
  2. Fundamental Research Funds for the Central Universities [JBK 171113, JBK 170505, JBK 1806003]
  3. Sichuan Province Science and Technology Department [2019YJ0250]
  4. Key Laboratory of Internet Natural Language Processing of the Sichuan Provincial Education Department
  5. Financial Innovation Center of Southwestern University of Finance and Economics
  6. Financial Intelligence and Financial Engineering Key Laboratory of Sichuan Province

Abstract

Multimodal data consist of several data modes, where each mode is a group of similar data sharing the same attributes. Recognizing faculty homepages is essentially a multimodal classification problem in which a target faculty homepage is identified from three information sources: text, images, and layout. Previous studies have either concatenated features from the various information sources into a compound vector or fed them separately into several classifiers that are then combined into a stronger classifier for the final prediction. However, both approaches ignore the connections among the different feature sets. We argue that such relations are essential to improving multimodal classification. In addition, recognizing faculty homepages is a class imbalance problem in which the minority class has far fewer samples than the other classes. In this study, we propose a multimodal generative and fusion framework for multimodal learning that addresses both imbalanced data and mutually dependent feature modes. Specifically, a multimodal generative adversarial network is first introduced to rebalance the dataset by generating pseudo features for each mode and combining them to describe a fake sample. Then, a gated fusion network with gate and fusion mechanisms is presented, which reduces noise to improve generalization ability and captures the links among the different feature modes. Experiments on a faculty homepage dataset show the superiority of the proposed framework. © 2020 Published by Elsevier Inc.
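To make the idea of gated fusion more concrete, the following is a minimal sketch in plain Python. It is not the architecture from the paper, whose details the abstract does not give. It assumes each mode (text, image, layout) has already been encoded as a feature vector of the same length, and that each mode receives one scalar gate computed from the concatenation of all modes. The function name `gated_fusion`, the parameters `weights` and `biases`, and the example values are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(modes, weights, biases):
    """Fuse per-mode feature vectors with one scalar gate per mode.

    modes:   dict mode name -> feature vector (list of floats, equal lengths)
    weights: dict mode name -> gate weights over the concatenated context
    biases:  dict mode name -> gate bias
    Returns (fused_vector, gates). Hypothetical design, for illustration only.
    """
    names = sorted(modes)
    # The gate for each mode sees every mode, so it can react to
    # cross-modal relations rather than to its own features alone.
    context = [x for name in names for x in modes[name]]
    gates = {}
    for name in names:
        score = sum(w * x for w, x in zip(weights[name], context)) + biases[name]
        gates[name] = sigmoid(score)
    dim = len(modes[names[0]])
    # A gate near 0 suppresses a noisy mode; a gate near 1 passes it through.
    fused = [sum(gates[n] * modes[n][i] for n in names) for i in range(dim)]
    return fused, gates


# Made-up feature vectors for one page; real features would come from encoders.
example_modes = {
    "text": [0.2, -0.1, 0.4],
    "image": [0.5, 0.3, -0.2],
    "layout": [0.1, 0.0, 0.7],
}
rng = random.Random(0)  # random (untrained) gate parameters, for illustration
example_weights = {n: [rng.uniform(-1, 1) for _ in range(9)] for n in example_modes}
example_biases = {n: 0.0 for n in example_modes}
example_fused, example_gates = gated_fusion(example_modes, example_weights, example_biases)
```

In a trained network the gate parameters would be learned jointly with the classifier; here they are random, so the gate values carry no meaning beyond showing the data flow.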
