Article

Assisting Multimodal Named Entity Recognition by cross-modal auxiliary tasks

Journal

PATTERN RECOGNITION LETTERS
Volume 175, Issue -, Pages 52-58

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2023.10.004

Keywords

Multimodal named entity recognition; Multi-task learning; Cross-modal learning


This paper introduces a method for improving Multimodal Named Entity Recognition (MNER) through cross-modal auxiliary tasks. The method uses cross-modal matching and cross-modal mutual information maximization to address mismatched image-text pairs, and separates the features of the main task and the auxiliary tasks through a cross-modal gate-control mechanism.
Although existing Multimodal Named Entity Recognition (MNER) methods have achieved promising performance, they suffer from two drawbacks in social media scenarios. First, most existing methods rest on the strong assumption that the textual content and the associated image are matched, which does not always hold in real scenarios. Second, current methods fail to filter out modality-specific random noise, which prevents models from exploiting modality-shared features. This paper puts forward a novel multi-task multimodal learning architecture that aims to improve MNER performance through cross-modal auxiliary tasks (CMAT). Specifically, the shared and task-specific features of the main task and the auxiliary tasks are first separated by a cross-modal gate-control mechanism. Then, without extra pre-processing or annotations, cross-modal matching is used to address the issue of mismatched image-text pairs, and cross-modal mutual information maximization is used to retain the most relevant cross-modal features. Experimental results on two widely used datasets confirm the superiority of the proposed approach.
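To illustrate the gate-control idea mentioned in the abstract, the following is a minimal sketch of sigmoid-gated fusion of textual and visual features. It is only a generic illustration of such a mechanism, not the authors' exact formulation; the function name `cross_modal_gate` and the parameterization are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_gate(text_feat, image_feat, W, b):
    """Gate-control fusion: a sigmoid gate decides, per dimension, how much
    visual information flows into the textual representation.
    (Illustrative form only; the paper's exact mechanism may differ.)"""
    gate = sigmoid(np.concatenate([text_feat, image_feat]) @ W + b)
    # Convex combination: each output dimension lies between the two inputs.
    return gate * text_feat + (1.0 - gate) * image_feat

d = 4
text_feat = rng.standard_normal(d)
image_feat = rng.standard_normal(d)
W = rng.standard_normal((2 * d, d)) * 0.1  # toy, randomly initialized gate weights
b = np.zeros(d)

fused = cross_modal_gate(text_feat, image_feat, W, b)
print(fused.shape)  # (4,)
```

Because the gate output lies in (0, 1), the fused vector is a per-dimension convex combination of the two modality features, which is what lets a learned gate suppress modality-specific noise while passing through shared features.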
