4.5 Article

Assisting Multimodal Named Entity Recognition by cross-modal auxiliary tasks

Related references

Note: Only some of the references are listed.

Letter Computer Science, Information Systems

Deep active sampling with self-supervised learning

Haochen Shi et al.

FRONTIERS OF COMPUTER SCIENCE (2023)

Letter Computer Science, Information Systems

Person video alignment with human pose registration

Yu Zhang et al.

FRONTIERS OF COMPUTER SCIENCE (2023)

Article Computer Science, Artificial Intelligence

Transformer-based approach for joint handwriting and named entity recognition in historical document

Ahmed Cheikh Rouhou et al.

Summary: This paper proposes an end-to-end transformer-based approach that jointly performs text transcription and named entity recognition in handwritten documents. The approach operates at the paragraph level, avoiding early errors and improving prediction accuracy. Different training scenarios and a two-stage learning strategy are explored and shown to be effective.

PATTERN RECOGNITION LETTERS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

MAF: A General Matching and Alignment Framework for Multimodal Named Entity Recognition

Bo Xu et al.

Summary: This paper studies multimodal named entity recognition in social media posts and proposes a general matching and alignment framework. By introducing novel cross-modal matching and alignment modules, the weaknesses of current methods are addressed, resulting in improved effectiveness and efficiency of the multimodal named entity recognition model.

WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (2022)

Article Computer Science, Artificial Intelligence

Deep multimodal learning for cross-modal retrieval: One model for all tasks

L. Viviana Beltran Beltran et al.

Summary: The study investigates the effectiveness of applying a successful VQA model to a cross-modal retrieval system, which combines visual and textual representations for information search through deep multimodal learning. Results indicate that the model produces improved or competitive outcomes in various retrieval tasks.

PATTERN RECOGNITION LETTERS (2021)

Article Computer Science, Artificial Intelligence

Self-attention-based conditional random fields latent variables model for sequence labeling

Yinan Shao et al.

Summary: Natural Language Processing (NLP) is a valuable tool for processing data such as text and speech, and sequence labeling is a vital part of it, supporting techniques like text classification, machine translation, and sentiment analysis. Two novel frameworks, SA-CRFLV-I and SA-CRFLV-II, which use latent variables within conditional random fields, outperform four well-known sequence prediction methods on well-known metrics.

PATTERN RECOGNITION LETTERS (2021)

Article Computer Science, Artificial Intelligence

Fuzzy commonsense reasoning for multimodal sentiment analysis

Iti Chaturvedi et al.

PATTERN RECOGNITION LETTERS (2019)

Article Computer Science, Artificial Intelligence

Recognizing irregular entities in biomedical text via deep neural networks

Fei Li et al.

PATTERN RECOGNITION LETTERS (2018)

Article Computer Science, Artificial Intelligence

Effective integration of morphological analysis and named entity recognition based on a recurrent neural network

Hyeon-gu Lee et al.

PATTERN RECOGNITION LETTERS (2018)