Article

Multimodal Affective Computing With Dense Fusion Transformer for Inter- and Intra-Modality Interactions

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

CTNet: Context-Based Tandem Network for Semantic Segmentation

Zechao Li et al.

Summary: This study proposes a novel Context-based Tandem Network (CTNet) that explores spatial and channel contextual information for semantic segmentation. CTNet adaptively integrates the outputs of its two context modules to learn better representations, and it demonstrates superior performance.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Information Systems

A Unimodal Representation Learning and Recurrent Decomposition Fusion Structure for Utterance-Level Multimodal Embedding Learning

Sijie Mai et al.

Summary: Learning a unified embedding for utterance-level video has recently attracted significant interest. This work explores high-level representations that carry more representative and abstract semantic information. Through new fusion processes and structures, it merges the representations of all modalities into a unified embedding and achieves improved performance.

IEEE TRANSACTIONS ON MULTIMEDIA (2022)

Article Computer Science, Artificial Intelligence

Multi-Fusion Residual Memory Network for Multimodal Human Sentiment Comprehension

Sijie Mai et al.

Summary: This article introduces a hierarchical learning architecture for multimodal human sentiment comprehension and proposes methods to address issues of time-dependent interactions and long sequence processing.

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING (2022)

Article Computer Science, Information Systems

A Matrix Factorization Based Framework for Fusion of Physical and Social Sensors

Yuhui Wang et al.

Summary: This paper proposes a novel unified matrix factorization-based model to fuse physical and social sensor signals for spatio-temporal analysis, addressing challenges caused by data noise and heterogeneous data. Experimental results demonstrate that the proposed approach performs better in various situational understanding tasks.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Information Systems

LD-MAN: Layout-Driven Multimodal Attention Network for Online News Sentiment Recognition

Wenya Guo et al.

Summary: The prevalent use of both images and text on the web necessitates multimodal sentiment recognition. Predicting the sentiment of readers after they read online news articles is challenging because of the articles' complex structures. A layout-driven multimodal attention network is proposed to address this issue.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Information Systems

Self-Adaptive Neural Module Transformer for Visual Question Answering

Huasong Zhong et al.

Summary: Vision and language understanding is a fundamental and difficult task in Multimedia Intelligence, with Visual Question Answering (VQA) being even more challenging. A novel Self-Adaptive Neural Module Transformer (SANMT) is proposed in this paper, which dynamically adjusts question feature encoding and layout decoding based on intermediate Q&A results. Extensive experiments show SANMT outperforms NMN on several benchmarks.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Acoustics

Analyzing Multimodal Sentiment Via Acoustic- and Visual-LSTM With Channel-Aware Temporal Convolution Network

Sijie Mai et al.

Summary: The study focuses on learning inter-modality dynamics through acoustic- and visual-LSTMs where language features play a dominant role. A 'channel-aware' temporal convolution network is introduced in the unimodal representation learning stage to extract high-level representations for each modality.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Article Computer Science, Information Systems

Locally Confined Modality Fusion Network With a Global Perspective for Multimodal Human Affective Computing

Sijie Mai et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2020)

Article Computer Science, Information Systems

A Deep Multi-task Contextual Attention Framework for Multi-modal Affect Analysis

Md Shad Akhtar et al.

ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA (2020)

Article Computer Science, Artificial Intelligence

Deep Collaborative Embedding for Social Image Understanding

Zechao Li et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2019)

Article Computer Science, Information Systems

Robust unsupervised domain adaptation for neural networks via moment alignment

Werner Zellinger et al.

INFORMATION SCIENCES (2019)

Article Computer Science, Artificial Intelligence

Multimodal sentiment analysis using hierarchical fusion with context modeling

N. Majumder et al.

KNOWLEDGE-BASED SYSTEMS (2018)

Editorial Material Computer Science, Artificial Intelligence

Multimodal Sentiment Intensity Analysis in Videos: Facial Gestures and Verbal Messages

Amir Zadeh et al.

IEEE INTELLIGENT SYSTEMS (2016)

Article Computer Science, Information Systems

A Survey on Visual Analytics of Social Media Data

Yingcai Wu et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2016)

Proceedings Paper Computer Science, Artificial Intelligence

Deep Multimodal Fusion for Persuasiveness Prediction

Behnaz Nojavanasghari et al.

ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (2016)

Article Computer Science, Information Systems

Weakly Supervised Deep Metric Learning for Community-Contributed Image Retrieval

Zechao Li et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2015)

Article Computer Science, Information Systems

Using Audio-Derived Affective Offset to Enhance TV Recommendation

Sven Ewan Shepstone et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2014)

Article Computer Science, Artificial Intelligence

YouTube Movie Reviews: Sentiment Analysis in an Audio-Visual Context

Martin Woellmer et al.

IEEE INTELLIGENT SYSTEMS (2013)

Article Acoustics

Speaker identification on the SCOTUS corpus

Jiahong Yuan et al.

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA (2008)

Article Computer Science, Interdisciplinary Applications

IEMOCAP: interactive emotional dyadic motion capture database

Carlos Busso et al.

LANGUAGE RESOURCES AND EVALUATION (2008)