Related references
Note: Only part of the references are listed.
Article
Computer Science, Information Systems
Desheng Cai et al.
Summary: This paper proposes a novel Heterogeneous Hierarchical Feature Aggregation Network (HHFAN) for personalized micro-video recommendation. The network aims to explore the relationships among users, micro-videos, and related multi-modal information, and generate high-quality user and micro-video embeddings. Experimental results demonstrate that the proposed model outperforms baseline methods.
IEEE TRANSACTIONS ON MULTIMEDIA
(2022)
Article
Computer Science, Artificial Intelligence
Dawei Zhao et al.
Summary: This paper proposes a novel method for multi-view multi-label learning that enhances learning effectiveness by using view-specific labels and maximizing label-feature dependence. Experimental results show that the proposed method outperforms existing methods on several benchmark datasets.
APPLIED SOFT COMPUTING
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Ze Liu et al.
Summary: This paper introduces a Transformer architecture with a bias towards locality in video recognition, achieving a better balance between speed and accuracy compared to global self-attention mechanisms; by adapting the Swin Transformer and leveraging pre-trained models, it achieves state-of-the-art accuracy on various video recognition benchmarks.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022)
(2022)
Article
Engineering, Electrical & Electronic
Zhenxing Zhou et al.
Summary: Continuous sign language recognition (CSLR) is a challenging task that utilizes multiple input modalities to improve recognition accuracy. However, the modality differences make it difficult to define an integrative framework. To address this, a novel deep learning framework called CA-SignBERT is proposed, which utilizes multiple BERT models and a special cross-attention mechanism to analyze information from different modalities.
IEEE SIGNAL PROCESSING LETTERS
(2022)
Article
Engineering, Electrical & Electronic
Yiyang Teng et al.
Summary: This study proposes a pre-trained multimodal feature learning framework that trains the model on unlabeled video data through self-supervised learning, and then applies it to social relationship recognition tasks. A multimodal instance interaction transformer is designed to capture interactions between visual and textual information, while pre-training ensures state-of-the-art results on a public benchmark.
IEEE SIGNAL PROCESSING LETTERS
(2022)
Article
Engineering, Electrical & Electronic
Jinke Lin et al.
Summary: This study focuses on the multi-label classification of fundus images, proposing two new multi-label classification networks based on graph convolutional network and self-supervised learning to enhance classification performance and generalization ability by capturing relevant information and learning unannotated data.
IEEE SIGNAL PROCESSING LETTERS
(2021)
Article
Computer Science, Information Systems
Xusong Chen et al.
Summary: The paper focuses on learning and fusing multiple kinds of user interest representations, including latent representation, item-level representation, neighbor-assisted representation, and category-level representation. The proposed method is validated on two real-world video recommendation datasets, demonstrating significant performance improvement over existing state-of-the-art techniques.
IEEE TRANSACTIONS ON MULTIMEDIA
(2021)
Article
Computer Science, Information Systems
Wei Liu et al.
MULTIMEDIA TOOLS AND APPLICATIONS
(2020)
Proceedings Paper
Computer Science, Information Systems
Jiayi Xie et al.
WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020)
(2020)
Article
Engineering, Electrical & Electronic
Helin Wang et al.
IEEE SIGNAL PROCESSING LETTERS
(2020)
Article
Computer Science, Artificial Intelligence
Jia Zhang et al.
PATTERN RECOGNITION
(2019)
Article
Computer Science, Artificial Intelligence
Yue Zhu et al.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2018)
Proceedings Paper
Computer Science, Artificial Intelligence
Du Tran et al.
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)
(2015)