4.6 Article

Optimizing Multimodal Scene Recognition through Mutual Information-Based Feature Selection in Deep Learning Models

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

Multiscale Feature Extraction and Fusion of Image and Text in VQA

Siyu Lu et al.

Summary: A Visual Question Answering (VQA) system aims to extract the information in an image that is relevant to a given question in order to answer that question accurately. It has a wide range of applications in visual assistance, automated security surveillance, and intelligent human-robot interaction. However, the accuracy of VQA has been unsatisfactory, mainly because image features represent scene and object information poorly and text information is not fully utilized. This paper proposes multi-scale feature extraction and fusion methods to address these challenges. Experimental results demonstrate that incorporating these techniques improves the performance of the VQA model.

INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS (2023)

Article Chemistry, Analytical

A Multi-Attention Approach for Person Re-Identification Using Deep Learning

Shimaa Saber et al.

Summary: This paper presents a novel approach for person re-identification by introducing a multi-part feature network that combines the position attention module (PAM) and the efficient channel attention (ECA). The proposed method outperforms existing state-of-the-art methods on publicly available person re-ID datasets. The results demonstrate the effectiveness and potential of the suggested method in computer vision applications.

SENSORS (2023)

Article Social Sciences, Interdisciplinary

Adapting Feature Selection Algorithms for the Classification of Chinese Texts

Xuan Liu et al.

Summary: This study proposes three improved feature selection algorithms for Chinese texts and tests their performance on different types of corpora. The experimental results demonstrate that these algorithms can improve the accuracy of text classification.

SYSTEMS (2023)

Article Computer Science, Information Systems

Multi-Modal CNN Features Fusion for Emotion Recognition: A Modified Xception Model

H. M. Shahzad et al.

Summary: Facial expression recognition (FER) is advancing with the help of multimodal approaches that incorporate data from various modalities, such as voice expressions. This paper proposes a novel multimodal methodology based on deep learning to effectively recognize facial expressions under masked conditions. Experimental evaluations demonstrate that the proposed approach outperforms conventional unimodal methods, achieving an accuracy of 79.81% compared to the unimodal technique's 68.81% accuracy.

IEEE ACCESS (2023)

Article Computer Science, Information Systems

Deep Learning-Based Object Detection and Scene Perception under Bad Weather Conditions

Teena Sharma et al.

Summary: This study presents a method for intelligent traffic monitoring using the YOLOv5 model, which allows real-time identification of vehicles, pedestrians, and traffic signals in different weather conditions. The results show that the proposed approach can accurately recognize and track objects on the road in various situations.

ELECTRONICS (2022)

Article Chemistry, Analytical

S-MAT: Semantic-Driven Masked Attention Transformer for Multi-Label Aerial Image Classification

Hongjun Wu et al.

Summary: Multi-label aerial scene image classification is a challenging problem in remote sensing. This paper proposes a semantic-driven masked attention transformer (S-MAT) method that accurately captures label dependencies and improves classification performance.

SENSORS (2022)

Article Computer Science, Information Systems

Graph convolutional network with triplet attention learning for person re-identification

Shimaa Saber et al.

Summary: Person re-identification matches individuals across multiple non-overlapping cameras and has been successfully applied in computer vision applications. To address issues such as occlusion, illumination changes, and pose changes, a new graph convolutional network with attention modules is proposed. Experimental results demonstrate the high generalization ability and superior performance of the proposed method.

INFORMATION SCIENCES (2022)

Article Computer Science, Software Engineering

Joint Computational Design of Workspaces and Workplans

Yongqi Zhang et al.

Summary: The study proposes an automatic approach to jointly design workspaces and workplans by optimizing performance and workload factors, generating Pareto-optimal design solutions for different work scenarios. Evaluation experiments validate the efficacy of the approach in synthesizing effective workspaces and workplans.

ACM TRANSACTIONS ON GRAPHICS (2021)

Article Environmental Sciences

Vision Transformers for Remote Sensing Image Classification

Yakoub Bazi et al.

Summary: This paper proposes a remote-sensing scene-classification method based on vision transformers, which utilize multihead attention mechanisms to establish long-range contextual relationships between pixels in images. The approach involves dividing images into patches, converting them into sequences, and applying data augmentation techniques for improved classification performance. The study also demonstrates the efficacy of compressing the network by pruning half of the layers while maintaining competitive classification accuracies.

REMOTE SENSING (2021)

Article Geography, Physical

Aerial scene understanding in the wild: Multi-scene recognition via prototype-based memory networks

Yuansheng Hua et al.

Summary: This paper proposes a method for recognizing multiple scenes in a single image by leveraging prototype learning, external memory, and a multi-head attention mechanism. Experimental results demonstrate the effectiveness of this approach in aerial scene recognition.

ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING (2021)

Article Engineering, Electrical & Electronic

CSDS: End-to-End Aerial Scenes Classification With Depthwise Separable Convolution and an Attention Mechanism

Xinyu Wang et al.

Summary: This article proposes a channel-spatial attention mechanism based on a depthwise separable convolution (CSDS) network for aerial scene classification. Experimental results on three public datasets show that the CSDS network achieves comparable performance to other state-of-the-art methods. Visualization of feature extraction results and ablation experiments demonstrate the powerful feature learning and representation capabilities of the proposed CSDS network.

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING (2021)

Article Computer Science, Information Systems

Joint Cross-Modal and Unimodal Features for RGB-D Salient Object Detection

Nianchang Huang et al.

Summary: The study proposes a novel RGB-D salient object detection model that effectively combines cross-modal features from RGB-D images and unimodal features from RGB and depth images, achieving significant performance improvement on four benchmark datasets compared to state-of-the-art methods.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Information Systems

Intelligent Scene Recognition Based on Deep Learning

Sixian Wang et al.

Summary: This study proposes a real-time scene recognition model based on long short-term memory classifiers that uses lightweight sensors and a lower sampling rate. A two-stage setting and real-time processing techniques significantly reduce recognition latency.

IEEE ACCESS (2021)

Article Computer Science, Artificial Intelligence

Deep Feature Fusion for High-Resolution Aerial Scene Classification

Heng Wang et al.

NEURAL PROCESSING LETTERS (2020)

Article Environmental Sciences

Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis

Rafael Pires de Lima et al.

REMOTE SENSING (2020)

Article Computer Science, Artificial Intelligence

Deep Learning Based Application for Indoor Scene Recognition

Mouna Afif et al.

NEURAL PROCESSING LETTERS (2020)

Article Computer Science, Artificial Intelligence

Scene recognition: A comprehensive survey

Lin Xie et al.

PATTERN RECOGNITION (2020)

Article Chemistry, Multidisciplinary

Aerial Scene Classification through Fine-Tuning with Adaptive Learning Rates and Label Smoothing

Biserka Petrovska et al.

APPLIED SCIENCES-BASEL (2020)

Proceedings Paper Computer Science, Information Systems

Vehicle Detection from Multi-modal Aerial Imagery using YOLOv3 with Mid-level Fusion

Mayur Dhanaraj et al.

BIG DATA II: LEARNING, ANALYTICS, AND APPLICATIONS (2020)

Article Computer Science, Information Systems

Multi-modal multi-concept-based deep neural network for automatic image annotation

Haijiao Xu et al.

MULTIMEDIA TOOLS AND APPLICATIONS (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Scene Understanding: A Survey to See the World at a Single Glance

Prajakta Ganesh Pawar et al.

2019 2ND INTERNATIONAL CONFERENCE ON INTELLIGENT COMMUNICATION AND COMPUTATIONAL TECHNIQUES (ICCT) (2019)

Article Computer Science, Artificial Intelligence

Mutual information-based feature selection for multilabel classification

Gauthier Doquire et al.

NEUROCOMPUTING (2013)

Article Computer Science, Artificial Intelligence

Measuring relevance between discrete and continuous features based on neighborhood mutual information

Qinghua Hu et al.

EXPERT SYSTEMS WITH APPLICATIONS (2011)

Article Computer Science, Information Systems

Registration Based on Scene Recognition and Natural Features Tracking Techniques for Wide-Area Augmented Reality Systems

T. Guan et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2009)

Review Computer Science, Artificial Intelligence

A survey of advances in vision-based human motion capture and analysis

Thomas B. Moeslund et al.

COMPUTER VISION AND IMAGE UNDERSTANDING (2006)