4.8 Article

An attention-based hybrid deep learning approach for Bengali video captioning

Related references

Note: Only some of the references are listed here; download the original article for the complete reference information.
Article Computer Science, Artificial Intelligence

An attention based dual learning approach for video captioning

Wanting Ji et al.

Summary: Video captioning is an important task in multimedia processing, and traditional approaches only utilize visual information to generate captions. This paper proposes a novel attention based dual learning approach (ADL) that improves the quality of video captions by minimizing the differences between generated and raw videos.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Information Systems

Robust regularization for single image dehazing

Usman Ali et al.

Summary: This paper proposes an improved image dehazing method by optimizing a nonconvex energy function that leverages structural information from the transmission map and guidance. The proposed method provides robust regularization and achieves high-quality haze-free images.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2022)

Proceedings Paper Computer Science, Artificial Intelligence

SWINBERT: End-to-End Transformers with Sparse Attention for Video Captioning

Kevin Lin et al.

Summary: This paper presents SWINBERT, an end-to-end transformer-based model for video captioning, which directly takes video frame patches as inputs and outputs natural language descriptions. It shows that video captioning can benefit significantly from more densely sampled video frames and proposes adaptively learning a sparse attention mask for better long-range video sequence modeling. Extensive experiments demonstrate the performance improvements of SWINBERT over previous methods and the effectiveness of the learned attention masks.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Article Computer Science, Artificial Intelligence

Towards achieving a delicate blending between rule-based translator and neural machine translator

Md Adnanul Islam et al.

Summary: Popular translators excel in translating among high-resource languages but may make mistakes when translating low-resource languages. The study aims to improve translation from Bengali to English by exploring different blending approaches. Rigorous experimentation is conducted to compare the performance of different translation approaches.

NEURAL COMPUTING & APPLICATIONS (2021)

Article Computer Science, Artificial Intelligence

Human action recognition using two-stream attention based LSTM networks

Cheng Dai et al.

APPLIED SOFT COMPUTING (2020)

Article Computer Science, Information Systems

A new hybrid deep learning model for human action recognition

Neziha Jaouedi et al.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2020)

Article Automation & Control Systems

Describing Video With Attention-Based Bidirectional LSTM

Yi Bin et al.

IEEE TRANSACTIONS ON CYBERNETICS (2019)

Article Computer Science, Artificial Intelligence

Video Captioning by Adversarial LSTM

Yang Yang et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2018)

Proceedings Paper Computer Science, Information Systems

Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction

Marco Basaldella et al.

DIGITAL LIBRARIES AND MULTIMEDIA ARCHIVES, IRCDL 2018 (2018)

Article Computer Science, Information Systems

Video Captioning With Attention-Based LSTM and Semantic Consistency

Lianli Gao et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2017)