Article

STCN-Net: A Novel Multi-Feature Stream Fusion Visibility Estimation Approach

Journal

IEEE ACCESS
卷 10, 期 -, 页码 120329-120342

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3218456

Keywords

Visibility estimation; CNN; Swin-T; multi-feature stream; DDT matrix


This paper proposes a novel end-to-end framework named STCN-Net for visibility estimation, combining engineered features and learned features to achieve higher accuracy. The method builds a new 3D multi-feature stream matrix, called DDT, and integrates a CNN and a Transformer for visibility estimation. Experimental results show that the method outperforms classical methods on two datasets.
Low visibility frequently causes serious traffic accidents worldwide, and although extensive work has been devoted to visibility estimation in meteorology, it remains a difficult problem. Deep learning-based visibility estimation methods suffer from low accuracy because they lack features specific to foggy images, while physical model-based methods are only applicable to certain scenes because they require additional auxiliary parameters. Therefore, this paper proposes a novel end-to-end framework named STCN-Net for visibility estimation, which combines engineered features and learned features to achieve higher accuracy. Specifically, a novel 3D multi-feature stream matrix, named DDT, is designed for visibility estimation; it consists of a transmittance matrix, a dark channel matrix, and a depth matrix. Unlike traditional deep learning methods that use only convolutional neural networks (CNN) to process the input data or images, our method combines a CNN and a Transformer. In STCN-Net, the Swin-Transformer (Swin-T) module takes the original image as input, while the CNN module takes the DDT matrix as input. Moreover, to integrate the different feature information from the CNN and Swin-T branches, we embed a Coordinate Attention (CA) module in STCN-Net. Finally, two visibility datasets, Visibility Image Dataset I (VID I) and Visibility Image Dataset II (VID II), were constructed for evaluation, where VID I is a real-scene visibility dataset and VID II is a synthetic visibility dataset. The experimental results show that our method outperforms classical methods on both datasets, achieving 2.1% higher accuracy than the runner-up on VID I and 0.5% higher on VID II.
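The abstract does not include code, but as a rough illustration of how the three DDT channels (depth, dark channel, transmittance) could be assembled from a foggy RGB image, the sketch below uses the standard dark channel prior and the common transmission estimate t = 1 - omega * dark(I / A). The patch size, omega, the atmospheric-light heuristic, and the depth source (passed in here, e.g. from an off-the-shelf monocular depth estimator) are illustrative assumptions rather than the paper's exact procedure, and helper names such as build_ddt are hypothetical.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel prior: per-pixel minimum over RGB, then a local minimum filter."""
    min_rgb = img.min(axis=2)                      # HxW, minimum over colour channels
    return minimum_filter(min_rgb, size=patch)     # HxW, minimum over a patch x patch window

def transmittance(img, patch=15, omega=0.95):
    """Coarse transmission map t = 1 - omega * dark_channel(I / A)."""
    dc = dark_channel(img, patch)
    # Atmospheric light A: mean colour of the brightest ~0.1% dark-channel pixels (common heuristic).
    flat = dc.ravel()
    idx = flat.argsort()[-max(1, flat.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0).clip(min=1e-6)
    return 1.0 - omega * dark_channel(img / A, patch)

def build_ddt(img, depth):
    """Stack depth, dark-channel and transmittance maps into a 3-channel DDT matrix.

    img:   HxWx3 float array in [0, 1] (foggy RGB image)
    depth: HxW float array from an external depth estimator (assumed given here)
    """
    dc = dark_channel(img)
    t = transmittance(img)
    return np.stack([depth, dc, t], axis=-1).astype(np.float32)
```

Under the architecture described above, such a DDT matrix would be fed to the CNN branch of STCN-Net, while the original RGB image goes to the Swin-T branch, and the two feature streams are then fused via the Coordinate Attention module.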
