4.8 Article

Monocular 3-D Object Detection Based on Depth-Guided Local Convolution for Smart Payment in D2D Systems

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 10, Issue 3, Pages 2245-2254

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2021.3128440

Keywords

Depth-guided local convolution; joint training; monocular 3-D object detection; smart payment; soft-non-maximum suppression (soft-NMS)


This article proposes a monocular 3-D object detection method based on depth-guided local convolution. Convolution kernels guided by the depth image are applied locally to a single RGB image, combining information from the RGB and depth modalities. The kernels are adjusted adaptively according to multiscale input information so that they capture target objects of different scales, which improves 3-D object detection performance. In addition, a soft non-maximum suppression (soft-NMS) algorithm selects the best prediction box, and the depth estimation network and the 3-D object detection network are trained jointly to improve accuracy.
3-D object detection on mobile phones in device-to-device (D2D) systems provides a new smart payment tool for the next generation of fintech, one that is more flexible and efficient than the traditional barcode. In this article, we propose a monocular 3-D object detection method based on depth-guided local convolution. The method combines information from the RGB and depth modalities by applying convolution kernels guided by the depth image locally to a single RGB image. The convolution kernel is adjusted adaptively according to multiscale input information to capture target objects of different scales, which improves the performance of 3-D object detection. In addition, we use the soft-non-maximum suppression (soft-NMS) algorithm instead of traditional non-maximum suppression to select the best prediction box. To further improve detection accuracy, the depth estimation network and the 3-D object detection network are trained jointly, so that the two networks constrain each other.
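
The difference between soft-NMS and traditional non-maximum suppression can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes 2-D axis-aligned boxes rather than 3-D ones, uses the commonly described Gaussian score decay, and the names `iou`, `soft_nms`, `sigma`, and `score_threshold` are hypothetical choices made for illustration. Where traditional NMS discards every box that overlaps the current best box beyond a threshold, this version only lowers the scores of overlapping boxes, so a heavily overlapped box can still be retained with reduced confidence.

```python
import math


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian soft-NMS (illustrative sketch, 2-D boxes assumed).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the highest-scoring box among those still under consideration.
        best = max(remaining)
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))

        # Decay the other scores by their overlap with the selected box
        # instead of deleting them outright, as hard NMS would.
        decayed = []
        for s, i in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            s = s * math.exp(-(overlap * overlap) / sigma)
            if s >= score_threshold:  # drop only boxes whose score became negligible
                decayed.append((s, i))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 4))
```

In this toy input the second box overlaps the first heavily, so its score is reduced and it is selected last, while the third box does not overlap anything and keeps its original score.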

