4.7 Article

A 4.29nJ/pixel Stereo Depth Coprocessor With Pixel Level Pipeline and Region Optimized Semi-Global Matching for IoT Application

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSI.2021.3100071

Keywords

Field programmable gate arrays; Computer architecture; Power dissipation; Pipelines; Stereo vision; Internet of Things; Hardware; Regional optimization; stereo vision; semi-global matching; real-time; FPGA

Funding

  1. Shenzhen Science and Technology Innovation Commission [JSGG20200102162401765]

This paper proposes a hardware-oriented SGM algorithm with a pixel-level pipeline and region-optimized cost aggregation for high-speed processing and low hardware-resource usage. The algorithm is demonstrated on a low-cost Xilinx Spartan-7 FPGA and an advanced Stratix-V FPGA for VGA depth estimation, achieving high processing speed and energy efficiency on both.
The semi-global matching (SGM) algorithm is a well-known depth-estimation method in stereo vision because it generates dense and robust disparity maps. However, its computational complexity makes it difficult to meet the real-time processing and low power dissipation requirements of Internet-of-Things (IoT) applications. In this paper, we propose a hardware-oriented SGM algorithm with a pixel-level pipeline and region-optimized cost aggregation for high-speed processing and low hardware-resource usage. First, the matching costs in a region are integrated with an optimization strategy to significantly reduce memory usage and improve the processing speed of the cost aggregation. Then, a two-layer parallel two-stage pipeline (TPTP) architecture, which enables pixel-level processing, is designed to compute the aggregation along two directions (0 degrees and 135 degrees) and thereby relieve the crucial computational bottleneck of the SGM algorithm. Finally, the architecture is demonstrated on a low-cost Xilinx Spartan-7 FPGA and an advanced Stratix-V FPGA for VGA (640 x 480) depth estimation. The experimental results show that the proposed architecture preserves accuracy despite its compact hardware design. The pixel-level pipeline architecture enables a processing speed of 355 frames per second (fps) at 109 MHz on the Spartan-7 FPGA and 508 fps at 156 MHz on the Stratix-V FPGA. The coprocessor achieves an energy efficiency of 4.74 nJ/pixel at a power dissipation of 517 mW on the Spartan-7 and 4.29 nJ/pixel at a power dissipation of 669 mW on the Stratix-V.
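The reported figures can be cross-checked with a little arithmetic. The sketch below is not from the paper: it assumes that energy per pixel is simply power dissipation divided by pixel throughput (fps times 640 x 480) and that blanking intervals are ignored. The names `pixel_rate`, `energy_per_pixel_nj`, and `summarize` are illustrative helpers.

```python
# Cross-check of the abstract's throughput and energy numbers.
# Assumption (not stated in the source): energy/pixel = power / (fps * pixels per frame).

WIDTH, HEIGHT = 640, 480            # VGA resolution given in the abstract
PIXELS_PER_FRAME = WIDTH * HEIGHT   # 307,200 pixels


def pixel_rate(fps):
    """Pixels processed per second at the given frame rate."""
    return fps * PIXELS_PER_FRAME


def energy_per_pixel_nj(power_mw, fps):
    """Energy per pixel in nanojoules, from power in milliwatts."""
    joules_per_pixel = (power_mw * 1e-3) / pixel_rate(fps)
    return joules_per_pixel * 1e9


def summarize(fps, clock_mhz, power_mw):
    """Return pixels per clock cycle and nJ/pixel for one device."""
    return {
        "pixels_per_clock": pixel_rate(fps) / (clock_mhz * 1e6),
        "nj_per_pixel": energy_per_pixel_nj(power_mw, fps),
    }


# Values copied from the abstract.
reported = {
    "Spartan-7": {"fps": 355, "clock_mhz": 109, "power_mw": 517},
    "Stratix-V": {"fps": 508, "clock_mhz": 156, "power_mw": 669},
}

for device, figures in reported.items():
    result = summarize(**figures)
    print(
        f"{device}: {result['pixels_per_clock']:.3f} pixels/clock, "
        f"{result['nj_per_pixel']:.2f} nJ/pixel"
    )
```

Under these assumptions both devices come out at roughly one pixel per clock cycle, which is what a pixel-level pipeline would suggest, and the computed energy values round to the 4.74 nJ/pixel and 4.29 nJ/pixel quoted in the abstract.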