期刊
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
卷 27, 期 1, 页码 29-46出版社
WILEY
DOI: 10.1002/cpe.3193
关键词
LDPC; SDR; CUDA; GPU
资金
- National Natural Science Foundation of China [61125201]
Because layered low-density parity-check (LDPC) decoding algorithm was proposed, one can exploit the diversity gain to achieve performance comparable to the traditional two-phase message passing (TPMP) decoding but with about twice faster decoding convergence compared to TPMP. In order to reduce the decoding time of layered LDPC decoder, a graphics processing unit (GPU) is exploited as the modem processor so that the decoding procedure can be processed in parallel using numerous threads in the GPU. In this paper, we present the parallel algorithms and efficient implementations on the GPU for two different layered message passing schemes, the row-layered and column-layered decoding. In the experiments, the quasicyclic LDPC codes for WiFi (802.11n) and WiMAX (802.16e) are decoded by the proposed layered LDPC decoders. The experimental results show that our decoder has good bit error ratio (BER) performance comparable to TPMP decoder. The peak throughput is 712 Mbps, which is about two orders of magnitude faster than that of CPU implementation and comparable to the dedicated hardware solutions. Compared to the existing fastest GPU-based implementation, the presented decoder can achieve a performance improvement of 2.3 times. Copyright (C) 2013 John Wiley & Sons, Ltd.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据