Journal
IEEE COMMUNICATIONS LETTERS
Volume 22, Issue 10, Pages 2004-2007
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LCOMM.2018.2866566
Keywords
DNA data storage; constrained coding; channel coding
Funding
- SUTD-MIT IDC research grant
- Singapore Ministry of Education Academic Research Fund Tier 2 [MOE2016-T2-2-054]
- SUTD-ZJU grant [ZJURP1500102]
- NSFC [61750110529]
We propose a coding method that transforms binary sequences into DNA base sequences (codewords), i.e., sequences of the symbols A, T, C, and G, that satisfy the following two properties: 1) run-length constraint: the maximum run-length of each symbol in each codeword is at most three; and 2) GC-content constraint: the GC-content of each codeword is close to 0.5, say between 0.4 and 0.6. The proposed coding scheme is motivated by the problem of designing codes for DNA-based data storage systems, where binary digital data is stored in synthetic DNA base sequences. Existing schemes in the literature either achieve code rates no greater than 1.78 bits per nucleotide or suffer from severe error propagation. Our method achieves a rate of 1.9 bits per DNA base with low encoding/decoding complexity and limited error propagation.
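The two constraints stated in the abstract can be illustrated with a small checker. This is a minimal sketch of the constraints only, not the paper's encoder or decoder. The function names (`max_run_length`, `gc_content`, `satisfies_constraints`) are hypothetical, and the default bounds (run-length at most 3, GC-content between 0.4 and 0.6) are taken from the abstract's example values.

```python
def max_run_length(seq: str) -> int:
    """Return the length of the longest run of identical consecutive symbols."""
    longest = 0
    current = 0
    previous = None
    for base in seq:
        # Extend the current run if the symbol repeats, otherwise start a new run.
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


def gc_content(seq: str) -> float:
    """Return the fraction of symbols in seq that are G or C."""
    if not seq:
        raise ValueError("GC-content is undefined for an empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies_constraints(seq: str, max_run: int = 3,
                          gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Check the run-length and GC-content constraints on one codeword.

    The default thresholds are the example values from the abstract; they are
    assumptions for illustration, not parameters fixed by this sketch.
    """
    if set(seq) - set("ATCG"):
        raise ValueError("codeword must contain only A, T, C, and G")
    return max_run_length(seq) <= max_run and gc_low <= gc_content(seq) <= gc_high


if __name__ == "__main__":
    for codeword in ("ACGTACGTAC", "AAAACGCGCG", "AAATAAATAT"):
        print(codeword, satisfies_constraints(codeword))
```

Here `ACGTACGTAC` passes both checks, `AAAACGCGCG` fails the run-length constraint (a run of four A's), and `AAATAAATAT` fails the GC-content constraint (no G or C at all). For context, an unconstrained four-symbol alphabet carries at most 2 bits per base, so the reported rate of 1.9 bits per base is 95% of that upper bound.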