Article

Highly reliable and efficient encoding systems for hexadecimal polypeptide-based data storage

Journal

FUNDAMENTAL RESEARCH
Volume 3, Issue 2, Pages 298-304

Publisher

KEAI PUBLISHING LTD
DOI: 10.1016/j.fmre.2021.11.030

Keywords

Biomaterial; Polypeptide; Data storage; Hexadecimal; Encoding system

Two reliable and highly efficient encoding systems, RABSR and RAHRSR, enable high-density information storage in polypeptides, with data compression, error correction, elimination of homopolymers, and pseudo-randomized encryption. Without arithmetic compression and error correction, the coding efficiency for audio, picture, and text data was 3.20, 3.12, and 3.53 Bits/AA for the RABSR system, and 4.89, 4.80, and 6.84 Bits/AA for the RAHRSR system, respectively. With redundancy for error correction and arithmetic compression, the coding efficiency of the RAHRSR system increased further to 7.24, 7.11, and 9.82 Bits/AA. These hexadecimal polypeptide-based systems may thus offer a new approach to highly reliable and highly efficient data storage.
Polypeptides, which consist of amino acid (AA) sequences, are suitable for high-density information storage. However, the lack of encoding systems that accommodate the characteristics of polypeptide synthesis, storage and sequencing impedes the application of polypeptides to large-scale digital data storage. To address this, two reliable and highly efficient encoding systems, RaptorQ-Arithmetic-Base64-Shuffle-RS (RABSR) and RaptorQ-Arithmetic-Huffman-Rotary-Shuffle-RS (RAHRSR), were developed for polypeptide data storage. Both systems compress data, correct for the loss of AA chains, correct errors within AA chains, eliminate homopolymers, and provide pseudo-randomized encryption. Without arithmetic compression and error correction, the coding efficiency of the RABSR system for audio, picture and text data was 3.20, 3.12 and 3.53 Bits/AA, respectively, while that of the RAHRSR system reached 4.89, 4.80 and 6.84 Bits/AA, respectively. With redundancy added for error correction and arithmetic compression applied to reduce redundancy, the coding efficiency of the RABSR system for audio, picture and text data was 4.43, 4.36 and 5.22 Bits/AA, respectively, and that of the RAHRSR system increased further to 7.24, 7.11 and 9.82 Bits/AA, respectively. The developed hexadecimal polypeptide-based systems may therefore offer a new approach to highly reliable and highly efficient data storage.
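
To make the Bits/AA figures above concrete, here is a minimal Python sketch of how coding efficiency could be computed for a plain hexadecimal mapping. It is not the RABSR or RAHRSR pipeline. It assumes that "hexadecimal" means an alphabet of 16 amino acids, one per 4-bit value; the letters in `ALPHABET` and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical 16-letter amino-acid alphabet (one letter per 4-bit value).
# The actual amino acids used by the paper are not stated in this abstract.
ALPHABET = "ACDEFGHIKLMNPQRS"


def encode_hex(data: bytes) -> str:
    """Map each byte to two residues: high nibble first, then low nibble."""
    return "".join(ALPHABET[b >> 4] + ALPHABET[b & 0x0F] for b in data)


def decode_hex(chain: str) -> bytes:
    """Invert encode_hex: pair up residues and rebuild each byte."""
    idx = [ALPHABET.index(c) for c in chain]
    return bytes((hi << 4) | lo for hi, lo in zip(idx[0::2], idx[1::2]))


def bits_per_aa(n_source_bytes: int, n_residues: int) -> float:
    """Coding efficiency: source bits stored per amino-acid residue."""
    return 8 * n_source_bytes / n_residues


def has_homopolymer(chain: str) -> bool:
    """True if any residue is immediately repeated."""
    return any(a == b for a, b in zip(chain, chain[1:]))


if __name__ == "__main__":
    payload = b"hello"
    chain = encode_hex(payload)
    assert decode_hex(chain) == payload
    print(chain, bits_per_aa(len(payload), len(chain)))
```

Under this assumption, a direct mapping stores exactly 4 Bits/AA, so reported values above 4 would reflect compression of the source data before mapping, and values below 4 would reflect overhead added by the encoding. The `has_homopolymer` check also shows why a naive mapping is not enough on its own: a byte such as `0x00` maps to a repeated residue, which the systems described above are designed to eliminate.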