4.6 Article

An Early-Life NAND Flash Endurance Prediction System

Journal

IEEE ACCESS
Volume 9, Pages 148635-148649

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3124604

Keywords

Logic gates; Ash; Error correction codes; Threshold voltage; Voltage control; Three-dimensional displays; Reliability; NAND flash; solid state drives; endurance; retention; machine learning; gradient boosting; classification; prediction

This research demonstrates that a sector's true endurance can be accurately predicted by combining its location within the device with measurements taken at the beginning of life. Optimized machine learning classification models can prevent ECC failures and data loss, while also achieving significant endurance extensions.
NAND flash memory, ubiquitous in today's world of smartphones, solid state drives (SSDs), and cloud storage, has a number of well-known reliability problems. Data stored in NAND contains bit errors, which require the use of error correcting codes (ECCs). The raw bit error rate (RBER) increases with program-erase (P-E) cycling, and the number of P-E cycles a device can withstand before the RBER exceeds the ECC capability is called its endurance. ECC operates on data stored in a sector of NAND, and endurance varies widely between sectors within a device and across devices, which results in excessively conservative endurance specifications. This research shows, for the first time, that a sector's true endurance can be predicted with remarkable accuracy from a combination of the sector's location within the device and measurements taken at the very beginning of life. Real-world data is gathered on millions of NAND sectors using a custom-built test platform. Optimised machine learning classification models are built from the raw data to predict whether a sector will pass or fail against a fixed ECC threshold after a target P-E cycling level has been reached. A novel technique is demonstrated that uses different ECC thresholds for model training and testing, which allows the models to be tuned so that they never misclassify samples that would fail. This eliminates ECC failures and data loss, allowing simpler, less expensive ECC schemes to be used for modern NAND devices. It also enables significant endurance extensions to be achieved.
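
The idea of training and testing against different ECC thresholds can be illustrated with a small sketch. This is not the authors' method: the paper uses gradient boosting classifiers on real measurements, whereas the code below substitutes a one-feature cutoff rule on synthetic data. Every name, constant, and distribution here (`make_sectors`, `ECC_TRAIN`, `ECC_TEST`, the growth factor) is a hypothetical assumption chosen only to show why labelling the training data with a stricter threshold makes the classifier more conservative about predicting a pass.

```python
import random

def make_sectors(n, rng):
    """Synthetic (early_errors, late_errors) pairs. Purely illustrative data."""
    sectors = []
    for _ in range(n):
        early = rng.uniform(1.0, 10.0)               # errors measured at beginning of life
        growth = rng.uniform(4.0, 6.0)               # hypothetical wear-out factor
        late = early * growth + rng.gauss(0.0, 2.0)  # errors after the target P-E cycles
        sectors.append((early, late))
    return sectors

def fit_cutoff(sectors, train_threshold):
    """Return the smallest early-life error count among sectors that fail the
    training threshold; anything strictly below it is predicted to pass."""
    failing = [early for early, late in sectors if late > train_threshold]
    return min(failing, default=float("inf"))

def predict_pass(early, cutoff):
    return early < cutoff

def false_passes(sectors, cutoff, threshold):
    """Count sectors predicted to pass whose late-life errors exceed the threshold."""
    return sum(1 for early, late in sectors
               if predict_pass(early, cutoff) and late > threshold)

rng = random.Random(0)
train, test = make_sectors(2000, rng), make_sectors(2000, rng)

ECC_TEST = 40.0   # hypothetical true ECC capability (errors per sector)
ECC_TRAIN = 32.0  # hypothetical stricter threshold, used only to label training data

cutoff_plain = fit_cutoff(train, ECC_TEST)     # trained and tested on the same threshold
cutoff_guarded = fit_cutoff(train, ECC_TRAIN)  # trained on the stricter threshold

print("false passes, same threshold:   ", false_passes(test, cutoff_plain, ECC_TEST))
print("false passes, stricter training:", false_passes(test, cutoff_guarded, ECC_TEST))
```

Because every sector that fails `ECC_TEST` also fails `ECC_TRAIN`, the guarded cutoff can never be higher than the plain one, so it can only reduce the number of failing sectors that are wrongly predicted to pass. The cost is that more healthy sectors are rejected, which is the trade-off a real model would need to tune.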
