
ECC and Signal Processing Technology for Solid State Drives and Multi-bit per cell NAND Flash Memories (2nd Edition)

  • ID: 1196423
  • Report
  • January 2012
  • Region: Global
  • 176 Pages
  • Forward Insights
In December 2011, it was widely reported that Apple had acquired Anobit Technologies for half a billion dollars. What prompted Apple to spend that kind of money on a five-year-old startup? For Apple, the biggest consumer of NAND flash memories, this was clearly a strategic acquisition. The tie-up follows the less widely known acquisition of Storage Genetics by Micron Technology in 2010. Both Anobit and Storage Genetics were developing advanced ECC and signal processing technologies.

Bit errors increase as NAND flash memory scales below 2xnm process technology and transitions to 3-bit per cell architectures, so traditional error correction codes such as BCH, Reed-Solomon (RS) and Hamming codes will no longer be sufficient. These codes incur growing overhead, in both coding redundancy and read latency, as the number of errors to be corrected increases. In addition, the number of electrons stored in the memory cell decreases with each generation of flash memory, reducing the signal-to-noise ratio and requiring enhanced sensing techniques.
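
The redundancy overhead described above can be illustrated with a rough calculation. The sketch below is not taken from the report; it assumes a binary BCH code over GF(2^m), for which m·t is a standard upper bound on the number of parity bits needed to correct t errors, and it uses hypothetical helper names (`bch_parity_bits`, `overhead_table`). It only shows how the code rate falls as t grows, and says nothing about read latency.

```python
def bch_parity_bits(data_bits: int, t: int) -> int:
    """Upper bound (m * t) on parity bits for a binary BCH code correcting t errors.

    Assumption: the code is a shortened BCH code over GF(2^m), where m is the
    smallest value whose natural length 2^m - 1 can hold data plus parity.
    """
    m = 1
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m * t


def overhead_table(data_bytes: int, t_values):
    """Return (t, parity_bits, code_rate) for each correction strength t."""
    k = data_bytes * 8
    rows = []
    for t in t_values:
        p = bch_parity_bits(k, t)
        rows.append((t, p, k / (k + p)))
    return rows


if __name__ == "__main__":
    # Illustrative only: a 1024-byte codeword payload at several strengths.
    for t, parity, rate in overhead_table(1024, [8, 24, 40, 72]):
        print(f"t={t:3d}  parity={parity:5d} bits  code rate={rate:.4f}")
```

Under these assumptions the parity grows linearly with t, so each additional correctable bit costs the same amount of spare area while the raw error rate it must cover keeps rising.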

Digital signal processing (DSP) technology has been employed in the magnetic recording industry since the early 1990s, when partial-response maximum-likelihood (PRML) technology was commercialized. DSP technology is now being deployed in 3-bit per cell NAND flash memories, and NAND flash manufacturers and a handful of startups are making a concerted effort to use it to improve the endurance and performance of next-generation NAND flash memories and solid state drives. Signal processing technology will be essential for the continued scaling of NAND flash memories.

This research report examines the current state of ECC techniques and explores the technology, roadmap, market and cost of flash signal processing, as well as the key players and startups in the space.


List of Figures
List of Tables
Executive Summary
Hard Disk Drive Trends
The Read Channel
Peak-to-Peak Detection
Partial Response Maximum Likelihood (PRML)
Error Correction
NAND Flash Memory Overview
NAND Flash Memory Introduction
NAND Flash Architecture
NAND Flash Memory Reliability
Random Telegraph Noise
Program Noise
Dopant Fluctuation, Line Edge Roughness
Bit Errors
Program Disturb
Read Disturb
Charge Leakage
Shannon Limit
Error Correction Codes
Types of Codes
Block Codes
Convolutional Codes
Concatenated Codes
Hamming Code
BCH Code
Reed-Solomon Code
BCH and Reed-Solomon Considerations
Convolutional Codes
Trellis Coded Modulation
Concatenated TCM-BCH Coding
Turbo Codes
Low Density Parity Check (LDPC)
Representations for LDPC codes
Regular and Irregular LDPC codes
Constructing LDPC codes
Performance & Complexity
Decoding LDPC codes
Hard-decision Decoding
Soft-decision Decoding
LDPC in NAND Flash Memories
Signal Processing for Flash Memories
ECC for Flash Memories
Adaptive/Dynamic ECC Scheme
BCH and Shannon Limit
SLC Soft Detection
Test Mode Sequences
Soft Read
MLC Soft Detection
Sensing Schemes
Analog Sensing
Moving Reference
MLC Programming with ECC
Floating Gate Coupling Compensation
Fractional Reference Voltage Threshold
State Ordering
Statistics Collection
Inter-cell Interference Cancellation
Adaptive Program and Read
Data Scrambling
Asymmetric Coding
NAND Capacity Gains
Endurance Improvement
Controller Level Techniques
Competitive Landscape
Competencies of Flash Signal Processing Players
Market for Flash Signal Processing Technology
Consumer low end NAND applications (Removable storage)
Embedded Consumer NAND applications
Consumer SSD NAND applications
Enterprise SSD NAND applications
About the Authors
About Forward Insights
Report Offerings

List of Figures

Figure 1. HDD Areal Density Trend
Figure 2. HDD Performance Trend
Figure 3. Storage Pricing Trends
Figure 4. HDD Communications Model
Figure 5. Readback Signal
Figure 6. Peak Detection
Figure 7. Effect of Increased Recording Density
Figure 8. PRML Read Channel Architecture
Figure 9. Superposition of Pulses
Figure 10. Maximum Likelihood Detection
Figure 11. Partial Response Schemes
Figure 12. Eye Diagrams for PR4 and EPR4 Channels
Figure 13. Eye Diagram for EPR4 (1,7) Channel
Figure 14. Waveform and Interleaved Sequences for RLL (0,4/4)
Figure 15. Diminishing Returns on 512B Sector ECC
Figure 16. ECC Gains with Increased Sector Size
Figure 17. Signal Processing and Coding Trends
Figure 18. LDPC vs. PR4, EPR4 and NPML
Figure 19. Iterative Decoding
Figure 20. NAND Flash Cell Programming
Figure 21. Multi-level Storage in NAND Flash
Figure 22. NAND Flash Cell Reading
Figure 23. NAND Flash Cell Erase
Figure 24. NAND Cell Architecture
Figure 25. NAND Cell String
Figure 26. NAND Flash Memory Architecture
Figure 27. 8Gb NAND Flash Memory Organization
Figure 28. Cross-talk and Coupling Ratio
Figure 29. Inter-cell Interference
Figure 30. Multi-bit per Cell NAND Flash Endurance
Figure 31. Electrons Stored on the Floating Gate
Figure 32. NAND Flash Bit Size Trend
Figure 33. Vt Fluctuation due to RTN
Figure 34. Number and Amplitude of Trap Sites
Figure 35. Dependence of Traps on P/E Cycles
Figure 36. S/N in Multi-level NAND Flash Memories
Figure 37. ISPP and ΔVt Spread
Figure 38. Effect of Program Injection
Figure 39. σVt vs. Channel Length due to Random Discrete Dopants in the Source and Drain of Double Gate MOSFETs
Figure 40. Effect of Gate Oxide Thickness on Vt
Figure 41. Line Edge Roughness
Figure 42. Uncorrectable BER vs. Raw BER
Figure 43. Flash Error Rate Surface
Figure 44. Program Disturb
Figure 45. RBER vs. P/E Cycling
Figure 46. Read Disturb
Figure 47. RBER vs. Number of Reads
Figure 48. Effect of Number of Electrons per Bit on Retention Time
Figure 49. RBER vs. Retention
Figure 50. MLC NAND Flash Endurance & Data Retention
Figure 51. Soft Decoded Multi-bit/cell Limits and Hard Decoded 1bit/cell Limit vs. Shannon Limit
Figure 52. Code Gain and Distance Limit
Figure 53. Representation of the Hamming Coding and Decoding
Figure 54. Encoding and Decoding for Systematic Hamming
Figure 55. Representation of the matrix M systematically built on the basis of the data flow (mi) going through the memory. The sorting indexes x (1-512) and y (1-8) univocally identify each bit of the matrix M and are therefore suitable for building the parity matrix H.
Figure 56. Representation of the matrix H. The sorting indexes x (1-512) and y (1-8), univocally identify each row of the matrix H (a). For the calculation of the parity and of the syndrome, the matrix H is seen as the composition ((Y,X)T,I). The matrix H is completed by adding the necessary row and column for the code extension.
Figure 57. Representation of the matrix M systematically built on the basis of the data flow going through the memory. The sorting indexes x (0-511) and y (0-7) univocally identify each bit of the matrix M and are therefore suitable for building the parity matrix
Figure 58. Representation of the matrix H. The sorting indexes x (0-511) and y (0-7), univocally identify each row of the matrix H (a). For the parity and the syndrome calculation, the matrix H is seen as the composition ((Y,Y',X,X')T,I) (b).
Figure 59. BER/EDR for Hamming ECC
Figure 60. BCH Encoding and Decoding
Figure 61. BERout vs. Eb/N0 for a 2KB Page SLC NAND Flash Device for Hamming and BCH Code
Figure 62. Channel Capacity - Hamming Code and BCH Code for 1, 2, 3, 4 bits/cell.
Figure 63. Structure of a Reed-Solomon Code
Figure 64. R (Ratio) vs. t (Number of Error Corrected) for BCH
Figure 65. Channel Capacity - BCH vs. Reed-Solomon
Figure 66. BERout vs. Eb/N0 for a 2KB Page SLC Device for Reed-Solomon Code
Figure 67. Convolutional Encoder
Figure 68. Constraint Length 3 Convolutional Encoder State Diagram
Figure 69. Constraint Length 3 Convolutional Encoder Trellis Diagram
Figure 70. Constraint Length 3 Convolutional Encoder Trellis Diagram for Input Sequence (1 0 0 1 1 0 1 0)
Figure 71. Convolutional Code Performance (Upper Limit)
Figure 72. Convolutional code R = 1/2, k = 3, 8, 14 on Channel Capacity Plane.
Figure 73. TCM Error Correction System
Figure 74. BER Performance TCM, BCH & Hamming Code for 16-bit, 32-bit and 64-bit User Data
Figure 75. Program (a) and Read (b) in TCM Error Correction System
Figure 76. t for BCH Code
Figure 77. BER handled by BCH Code as a Function of Deviation of Vt Width Distribution σ
Figure 78. Cell Storage Efficiency
Figure 79. Basic Turbo Encoder
Figure 80. Basic Turbo Code Decoder
Figure 81. Performance of Turbo Code. BER given by Iterative Decoding (p = 1,..18) at a rate R = 1/2 Encoder, Memory v = 4, generators G1 = 37, G2 = 21, with interleaving 256 x 256.
Figure 82. LDPC Performance as a Function of Block Length
Figure 83. Tanner Graph of Parity Check Matrix
Figure 84. Overview over messages received and sent by the c-nodes in step 2 of the message passing algorithm
Figure 85. Step 3 of the described decoding algorithm. The v-nodes use the answer messages from the c-nodes to perform a majority vote on the bit value.
Figure 86. a) Illustrates the calculation of rji(b) and b) qij(b)
Figure 87. Approaching the Shannon Limit
Figure 88. Page Size + Spare Area Trend
Figure 89. Codewords of Four Operation Modes for Adaptive BCH ECC
Figure 90. Dynamic ECC Codeword Transition Scheme
Figure 91. Dynamic ECC codeword transition by (a) monitoring the number of errors (b) monitoring the number of W/E cycles
Figure 92. Shannon Limit and BCH ECC for Multi-level NAND Flash Memories
Figure 93. SLC Soft Detection
Figure 94. SLC Soft Decoding (5 – 3 Level) with LDPC vs. Hard Decision + BCH 32-bit correction
Figure 95. SLC vs. MLC Soft Decoding Limits
Figure 96. DSM Sensing
Figure 97. Linear Dependency Between N and Icell in the DSM Page Buffer Architecture
Figure 98. 4-Level Distribution with References R1, R2 and R3
Figure 99. Widened 4-Level Distribution with References
Figure 100. Shifted 4-Level Distribution with References
Figure 101. Effect of Read Disturb on 4-Level Distribution
Figure 102. Effect of Temperature on Vt Distribution
Figure 103. Moving Reference Algorithm
Figure 104. Moving Reference ECC
Figure 105. Program Page Sequence to Minimize FG Coupling
Figure 106. WL0 after Programming of Page 0
Figure 107. WL0 after Programming of Page 1
Figure 108. Marginal cell can be moved to the wrong distribution (red arrow) if an error occurs in the initial read inside the program flow
Figure 109. 8-level and 16-level Programming
Figure 110. Programming Sequence for 4bit/cell NAND flash and ECC Activity
Figure 111. Observed Cell and Neighbouring Cells
Figure 112. Observed Cell Programmed at 1V and Neighboring Cells Programmed at -2V.
Figure 113. Vt Shift due to FG Coupling
Figure 114. Fractional Reference Vt for 8-level Cell Memory
Figure 115. Serially Ordered 16-level Cell Memory
Figure 116. 16-level Cell Memory Ordering for Optimizing Bit Errors
Figure 117. Pattern Dependency of Data Retention and Program Disturb Errors
Figure 118. Proportion of the data retention error to the program disturb error
Figure 119. Asymmetric Coding
Figure 120. Improvement in Data Retention Error due to Asymmetric Coding
Figure 121. Probability Processing vs. Digital Processing
Figure 122. LEC Footprint
Figure 123. Performance of LDPC on 3 Bits / Cell Flash Memory
Figure 124. DensBits ECC under Hard Decoding
Figure 125. DensBits ECC under Soft Decoding
Figure 126. DensBits Hard and Soft Decoding vs. Hamming
Figure 127. DensBits Soft Decoding vs. BCH 24-bit/1kB
Figure 128. Reverse Concatenation Read Channel Coding
Figure 129. Marvell Flash Read Channel based on Conventional HDD Read Channel
Figure 130. Marvell Flash Read Channel
Figure 131. Block Diagram of Controller and Flash Memory
Figure 132. Effect of Storage Genetics Processing Engine
Figure 133. Storage Genetics Correction Engine - 4 bit/cell NAND
Figure 134. ClearNAND
Figure 135. HB and SB Decoding Process
Figure 136. OLEA-BCH vs. Normal BCH Architecture, 8kB page size; 8KB, BSC channel, random error model; 8 BCH blocks per page; Code Rate: 0.9609, 24bits/1024B
Figure 137. PAC LDPC vs. Normal LDPC Architecture, AWGN channel; Code Rate: 0.9362; 3-bit soft information; min-sum algorithm; 8 iterations
Figure 138. Weight Control Code, 4kB Page Size; 15/16bit / 512Byte BCH; Raw ER: 1.8 × 10⁻³
Figure 139. Maximum Frequency and Power Consumption vs. Supply Voltage
Figure 140. a) Normalized Voltage Supply vs. Parallelism, b) BER vs. Maximum Number of Iterations under 4-bit quantized minsum decoding - Reed-Solomon based (6, 32)-regular 2048-bit LDPC code
Figure 141. LDPC Decoder Die Photo
Figure 142. A Parallel RS-LDPC (2048,1723) for 10GBASE-T Ethernet
Figure 143. Applications Utilizing NAND Flash
Figure 144. Flash Signal Processing Cost Adder
Figure 145. NAND Flash Solution by Implementation
Figure 146. TAM for Advanced ECC/FSP Chips

List of Tables

Table 1. Capacities of (0,G/I)
Table 2. Comparison of Detection Methods
Table 3. TCM vs. Hamming & BCH
Table 4. Silicon Area (mm2)
Table 5. ECC Requirements for Multi-level NAND Flash Memories
Table 6. Adaptive ECC Scheme Performance
Table 7. Capacity Increase – LDPC vs. BCH
Table 8. Endurance Increase – LDPC vs. BCH
Table 9. Controller-level Techniques
Table 10. Planned Deployment of DSP Techniques by NAND Flash Vendor
Table 11. Planned Product Deployment of NAND Flash Vendors Employing DSP Techniques
Table 12. Competencies of Flash Signal Processing Players
Table 13. Coding Gain
Table 14. DensBits ECC Hardware Implementation
Table 15. Comparison of DBECC, LDPC and BCH
Table 16. Endurance Specifications with ECC Engine
Table 17. EZ-NAND Architecture vs. Traditional Architecture
Table 18. LDPC Error Correction Capability
Table 19. Siglead SSD Controllers
Table 20. Comparison of LDPC Decoders
Table 21. Core Area for 2.5 million Gates