ArsTechnica

The SSD Revolution / An Ars Technica feature

The future of flash memory: tiny (and extremely tough to build)

As solid state memory nears theoretical limits, engineers look beyond flash.

In the past few weeks, we've looked at how solid state disks work, discussing the practical effects of reduced latency on the computing experience and walking through the basics of NAND flash and floating gate transistors. We've also talked about flash's impact on the overall direction of the mobile space and some ways in which modern operating systems are adapting to SSDs. Last week, we asked you all to tell us what SSDs do for you, both at home and at work. This week, in the fourth and final entry in our feature series on the solid state revolution, we turn our gaze to the future. What's ahead for flash?

Flash memory continues to shrink in size and grow in capacity. Hard disk drive technology continues its inevitable march toward greater areal densities, and hybrid drives are being purchased in greater numbers. Hewlett-Packard is busy at work on a new type of storage, one based on fancy little things called memristors, which may hit the market in the mid-term.

There's a lot happening with solid state storage, and a lot more set to happen—but some serious problems need to be solved first.

It's getting small in here

For electronics, smaller is almost always better. Moore's Law, which says that the number of transistors one can cram into a given amount of space tends to double about every eighteen months, still holds roughly true for NAND flash. SSDs based on a 25nm or 20nm manufacturing process are common today, and in early June, Toshiba announced a line of SSDs based on a 19nm process size.
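As a back-of-the-envelope illustration of those numbers, here is a minimal Python sketch. It assumes an idealized doubling period of exactly eighteen months and that cell area scales with the square of the process size. Real NAND follows neither rule perfectly, and the function names are made up for this example.

```python
def density_multiplier(years, doubling_months=18):
    """Idealized Moore's Law growth factor after `years` (assumes steady doubling)."""
    return 2 ** (years * 12 / doubling_months)


def ideal_density_gain(old_nm, new_nm):
    """Cells per unit area gained by a shrink, assuming area scales with size squared."""
    return (old_nm / new_nm) ** 2


# Three years is two doubling periods under the 18-month assumption.
print(density_multiplier(3))
# Idealized gain from a 25nm -> 19nm shrink.
print(round(ideal_density_gain(25, 19), 2))
```

Under these assumptions, a move from 25nm to 19nm fits roughly 1.7 times as many cells into the same area, which is where the cost and capacity gains come from.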

The advantages of smaller flash are huge. Material costs money, and a smaller process fits more flash on a single chip, which most obviously means bigger drive sizes. Even better, smaller floating gate transistors use less electricity and operate more efficiently, and gains in power efficiency matter a great deal for flash memory's most important growth area: the mobile market.

When coupled with the eventual switch to TLC (triple-level cell) flash over the current MLC (multi-level cell) flash, the future should look rosy: physically smaller, higher-capacity SSDs which use less power. We can't lose!

NAND process sizes from various manufacturers from 2007 through 2011

Resistance becomes futile

Ah, but things are never that simple. SSDs have that proverbial Achilles' heel: the more writes their cells experience, the closer those cells get to dying.

As we discussed at length in the first article in this series, a flash cell is a specific type of transistor called a floating gate transistor. It has a dedicated structure (the floating gate itself) into which electrons can be stuffed to alter the transistor's threshold voltage, which is the potential difference that must be applied to the transistor in order for it to conduct electricity. The presence or absence of charge in the floating gate, and hence a high or low threshold voltage, determines whether the flash cell holds a 0 or a 1.

However, the process of coaxing electrons back out of the floating gate once they've been coaxed into it requires quite a bit of voltage, and some electrons are left behind in the floating gate every time. These extra electrons aren't a problem at first, but over the life of the flash cell, their presence can substantially alter the cell's electrical resistance. When a cell gets written to, it must be able to pull electrons into its floating gate in a very small amount of time; as its resistance becomes greater, the controller has to use greater and greater amounts of current to get the electrons to jump into the cell. Eventually, the amount of current required becomes so high, and the amount of time it takes for the electrons to jump into the cell becomes so long, that the cell can't be written to anymore.

As flash cell process sizes decrease and the cells themselves get smaller, we still face that fundamental problem, and shrinking the manufacturing process actually makes it worse. As flash cells get smaller, they can tolerate a commensurately smaller amount of residual charge before the controller must mark them as useless. This dampens the enthusiasm for smaller flash chips quite a bit.

The number of bits stored in a flash cell makes a difference here, too. The most basic and reliable form of flash, as we've previously discussed, is SLC (Single Level Cell) flash, which stores a single bit in each floating gate. If the gate holds a charge, it represents a 0. If the gate holds little or no charge, it represents a 1. SLC flash is fast to read from and write to, since the cells just have to report on whether or not they contain a charge. More importantly, SLC flash cells can sustain the greatest number of writes, because pumping electrons into and out of the cell doesn't have to be done with great finesse and attention to differing charge levels. The entire process is less sensitive to the cell's accumulated charge.

Multi-Level Cell (MLC) flash stores two bits to SLC's one, and it does this by having not just "no charge" versus "some charge," but rather by having four discrete voltage levels—little to no charge for 11, some charge for 10, a bit more charge for 01, and even more charge for 00. With MLC, the flash cell becomes a lot more sensitive to changes in its resistance, because read and write operations have to be carried out more carefully—it's not just "pump in some charge," but instead pump in specific amounts of charge.
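To make the four-level scheme concrete, here is a minimal Python sketch of how a read might map a cell's threshold voltage to two bits. The reference voltages (1.0, 2.0, and 3.0 volts) and the 0.6 volt drift are hypothetical values chosen for illustration, not figures from any real part; the bit ordering follows the description above.

```python
from bisect import bisect_right

# Hypothetical read reference voltages (volts) separating the four MLC levels.
REFERENCE_VOLTAGES = [1.0, 2.0, 3.0]
# Bit patterns in order of increasing charge, as described in the article.
LEVEL_TO_BITS = ["11", "10", "01", "00"]


def read_mlc_cell(threshold_voltage):
    """Return the two bits a cell represents, given its threshold voltage."""
    # Count how many reference voltages the cell's threshold has reached.
    level = bisect_right(REFERENCE_VOLTAGES, threshold_voltage)
    return LEVEL_TO_BITS[level]


print(read_mlc_cell(1.5))        # a cell programmed to the "10" level
print(read_mlc_cell(1.5 + 0.6))  # the same cell after a hypothetical 0.6 V drift
```

In this toy model, the drifted cell crosses the 2.0 volt reference and reads back as 01 instead of 10. With only one reference voltage to cross, an SLC cell would have far more room before a shift of that size changed the stored value.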

Additionally, the increased density of MLC guarantees that the individual NAND cells will undergo more writes for the same amount of data. The MLC drive has fewer NAND flash transistors for a given amount of storage, so a workload run on a 100GB SLC disk and on a 100GB MLC disk will make the MLC disk work harder; there are simply fewer transistors to bear the write load. The difference is dramatic, too. The average cell life in an SLC SSD is about 100,000 write cycles, while a good 20nm MLC cell backed by a good controller will bear perhaps 3,000.
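The arithmetic behind that comparison can be sketched in a few lines of Python. This is a simplified model, not a vendor formula: it assumes perfect wear leveling, decimal gigabytes, no spare area, and a write amplification factor supplied by the caller. The function names are invented for this example.

```python
def cells_needed(capacity_gb, bits_per_cell):
    """Number of flash cells required to hold `capacity_gb` (decimal GB)."""
    total_bits = capacity_gb * 10**9 * 8
    return total_bits // bits_per_cell


def lifetime_writes_tb(capacity_gb, pe_cycles, write_amplification=1.0):
    """Host data (in TB) a drive can absorb before its cells wear out.

    Assumes every cell is worn evenly and fails after `pe_cycles` writes.
    """
    return capacity_gb * pe_cycles / write_amplification / 1000


# 100GB drives using the cycle counts quoted in the article.
print(cells_needed(100, 1), lifetime_writes_tb(100, 100_000))  # SLC
print(cells_needed(100, 2), lifetime_writes_tb(100, 3_000))    # MLC
```

Under these assumptions the MLC drive uses half as many cells and absorbs about 300TB of writes against the SLC drive's 10,000TB. Raising `write_amplification` above 1.0 shrinks both totals proportionally, which is why controllers work so hard to keep it low.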

Can the situation get even crazier in the future? It certainly can. TLC ("Triple Level Cell") flash cranks up the density even further, storing three bits in each cell. This is done by keeping track of eight different voltage levels per cell: 000, 001, 010, 011, 100, 101, 110, and 111. TLC cells have been around for quite some time but aren't yet used in consumer SSDs because there are significant engineering challenges in making them work well. For one thing, error correction algorithms need to be modified; for another, TLC cells are currently only good for a few hundred program/erase cycles. In order for TLC flash to be viable in consumer devices, SSD controllers will need to be incredibly stingy with writes and will have to take extraordinary measures to keep write amplification to a minimum.
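The jump from four levels to eight is easy to see in a short Python sketch. It lists the bit patterns a cell must distinguish and estimates how far apart neighboring levels sit, assuming a fixed voltage window divided evenly among them. The 3.0 volt window is a hypothetical figure for illustration, and real cells do not space their levels this neatly.

```python
def cell_states(bits_per_cell):
    """All bit patterns a cell must distinguish, e.g. 8 patterns for 3 bits."""
    return [format(i, f"0{bits_per_cell}b") for i in range(2 ** bits_per_cell)]


def level_spacing(window_volts, bits_per_cell):
    """Gap between neighboring levels if they evenly divide a fixed window."""
    return window_volts / (2 ** bits_per_cell - 1)


WINDOW_VOLTS = 3.0  # hypothetical usable threshold-voltage window

for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3)):
    states = cell_states(bits)
    print(name, len(states), "levels, spacing", round(level_spacing(WINDOW_VOLTS, bits), 2))
```

In this model each extra bit per cell doubles the number of levels and more than halves the gap between them, so the same amount of residual charge eats a much larger share of the margin in a TLC cell than in an SLC cell.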

TLC flash should enable bigger SSDs at lower costs, but the engineering challenges are formidable. And it gets worse when we consider the possible limits to flash technology.

Listing image by Chris Foresman/Ars Technica
