In recent years there has been growing interest in flash storage, which offers faster data access and faster processing than hard disk drives (HDDs). However, legacy solid-state drives (SSDs) have been more costly than HDDs and have had some reliability problems. This encouraged Hitachi, Ltd. to produce flash storage products of its own. The following interview is about Hitachi's low-cost, high-performance, high-reliability flash storage technologies.
KOSEKI: Yes. In recent years, storage systems for enterprise applications have mainly been required to deliver data reliability and fast response. However, to achieve faster processing in storage systems that use HDDs, you need several hundred to several thousand HDD units. Such storage systems are equipped with controllers to manage the many HDDs, but it has been no easy task to make the system perform to its full potential, as the performance of each individual HDD unit is rather low.
That is why attention is turning to SSDs, which incorporate flash memory chips, as a new device that should break through the performance limits of HDDs. SSDs and other types of flash storage can deliver a hundred to a thousand times the performance of HDDs. With flash storage, the number of devices can be cut by 99% from the several thousand HDDs that would otherwise be required.
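The 99% figure follows directly from the per-device speedup. Below is a minimal sketch of that arithmetic. It assumes that performance scales linearly with the number of devices and uses 3,000 HDDs as an illustrative starting point; both are assumptions for illustration, not figures from the interview.

```python
import math

def devices_needed(hdd_count: int, speedup: float) -> int:
    """Flash devices needed to match the aggregate performance of hdd_count HDDs.

    Assumes performance scales linearly with device count (an assumption).
    """
    return math.ceil(hdd_count / speedup)

def reduction_percent(hdd_count: int, speedup: float) -> float:
    """Percentage reduction in device count when flash replaces the HDDs."""
    return 100.0 * (1 - devices_needed(hdd_count, speedup) / hdd_count)

# Illustrative starting point of 3,000 HDDs (hypothetical figure).
for speedup in (100, 1000):
    print(speedup, devices_needed(3000, speedup), reduction_percent(3000, speedup))
```

Under these assumptions, a 100-fold speedup per device is already enough for a 99% reduction in device count.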
KOSEKI: True, flash storage is more expensive when you compare single units. However, it is the performance the customer requires that determines which is more beneficial: buying a storage system that incorporates hundreds to thousands of HDDs, or buying a system that incorporates several flash storage units.
We believe that flash storage gives advantages to customers in terms of reducing the initial cost and decreasing the physical space otherwise required to install thousands of HDD units.
SUZUKI: We wanted to provide customers with new value that legacy SSDs cannot offer in terms of performance, reliability and price. That was what drove us to develop our products.
In general, SSDs are designed to be usable for various applications, which makes them somewhat expensive. In contrast, Hitachi has accumulated knowledge, through developing storage equipment over many years, on how storage systems are used in the real world. We thought that, by utilizing our knowledge in equipment reliability design, we would be able to create products that are more cost-competitive than the offerings of other companies.
KOSEKI: Hitachi Accelerated Flash (HAF) storage consists of flash modules that reduce the cost per bit while achieving large capacity, high speed and high reliability by incorporating flash memory controllers originally developed by Hitachi. At present, the modules are installed as storage devices in the Hitachi Virtual Storage Platform (VSP, enterprise-class disk storage systems) and the Hitachi Unified Storage VM (HUS VM).
A major characteristic of the HAF is that Hitachi produces the controllers and the HAF modules for storage equipment in an integrated manner. This has allowed us to provide new functions that are performed in coordination with superordinate storage controllers produced by Hitachi. We consider this as a major strength of Hitachi that no other vendors can match.
Photo 1: Hitachi Accelerated Flash module
SUZUKI: Two types of flash memory devices have been used for SSDs: single-level cells (SLCs), which are more expensive but offer higher reliability and performance, and multi-level cells (MLCs), which are less expensive and have higher storage density. Conventional storage systems for enterprise use have adopted SLCs, which raises the cost. Hitachi chose MLCs, which allowed us to double the storage capacity compared with SSDs and to lower the price.
KOSEKI: Moreover, we used a package that is more than twice the size of that for typical HDDs or SSDs and installed a large number of low-priced flash memory chips onto it, while minimizing the volume of hardware other than flash memory chips to be installed onto the HAF module.
KOSEKI: There are two important points. The first is increasing the processing speed of the HAF itself. We mounted a large number of flash memory chips (two to four times as many as in typical SSDs) and had them operate in parallel. By doing so, we were able to achieve high performance as a system.
The second is the function that compresses format data in coordination with superordinate storage controllers. Devices using flash memory chips differ physically from HDDs. While HDDs can have data overwritten in place, flash memory chips cannot. Flash devices therefore have a spare area, in addition to the data storage area, to temporarily hold new data until the old data in the data storage area is erased. However, as the spare area fills up, the temporarily stored data must be copied (not overwritten) to the data storage area to free space for new data. If the ratio of the spare area is small, this copying takes place frequently, which decreases performance. Thus, if we increase the ratio of the spare area in advance, the copying can be avoided as far as possible. This is what the format data compression function does.
KOSEKI: Generally, when devices such as SSDs and HAFs are first used in storage equipment, the superordinate storage controllers write format data to the data storage area in order to initialize the devices. Subsequently, after operation of the equipment starts, the stored format data is overwritten by the customers' data. In other words, in the initial stage, only a limited amount of capacity is available as spare area because of the format data, even though the actual volume of data stored is not so large. To tackle this problem, Hitachi significantly compresses the format data when it is stored in the HAF module, which increases the amount of spare area.
As the HAF module can acquire a large amount of the spare area thanks to this compression, it can minimize the decrease in performance when storage devices are used in a way in which the data volume increases gradually. In other words, the HAF module is designed to keep the storage equipment achieving high performance, preventing its performance from decreasing as much as possible. Hitachi has been able to realize this function just because it also develops superordinate controllers.
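The link between the spare-area ratio and the amount of internal copying can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Hitachi's controller logic: the block and page counts, the uniformly random overwrite pattern, and the "reclaim the block with the fewest valid pages" policy are all hypothetical choices made for illustration, and `spare_ratio` should stay well above `1 / blocks` in this model.

```python
import random

def copies_per_host_write(spare_ratio, blocks=64, pages=32, writes=20000, seed=0):
    """Toy flash model: pages copied internally per host page write.

    spare_ratio is the fraction of physical pages kept as spare area.
    All parameters and policies here are hypothetical.
    """
    rng = random.Random(seed)
    logical = int(blocks * pages * (1 - spare_ratio))  # pages visible to the host
    where = {}                                # logical page -> block holding it
    valid = [set() for _ in range(blocks)]    # valid logical pages in each block
    used = [0] * blocks                       # pages written since last erase
    free = list(range(1, blocks))             # erased blocks
    active = 0                                # block currently being filled
    copied = 0

    def append(lpn):
        nonlocal active
        if used[active] == pages:             # active block is full: open an erased one
            active = free.pop()
        old = where.get(lpn)
        if old is not None:
            valid[old].discard(lpn)           # the previous copy becomes stale
        valid[active].add(lpn)
        used[active] += 1
        where[lpn] = active

    def reclaim():
        nonlocal copied
        # Choose the full block with the fewest valid pages, move those pages
        # elsewhere, and erase the block so it can be written again.
        full = [b for b in range(blocks) if b != active and used[b] == pages]
        victim = min(full, key=lambda b: len(valid[b]))
        movers = list(valid[victim])
        valid[victim].clear()
        used[victim] = 0
        free.append(victim)
        for lpn in movers:
            del where[lpn]
            append(lpn)
        copied += len(movers)

    def host_write(lpn):
        while not free:                       # no erased block left: copy first
            reclaim()
        append(lpn)

    for lpn in range(logical):                # initial fill of the data area
        host_write(lpn)
    copied = 0                                # measure only the overwrite phase
    for _ in range(writes):
        host_write(rng.randrange(logical))
    return copied / writes

for ratio in (0.1, 0.3, 0.5):
    print(ratio, copies_per_host_write(ratio))
```

In this model, a larger spare ratio leaves more stale pages in each reclaimed block, so fewer valid pages have to be copied per host write.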
Figure 1: Comparison of performance when the format data compression function is supported
KOSEKI: We struggled with how to conduct optimum control of reading and writing data, which we call read I/O and write I/O, to the flash memory chips.
The HAF controllers respond to the I/O request from the superordinate controllers ("superordinate I/O") and conduct data copying to acquire the spare area in parallel. Therefore, to achieve high performance, it is necessary to optimize the balance between the workload of the superordinate I/O and data copying. For example, placing excessive priority on the superordinate I/O will temporarily enable high performance but eventually cause the spare area to be fully consumed, making it unable to continue handling the superordinate I/O. On the contrary, placing excessive priority on data copying will help prevent the spare area from being fully consumed but cause deterioration in performance, as the superordinate I/O tends to have to wait for data copying to be conducted.
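The trade-off described here can be sketched with a simple budget model. It is an illustrative assumption, not the HAF controller's algorithm: the controller is given a fixed number of operations per tick, `host_share` of them serve superordinate I/O (each consuming one spare page), and the rest perform data copying (each reclaiming `gain` spare pages). All names and numbers are hypothetical.

```python
def sustained_host_ops(host_share, ticks=1000, budget=100, spare_pages=1000, gain=1.0):
    """Average host operations served per tick in a toy budget model.

    host_share: fraction of the per-tick budget given to superordinate I/O.
    All parameters are hypothetical values chosen for illustration.
    """
    spare = float(spare_pages)
    served = 0.0
    for _ in range(ticks):
        copy_ops = (1 - host_share) * budget
        # Data copying reclaims spare pages, up to the size of the spare area.
        spare = min(spare_pages, spare + copy_ops * gain)
        # Host writes stall once the spare area is fully consumed.
        host_ops = min(host_share * budget, spare)
        spare -= host_ops
        served += host_ops
    return served / ticks

for share in (0.2, 0.5, 0.9):
    print(share, sustained_host_ops(share))
```

In this model, a high host share performs well only until the spare area runs out, after which host I/O is limited to the rate at which copying reclaims pages. A low host share never exhausts the spare area but caps host throughput.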
We imagined what's happening inside flash memory chips and prepared an estimated model, anticipating that certain performance should be achieved by controlling the balance this way and that. However, there was no telling whether the model was correct or not before measuring performance after the prototype products were completed. So we were very nervous until the products were completed and it was confirmed that they achieved target performance.
SUZUKI: Flash memory chips are a non-volatile memory that maintains written data even without power. Characteristically, however, they lose integrity of the stored data over time. In addition, a shortcoming of flash memory chips is that they have a comparatively large number of bit errors when writing data.
Accordingly, we came up with the idea that data could be stored over a long time by having the HAF module periodically check data internally, measure the rate of bit errors occurring in the flash memory chips, and copy the data to other areas before the bit errors spread. We call this the "online data refresh function."
Flash memory chips continuously correct data by adding error-correcting codes to stored data so that bit errors are corrected if they occur. Data is lost when such errors can no longer be corrected using error-correcting codes. The online data refresh function works to read data while it is correctable, and corrects and rewrites it.
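The idea of refreshing data while it is still correctable can be sketched as a periodic pass over the stored blocks. This is a minimal sketch under stated assumptions: the `Block` type, the `ecc_limit` of 40 correctable bits, and the 50% refresh threshold are hypothetical and are not taken from the HAF design.

```python
from dataclasses import dataclass

@dataclass
class Block:
    data: bytes
    bit_errors: int = 0  # bit errors found by the latest internal check

def refresh_pass(blocks, ecc_limit=40, threshold=0.5):
    """One periodic check over all blocks.

    Blocks whose error count is still within the ECC limit but has reached
    threshold * ecc_limit are read, corrected and rewritten (modelled here
    as a new Block with zero errors). Blocks beyond the ECC limit can no
    longer be corrected locally and are only reported.
    Returns (refreshed_indices, uncorrectable_indices).
    """
    refreshed, uncorrectable = [], []
    for i, blk in enumerate(blocks):
        if blk.bit_errors > ecc_limit:
            uncorrectable.append(i)
        elif blk.bit_errors >= threshold * ecc_limit:
            blocks[i] = Block(blk.data, 0)  # corrected data written to a fresh area
            refreshed.append(i)
    return refreshed, uncorrectable

store = [Block(b"a", 3), Block(b"b", 25), Block(b"c", 41)]
print(refresh_pass(store))
```

Refreshing at a threshold below the ECC limit leaves a margin, so that errors accumulating between two checks do not push a block past the point where correction fails.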
Moreover, the HAF modules are equipped with a function to recover data in coordination with the VSPs or the HUS VMs. With this function, the VSPs and the HUS VMs periodically obtain data-checking results from the HAF modules and recover the portions that cannot be read by collecting relevant data from other flash modules.
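The interview does not say how the unreadable portions are rebuilt from other flash modules. One common approach in storage systems is XOR parity across devices, and the sketch below assumes that technique purely for illustration. The function names are hypothetical.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity(chunks):
    """Parity chunk for a stripe: the XOR of all data chunks."""
    return reduce(xor_bytes, chunks)

def rebuild(surviving_chunks, parity_chunk):
    """Recover the single missing chunk from the survivors and the parity."""
    return reduce(xor_bytes, surviving_chunks, parity_chunk)

# Three modules hold data chunks; a fourth (assumed) holds their parity.
stripe = [b"abcd", b"efgh", b"ijkl"]
p = parity(stripe)
# Suppose the chunk on the second module cannot be read.
print(rebuild([stripe[0], stripe[2]], p))
```

With this scheme, one unreadable chunk per stripe can be reconstructed as long as the remaining chunks and the parity are readable.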
Figure 2: High reliability achieved through periodical data diagnosis and recovery function
SUZUKI: A problem with the online data refresh function is that data cannot be refreshed without power. Monitoring or copying data isn't possible when the power is off. Therefore, we determined the optimum refresh interval, based on assumptions about how frequently and for how long the power is off, in order to prevent data from being lost. Also, as the quality of flash memory chips varies, we quantified and statistically calculated how widely the quality was scattered before conducting the design work.
SUZUKI: We had to forecast how data would degrade differently depending on how the storage system is used. Flash memory chips not only have a limited number of rewrites but are also affected by the rewrite interval, and data stored in them becomes more prone to corruption when it is frequently rewritten. It would take years to reproduce the actual use conditions with such rewrite frequency.
However, it is not practical to spend that much time. So we had to conduct short-term experiments to extract the characteristics of use conditions, construct models of use patterns and, based on them, estimate how the stored data would deteriorate over a long period of time. That was the difficult part for us.
KOSEKI: I believe flash storage will achieve increasingly higher performance. A look at the performance trends tells us that flash devices are being enhanced at a tremendous speed. The hallmarks of Hitachi's storage products are performance and reliability. As I am in charge of performance design, I don't want to fall behind rival companies in terms of product performance. How can I further enhance product performance? That is the subject I want to research going forward.
Another issue is that performance and reliability are in a trade-off relationship with each other. It is only natural that enterprise-class products must be reliable. On top of that, customers demand high-performance devices. However, if devices refresh too frequently for greater reliability, the resources needed for performance will be fully consumed.
As we proceed with our research, I think the hard part of our endeavors will be how we can achieve both high performance and high reliability.
SUZUKI: That's right. As the best balance in the trade-off between performance and reliability differs depending on how the storage systems are used, we want to investigate tuning that balance to customers' usage patterns. If the use is rather limited (using flash memory chips as cache, for example), we can tune the device to prioritize performance.
As I am in charge of reliability design, my hope is to design the reliability of equipment around what suits customers. I want the reliability design to take into account the ways customers will actually use SSDs, including uses beyond storage.
KOSEKI: Designers tend to pursue the catalog specifications determined by storage vendors or, in other words, "achieve certain performance under the assumed environment." However, the actual environment in which customers use the storage systems is not the same as what is assumed. We have to go out into the field and understand customers' requests. By doing so, we want to cultivate our knowledge and know-how to such a degree that we understand what performance design is needed from the customers' viewpoint.
(Publication: January 9, 2014)