Skip to main content

Hitachi

Corporate InformationResearch & Development

 

As ICT infrastructure continues to be shifted onto the "cloud," a way of securely utilizing various kinds of data retained by businesses is being searched for. At Hitachi Ltd., we have developed a technology, called "searchable encryption," that allows encrypted data to be searched as is. The key to executing secure encryption at high speed is our unique algorithm created by utilizing mathematical theory.

(Publication: January 11, 2013)

The difficulty of achieving security and availability at the same time

The number of businesses utilizing the cloud has increased, hasn't it?

Photo: SATO Hisayoshi

SATOThat's right. Hitachi is providing various cloud solutions under our "Harmonious Cloud" lineup.

In utilizing cloud services, it is necessary to entrust data to an external administrator as opposed to an administrator within the company in the usual manner. The data handled by business, however, includes data that must not be made public (such as information regarding new products and personal information) outside the company. Since there is anxiety over whether data is being properly managed or whether information is not being leaked, many businesses are hesitant to utilize cloud services.

YOSHINOTo prevent information leakage, there are ways to encrypt the data entrusted to outsiders. Since encrypted data is a character string with unclear meaning, even if that data leaked by some chance, how that data represents certain contents would be impossible to analyze. All the same, that situation itself creates a certain problem.

What problems are created?

Photo: YOSHINO Masayuki

YOSHINOWith conventional cryptographic technologies, processing concerning the encrypted data is not possible. Accordingly, it has only been possible to store encrypted data as is. It is necessary to not only simply store data on the cloud but also process it is some way or another. That processing, however, is impossible once the data is encrypted. Effectively utilizing the cloud has necessitated a cryptographic technology that can process data as encrypted.

SATOIn fact, at Hitachi, we have been researching an cryptographic technology that can process encrypted data as is for some time now. We call that technique "Private Information Processing." On considering whether we could apply this technique to cloud computing, we started the relevant research in 2010. The result of that research is the technique that I will describe in this article, namely, "searchable encryption." It is a technique that allows data to be searched in an encrypted form.

Figure 1: Challenges from the viewpoint of security of the cloud

*
Harmonious Cloud is a trademark of Hitachi, Ltd.

Randomness of ciphertext is imperative

Tell us about searchable encryption in more detail.

YOSHINOI'll explain searchable encryption in comparison with conventional cryptographic technology. In general, data before being encrypted is called "plaintext," and that after being encrypted is called "ciphertext." In the case of conventional cryptographic technology, if data is encrypted in the manner of a one-to-one correspondence of plaintext to ciphertext, data can be searched in the encrypted state as is. However, with data in that state, there is the risk of information leakage.

Photo: NAGANUMA Ken

NAGANUMAFor example, in regard to data that has only two values in the manner of human gender, only two kinds of ciphertext exist. In the same manner as designating a segment of data as "male" because it is data concerning a workplace containing mostly males, it is possible that the contents could be guessed even if it were encrypted.

The same can be said of search keywords. Since ciphertext does not change each time, that which is searched several times in relation to a certain keyword can be guessed. In other words, it is assumed that since this ciphertext comes up a lot, it can be assumed to be a popular keyword. As the number of searches increases, those searches can be considered clues for deducing the content from its ciphertext.

YOSHINOIn contrast, in the case of encryption of data by searchable encryption, ciphertext created for certain plaintext differs every time it is created. As a result, it is more difficult to deduce content from its ciphertext. Giving ciphertext a random nature thus makes it possible to search data securely.

Figure 2: Difference between data before and after encryption

How is it possible to search data if ciphertext is different every time it is created?

YOSHINOThat is the key point concerning this technique. To explain the mechanism in simple terms, when data and keywords are encrypted, a tag for search purposes is attached to their respective ciphertexts. These tags are generated from the random number that is used when respective ciphertext is created.

During a search, as shown in Figure 3, the random-number parts of "ciphertext of data" and "ciphertext of search keywords" are extracted and compared to the search-tag data. If the comparison indicates that the two kinds of data conform, the search is considered to be a hit.

NAGANUMARandom numbers are used when ciphertext is generated so that the ciphertext is different every time it is generated. However, the random-number parts are a kind of noise, so they become a hindrance when data is being checked. So the random-number parts cannot be easily handled. Accordingly, we have focused on a so-called "homomorphic function."

Figure 3: Mechanism of searchable encryption


Enlarge

Homomorphic function: The deciding factor in processing

That's the first time I've heard the term "homomorphic function."

Photo: YOSHINO Masayuki

YOSHINOIt's probably an unfamiliar term to most people. Although there are various kinds of homomorphic function, the simplest one is a function (ƒ) as ƒ(x)+ ƒ(y)= ƒ(x+y).

As I explained, the search tags attached to the ciphertext are generated from the random number that is used when respective ciphertext is created. In fact, the homomorphic function is used for this generation process. If the random number used for encrypting data is taken as x, and if the random number used for encrypting the search keyword is taken as y, the corresponding search tags are given as ƒ(x) and ƒ(y).

Moreover, I explained that during a search, the data of the random-number parts is extracted from the "ciphertext of data" and the "ciphertext of search keywords". That data, however, is retrieved in the form of the sum of the x and y values; consequently, in that form it cannot be compared to the data of the tags attached for search purposes. The data of the random-number parts is made comparable in the form of ƒ(x+y) created by using the homomorphic function.

Every single process has an meaning, right?

NAGANUMAThat's right. Although it might seem that the processing is rather randomly, if it changed for any part, it would be theoretically strange.

YOSHINOI started to study the processing method, but it didn't go very well. It involved a continuing process of trial and error, "making it and breaking it," if you will. In the end, we put our minds together in a team effort and finally worked out a processing method.

SATOWe also paid attention to the speed of processing. Since the amount of target data is huge, if the processing time got too long, the processing technique would become impractical. Given that, we used a high-speed cipher module in such as manner as to make the processing time as short as possible. During a demonstration of the technique (see photo 1), although the encrypted data contain about 10,000 articles, the search results came back in an instant.

We normally spend our time on researching public-key cryptography, but the most cipher modules used for our searchable-encryption technique is the ones of symmetric-key cryptography. Although the latter is a classical cryptography technique, we designed it with a concern for high speed.

YOSHINOAs for the homomorphic function that we just explained, high-speed processing is also possible. Using the homomorphic function for cipher modules is rare. So I think this was the symbolic example that we took advantage of our specialized knowledge of mathematics.

Photo 1: Appearance of the demonstration

Affirmation of the cipher security through games

Were there any other difficulties?

SATOVerifying the security of the cipher took a pretty long time. In the case of the developed cryptographic technology, mathematical proof is necessary as a guarantee of security. So development—including such verification—took a little less than a year.

Photo: NAGANUMA Ken

NAGANUMAFor verifying the security of the cipher, the probability of decoding the ciphertext under certain conditions is formulated in mathematical form, and if the probability is extremely low, the security of the cipher is proven.

As for the meaning of "certain conditions," conditions change in accordance with, for example, user application scenarios and the range of information given to the attacker trying to decode the ciphertext. A favorable condition from the viewpoint of the attacker (for example, although the ciphertext of the target data cannot be obtained, all other ciphertext can be obtained) can also be set up. In other words, rather than thinking from the standpoint of the user, we took the standpoint of the attacker. In fact, we followed the thinking of a villain targeting the data.

SATOThat way is often used for proving the security of ciphers. The attacker plays the role of a player in a kind of game. As Mr. NAGANUMA just explained, under a condition favorable to the attacker, we play a game; namely, two ciphertexts are shown to the attacker, and the attacker guesses which ciphertext represent the targeted data.

Since the conditions change in accordance with what should be protected, we thought up the story of the game from scratch. And since it is presupposed that searchable encryption will be applied to cloud services, we performed tests on whether the configuration of the game properly represented the security of the cloud.

The application of the technique to the cloud seems to be going well, right?

SATOYes, it is. By incorporating searchable encryption in a standard algorithm for genome analysis, namely, BLAST (Basic Local Alignment Search Tool), we created a mechanism for performing genome analysis via the cloud.

Searchable encryption is a particularly effective technique in the case that the data targeted by the search contains many of the same values. There are only four types of values in genome data, and storage and analysis demand extremely high security. We therefore thought of utilizing our technique for meeting those demands.

Once the technique was incorporated in BLAST, "approximate search"—a special feature of genome analysis—could also be supported. With searchable encryption, originally, only "exact search" was possible, but with the additional of approximate search, the search region could be widened, and searchable encryption became more convenient.

Figure 4: Image of genomic analysis using the cloud


Enlarge

Appeal to the world with a prominent technique

You are applying this technique in fields other than genome analysis, aren't you?

NAGANUMAYes, we are. As well as being applied to cloud services, searchable encryption can be applied to data-outsourcing services in general. The trend toward businesses outsourcing their data management will continue, so the number of scenarios in which our technique is useful will increase too.

SATOA special feature of this technique (namely, being able to search for data in the encrypted form) will prove useful for data backup too. When data is encrypted and backed up by searchable encryption, the only necessary parts can be searched and restored.

So it's all about pursuing user convenience, right?

Photo: SATO Hisayoshi

SATOThat's right. I think that from the standpoint of Private Information Processing, we should try to increase the kinds of processing that can be performed on encrypted data. Some examples other than search are comparison and sorting.

If we go over the top, we might give too much information to an attacker; accordingly, we should continue to think up the most appropriate methods while maintaining a good balance.

NAGANUMAI want to get the results of our research born in that manner out quickly. In doing so, I hope our work will influence researchers around the world.

YOSHINOI'm interested in starting up new businesses based on our research results. Since speed is the key in the ICT world, first, I want to take up the challenge by starting with a small business and expanding that business bit by bit.

SATOIn our research team, we are researching elemental technologies rather than system technologies. Ciphers as such make up a comparatively small research field. For that reason, from now onwards I want to create technologies that will surprise the world. Even if small, I think our prominent technologies will surely lead to more business for Hitachi.

(Publication: January 11, 2013)

Notification

  • Publication: January 11, 2013
  • Professional affiliation and official position are at the time of publication.
  • This research results has been achieved by "Research and Development on Information-security Countermeasures for Large-scale Virtual Server Environments", the Commissioned Research of Ministry of Internal Affairs and Communications, Japan.
  • Page top

Related contents

  • Page top