March 9, 2016

Hitachi Develops Technology to Anonymize
Encrypted Personal Data

Responding to market needs for personal data anonymization

Tokyo, March 9, 2016 - Hitachi, Ltd. (TSE: 6501) announced the development of technology to securely anonymize encrypted personal data.*1 Anonymization converts personal information, that is, information related to individuals, into a form from which the individual cannot be identified. The newly developed technology, which performs anonymization in a more secure manner, will be applied to meet the expected increase in market demand for anonymized personal data following the revision of Japan's Protection of Personal Information Act in September 2015.

In recent years, the amount and variety of data generated and collected has continued to grow with the increased use of mobile phones and sensor equipment, and big data analytics is being applied in many fields to derive value from this data. With the implementation of the revised Protection of Personal Information Act, not only data collected from equipment but also anonymized information, that is, personal data processed so that individuals cannot be distinguished, will be legally available for third-party use in the future. As a result, the use of anonymized personal data is expected to increase significantly, for example in high-accuracy market research on people's movements and purchasing transactions.

The use of cloud computing is becoming increasingly popular in such big data analytics, as it allows computing power to be scaled flexibly. Handling sensitive data such as personal information, however, requires even greater security. Technology that encrypts data on the cloud in a form that third parties cannot easily decrypt, yet still allows search and analytics, is being developed for practical application. Meanwhile, k-anonymization*2 is a well-known technique for anonymizing personal data, but with conventional technology encrypted data cannot be anonymized directly: it has to be decrypted first, which raises security issues.

To enhance security in the anonymization of personal data, Hitachi has developed technology to encrypt personal data and enable k-anonymization of the encrypted data on the cloud. Features of the technology developed are as follows:

1. Secure generalization of encrypted data

Many k-anonymization technologies use a tree structure*3 to generalize similar data of different values, grouping data from smaller groups into a larger group in a hierarchy to anonymize the data. For example, data from the smaller regional subsets Kanto (10 items) and Tohoku (20 items) can be anonymized by generalizing it into a larger regional subset, East Japan (30 items). With conventional technology, however, this tree structure could not be built from encrypted data because the information on the smaller subsets could not be read.
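The regional example above can be sketched in a few lines of Python on unencrypted data. This is only an illustration of tree-based generalization in general, not Hitachi's method: the `PARENT` map, the `generalize` function and the threshold `k=15` are assumptions introduced for the example.

```python
from collections import Counter

# Hypothetical one-level hierarchy: smaller region -> larger region.
PARENT = {
    "Kanto": "East Japan",
    "Tohoku": "East Japan",
    "Kansai": "West Japan",
    "Kyushu": "West Japan",
}

def generalize(values, k, parent):
    """Lift sibling values to their parent when any of them has fewer than k records."""
    counts = Counter(values)
    # Parents that must absorb their children because one child group is too small.
    lift = {parent[v] for v, n in counts.items() if n < k and v in parent}
    return [parent[v] if parent.get(v) in lift else v for v in values]

records = ["Kanto"] * 10 + ["Tohoku"] * 20
result = generalize(records, k=15, parent=PARENT)
print(Counter(result))  # Counter({'East Japan': 30})
```

With `k=15`, the Kanto group (10 items) is too small, so both Kanto and Tohoku are replaced by East Japan, giving one group of 30 items.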
Hitachi applied its own technology for comparing encrypted data to determine whether given values are the same, and developed technology that tallies the number of items sharing the same value and uses the aggregated counts to create a tree structure. The tree structure also minimizes the information lost through generalization by placing smaller groups of encrypted data, whose values are less similar, lower in the hierarchy and larger groups, whose values are more similar, higher in the hierarchy.
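The idea of counting equal values without reading them can be illustrated with a simple stand-in. The sketch below uses deterministic HMAC-SHA256 tags from the Python standard library in place of Hitachi's searchable encryption, whose details are not given here; the key, the `tag` and `count_by_tag` names and the sample data are all assumptions. Unlike a real encryption scheme, the tags cannot be decrypted and they reveal value frequencies, so this shows only the equality-and-count step.

```python
import hashlib
import hmac
from collections import Counter

def tag(key: bytes, value: str) -> str:
    """Deterministic keyed tag: equal values yield equal tags under the same key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def count_by_tag(tags):
    """Run by a party without the key: count records per opaque tag."""
    return Counter(tags)

key = b"provider-secret-key"  # hypothetical key held only by the data provider
values = ["Kanto"] * 10 + ["Tohoku"] * 20

tags = [tag(key, v) for v in values]   # what the cloud side would see
counts = count_by_tag(tags)            # group sizes, without the plaintext
print(sorted(counts.values()))  # [10, 20]
```

The counting side learns that there are two groups of 10 and 20 items, which is the kind of aggregate a tree could be built from, but not which region each tag stands for.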

2. High speed processing and high data security

In general, processing encrypted data is significantly slower than processing non-encrypted data. Using Hitachi's searchable encryption technology, comparisons between encrypted data can be performed at high speed, and the amount of data processing required in the encrypted state is minimized. As a result, the increase in processing overhead can be kept within 30%*4, ensuring practical processing speeds.
Further, to ensure even higher security, different encryption keys are used to encrypt the data and to anonymize the encrypted data. As a result, security is maintained even if the encrypted data is accidentally leaked before anonymization, as only the data provider holds the decryption key.

Hitachi aims to commercialize this technology in FY2018 to cater to the increased use of personal data.

This achievement will be presented at the Technical Committee on Information Security, to be held at the University of Electro-Communications, Tokyo, Japan, on March 10-11, 2016.

Overview of Technology Developed

[Figure: Overview of Technology Developed]

*1 Personal data: Information related to an individual, not limited to information that can identify the individual; for example, location or purchasing transactions. The definition of personal data in this news release is wider than “information identifying living individuals” as defined in Paragraph 1 of Article 2 of the Protection of Personal Information Act of Japan.
*2 K-anonymization technology: A data processing method that makes it difficult to identify a specific individual from the data. It reduces the probability of identifying an individual to 1/k or less by converting the data so that at least k records (rows in a database) share the same attribute values.
*3 Tree structure: A data structure that branches out as a hierarchy, with “children” data branching out from “parent” data and the tree becoming deeper as multiple “grandchildren” data branch out from “children” data.
*4 Result of measuring the process, performed by the anonymization technology provider (as shown in the figure above), from creating an encrypted tree structure to creating k-anonymized encrypted data, on 30,000 records with 9 attributes and k set to 2.
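
The k-anonymity condition described in the notes above can be checked mechanically. The following minimal Python sketch assumes records are dictionaries; the function name `is_k_anonymous`, the attribute names and the sample rows are hypothetical and serve only to illustrate the definition.

```python
from collections import Counter

def is_k_anonymous(rows, attributes, k):
    """Return True if every combination of the given attribute values occurs in at least k rows."""
    groups = Counter(tuple(row[a] for a in attributes) for row in rows)
    return all(n >= k for n in groups.values())

rows = [
    {"region": "East Japan", "age": "30s"},
    {"region": "East Japan", "age": "30s"},
    {"region": "West Japan", "age": "40s"},  # a group of one record
]

print(is_k_anonymous(rows, ["region", "age"], k=2))  # False
```

The check fails for k = 2 because the West Japan record is unique; generalizing or removing it would be needed before each individual is hidden among at least two records.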

