Information contained in this news release is current as of the date of the press announcement, but may be subject to change without prior notice.
July 22, 2015
Analyzes huge volumes of text data on issues under debate,
and presents reasons and grounds for stances
Tokyo, July 22, 2015 --- Hitachi, Ltd. (TSE: 6501, “Hitachi”) today announced that it has developed a technology that analyzes huge volumes of text data on issues that are subject to debate, and presents, in English, reasons and grounds for either affirmative or negative opinions on those issues. This technology focuses on values such as health, economics and public safety, which are considered important to people and communities when expressing opinions, and uses correlations between those values and relevant issues in society to identify reasons and grounds with a high degree of reliability from among large volumes of news articles. By using multiple viewpoints, it is able to present reasons and grounds without bias toward a single perspective.
This is a basic technology that will contribute to artificial intelligence enabling logical dialogue between humans and computers. The technology could be applied to future systems to analyze contents of company documents, published reports or electronic medical records, in order to form opinions and generate data to support decision making.
In recent years, with the evolution of analysis technologies and information & telecommunication technologies such as the Internet, attention has turned to technologies that analyze “Big Data” - which is generated every day by various sensors and POS systems - and identify valuable information. At the same time, there has been an increasing demand for effective use of data such as company documents, published reports and electronic medical records to add value and support management decisions. However, development has been difficult because of the technological challenges involved in extracting correlations between issues and the values mentioned above from huge volumes of text data.
In 2014, Hitachi developed a technology that extracts specified information from electronic medical records (e.g., illnesses and affected areas) with a high degree of accuracy*1. Using this technology, Hitachi has now developed a new technology for analyzing large volumes of news articles about a given topic, and presenting reasons and grounds for opinions, in English, which are highly reliable.
Details of the technology developed are as given below.
When giving reasons or grounds for opinions on a question that is subject to debate, it is assumed that people use their own respective viewpoints. Hitachi focused on values such as health, economics and public safety, which are considered important to people and communities, and created a “Value Dictionary” that systematically organizes those values based on a database*2 that records affirmative and negative opinions regarding a large number of discussion topics. Specifically, Hitachi compiled a list of values that serve as a basis for decision making by people or communities, and the system extracts words demonstrating a strong relationship to those values based on their frequency of use in the database, designating each word as either “positive” or “negative” in relation to the value. Furthermore, the values and relevant words were systematically arranged by assigning each a score for “importance” based on frequency of use. For example, in the case of the value “Health,” the relations with words such as “exercise,” which is positive, and “disease” and “obesity,” which are negative, were systematically arranged.
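The structure of such a dictionary can be illustrated with a minimal Python sketch. It is not Hitachi's implementation: the seed list `SEEDS`, the `Entry` type and the function `build_value_dictionary` are hypothetical names, and importance is assumed here to be a word's share of all seed-word occurrences, which is only one simple reading of "based on the frequency of use."

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    value: str         # e.g. "Health"
    polarity: str      # "positive" or "negative" with respect to the value
    importance: float  # assumed: share of all seed-word occurrences


# Hypothetical seed list, taken from the "Health" example in the text.
SEEDS = {
    "exercise": ("Health", "positive"),
    "disease": ("Health", "negative"),
    "obesity": ("Health", "negative"),
}


def build_value_dictionary(documents, seeds=SEEDS):
    """Count seed words in the documents and score them by relative frequency."""
    counts = Counter()
    for doc in documents:
        for token in doc.lower().split():
            token = token.strip(".,;:!?\"'")
            if token in seeds:
                counts[token] += 1
    total = sum(counts.values())
    return {
        word: Entry(seeds[word][0], seeds[word][1], counts[word] / total)
        for word in counts
    }
```

Called on two short sentences such as "Exercise prevents disease." and "Obesity raises the risk of disease.", the sketch gives "disease" the highest importance because it occurs most often.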
The system identifies the types of values involved in the issues recorded in the various sentences of large volumes of news articles, and creates a database expressing whether those issues have positive or negative effects on those values. For example, from an article stating that “Noise is harmful to health,” it is determined that the issue of “noise” has the negative effect of suppressing the value “Health,” and this information is stored in the database. Using this method, the system created approximately 250 million items of metadata (issue - value correlation data) from around 9.7 million news articles.
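As a rough illustration of what one issue - value record might look like, the following Python sketch extracts a record from the "Noise is harmful to health" example. The release does not describe the actual extraction method; the single regular-expression pattern, the cue list `CUES` and the `Correlation` type are assumptions made purely for this example.

```python
import re
from typing import NamedTuple, Optional


class Correlation(NamedTuple):
    issue: str
    value: str
    effect: str  # "promotes" or "suppresses"


# Hypothetical cue words mapped to the effect they signal.
CUES = {
    "harmful": "suppresses",
    "bad": "suppresses",
    "good": "promotes",
    "beneficial": "promotes",
}

# Toy pattern: "<issue> is <cue> to/for <value>."
PATTERN = re.compile(
    r"^(?P<issue>[A-Za-z ]+?) is (?P<cue>" + "|".join(CUES) + r") "
    r"(?:to|for) (?P<value>[A-Za-z ]+?)\.?$"
)


def extract_correlation(sentence: str) -> Optional[Correlation]:
    """Return an issue - value record, or None if the toy pattern does not match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Correlation(
        issue=match["issue"].lower(),
        value=match["value"].capitalize(),
        effect=CUES[match["cue"]],
    )
```

A production system would need far broader linguistic coverage than one pattern; the point here is only the shape of the stored record.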
The system uses this huge volume of metadata as well as the Value Dictionary outlined in (1) above to select multiple values with strong correlations with a given topic from among the many news articles. By searching for sentences in all of the news articles that contain one of these values, the system extracts sentences that could potentially serve as reasons or grounds for agreement or disagreement with the topic in question.
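The two steps in this paragraph, selecting values and then retrieving candidate sentences, could be sketched in Python as follows. This is an assumption-laden illustration: correlation strength is approximated by a simple count of metadata records, and the topic "casino", the helper names and the `value_words` mapping are all hypothetical.

```python
from collections import Counter


def select_values(topic, metadata, k=2):
    """Pick the k values that co-occur most often with the topic in the metadata.

    metadata is an iterable of (issue, value, effect) tuples.
    """
    counts = Counter(value for issue, value, _ in metadata if issue == topic)
    return [value for value, _ in counts.most_common(k)]


def candidate_sentences(sentences, values, value_words):
    """Keep sentences containing at least one dictionary word for a selected value."""
    keywords = {w for v in values for w in value_words.get(v, ())}
    hits = []
    for sentence in sentences:
        tokens = {t.strip(".,;:!?").lower() for t in sentence.split()}
        if tokens & keywords:
            hits.append(sentence)
    return hits
```

Selecting several values rather than only the top one is what lets the retrieved sentences reflect more than a single perspective.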
The sentences extracted using the Value Dictionary (1) and the Metadata (2) are scored based on the source of the quote, the numerical evidence and the rhetorical expressions in order to estimate whether the sentences have a strong correlation with the specified topic and value. By processing all of the sentences that could potentially serve as reasons or grounds for opinions, and evaluating scores, it is possible to select and present reliable grounds.
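A minimal Python sketch of scoring on these three signals is shown below. The cue lists, the equal weights and the function names are assumptions chosen for illustration; the release does not say how each signal is detected or weighted.

```python
import re

# Hypothetical cue phrases; a real system would use much richer features.
SOURCE_CUES = ("according to", "said", "reported")
RHETORIC_CUES = ("therefore", "because", "as a result")


def score_sentence(sentence, weights=(1.0, 1.0, 1.0)):
    """Add one weight for each signal present: source, number, rhetoric."""
    text = sentence.lower()
    has_source = any(cue in text for cue in SOURCE_CUES)
    has_number = re.search(r"\d", text) is not None
    has_rhetoric = any(cue in text for cue in RHETORIC_CUES)
    w_source, w_number, w_rhetoric = weights
    return w_source * has_source + w_number * has_number + w_rhetoric * has_rhetoric


def top_grounds(sentences, n=1):
    """Return the n highest-scoring candidate sentences."""
    return sorted(sentences, key=score_sentence, reverse=True)[:n]
```

Under these assumptions, a sentence that cites a source, includes a figure and uses a causal connective outranks a bare assertion.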
In order to increase processing speed and present responses within a designated time period, Hitachi constructed an architecture to realize asynchronous distributed processing of multiple algorithms in the various processes, from the analysis of the main topic to the selection of values, the article search and the presentation of reasons and grounds for opinions. This architecture executes the algorithms in parallel across distributed resources while passing results asynchronously to the next process, in order to extract the desired grounds within the specified period of time.
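The idea of running searches in parallel and answering within a deadline can be sketched with Python's standard `asyncio` module. This is only an analogy under stated assumptions: single-process coroutines stand in for the distributed workers, and `analyze_topic`, `search_articles` and `find_grounds` are hypothetical placeholders rather than parts of Hitachi's architecture.

```python
import asyncio


async def analyze_topic(topic):
    """Placeholder for topic analysis; here it only normalizes the text."""
    await asyncio.sleep(0)
    return topic.lower()


async def search_articles(value, topic):
    """Placeholder for one per-value article search."""
    await asyncio.sleep(0)
    return f"{value}: sentence about {topic}"


async def find_grounds(topic, values, timeout=1.0):
    """Run one search per value concurrently and return what finishes in time."""
    key = await analyze_topic(topic)
    tasks = [asyncio.create_task(search_articles(v, key)) for v in values]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()  # drop searches that missed the deadline
    return sorted(task.result() for task in done)
```

For example, `asyncio.run(find_grounds("Casino", ["Economics", "Health"]))` returns the results of both searches, while a search that exceeded `timeout` would simply be left out of the answer.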
Process of forming reasons and grounds
These technologies were developed with the cooperation of the Inui-Okazaki Laboratory, Graduate School of Information Sciences, Tohoku University (President: Susumu Satomi). By combining these four technologies, Hitachi has developed a technology that analyzes huge volumes of text data, and presents reasons and grounds for either affirmative or negative opinions on given topics. In the future, Hitachi will continue research and development aimed at achieving artificial intelligence that will enable logical dialogue between humans and computers.
The results outlined above are scheduled to be presented at ACL-IJCNLP 2015 (53rd Annual Meeting of the Association for Computational Linguistics and 7th International Joint Conference on Natural Language Processing), an international conference to be held in China during July 26-31, 2015.
Hitachi, Ltd. (TSE: 6501), headquartered in Tokyo, Japan, delivers innovations that answer society's challenges with our talented team and proven experience in global markets. The company's consolidated revenues for fiscal 2014 (ended March 31, 2015) totaled 9,761 billion yen ($81.3 billion). Hitachi is focusing more than ever on the Social Innovation Business, which includes power & infrastructure systems, information & telecommunication systems, construction machinery, high functional materials & components, automotive systems, healthcare and others. For more information on Hitachi, please visit the company's website at http://www.hitachi.com.