Information contained in this news release is current as of the date of the press announcement, but may be subject to change without prior notice.

May 20, 2014

Interactive communication technology for the human symbiotic robot
“EMIEW2” to enhance natural communication with humans

Estimating the comprehension level of the other party through body movement such
as nodding and tilting of the head to the side

  Tokyo, May 20, 2014 - Hitachi, Ltd. (TSE: 6501 / "Hitachi") today announced the development of interactive communication technology for the human symbiotic robot EMIEW2. The technology selects the optimal answer and explanation based on the subject and attributes contained in a question, then estimates the other party's level of comprehension from body movements such as nodding or tilting the head to the side, in order to give a more natural response. This enables more flexible responses to questions and smoother communication between humans and robots.

  Hitachi has been developing human symbiotic robotics technology since it developed EMIEW in 2005. EMIEW2, first announced in 2007, offers motor functions such as two-wheeled autonomous locomotion at 6 km/h, about the pace of a person walking briskly, and collision prediction and avoidance. It also offers intelligent functions such as distinguishing speech from background noise using 14 microphones, identifying objects using information on the Internet, and guiding the enquirer to the object.

  In the evolution of human symbiotic robots, free communication between humans and robots is the most important capability, and much research and development has been conducted in this area. Free communication requires voice recognition, content comprehension, response construction and voice synthesis technologies. In recent years, technology that estimates the subject from speech and provides a corresponding response to the speaker has been implemented in devices such as mobile phones. In robots, however, independent technology development was necessary because conversation is conducted at a distance, with no hands-on operation by the speaker. This time, two technologies that advance the conversational function of robots were developed and mounted in EMIEW2. Details of the technologies are as follows.

 
(1) Select the optimal response from several words included in the enquiry
The words and word order needed to identify the subject and attributes are learnt from prepared questions and recorded in a database. Technology was developed that uses voice recognition to identify the word order in a spoken question and compares it with the database to recognize the question's subject and attributes. This makes it possible to select the optimal response for the subject and attributes of the question posed. Deep Learning*1, a machine learning method receiving much attention in the field of recognition, was used to achieve a high level of recognition.
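As a rough illustration of the matching idea described above, the following minimal sketch looks up a subject and attribute by comparing the words of a question, in order, against a small database. It is not Hitachi's implementation: the actual technology uses Deep Learning, whereas this sketch substitutes a simple ordered-keyword score. The names `Entry`, `DATABASE`, `ordered_matches` and `select_response`, the sample questions and the threshold of 0.6 are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Entry:
    """One hypothetical database record: keywords in their expected order."""
    keywords: Tuple[str, ...]
    subject: str
    attribute: str
    response: str


# Illustrative database; the entries are invented for this sketch.
DATABASE: Tuple[Entry, ...] = (
    Entry(("where", "restroom"), "restroom", "location",
          "The restroom is down the hall on the left."),
    Entry(("when", "cafeteria", "open"), "cafeteria", "opening hours",
          "The cafeteria opens at 11:30."),
    Entry(("where", "cafeteria"), "cafeteria", "location",
          "The cafeteria is on the second floor."),
)


def ordered_matches(words: List[str], keywords: Sequence[str]) -> int:
    """Count keywords found in `words` in the same relative order (greedy)."""
    pos = 0
    count = 0
    for kw in keywords:
        try:
            # Search only after the previous match so word order matters.
            pos = words.index(kw, pos) + 1
        except ValueError:
            continue
        count += 1
    return count


def select_response(utterance: str,
                    database: Sequence[Entry] = DATABASE,
                    threshold: float = 0.6) -> Optional[Entry]:
    """Return the best-matching entry, or None if no entry matches well enough."""
    words = re.findall(r"[a-z0-9']+", utterance.lower())

    def score(entry: Entry) -> float:
        return ordered_matches(words, entry.keywords) / len(entry.keywords)

    best = max(database, key=score)
    return best if score(best) >= threshold else None
```

Calling `select_response("Excuse me, where is the cafeteria?")` returns the entry whose subject is `cafeteria` and whose attribute is `location`; a question with no matching keywords returns `None`, which a caller could treat as a cue to ask the enquirer to rephrase.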
 
(2) Ascertain the enquirer's comprehension level from movement such as nodding and tilting of the head
Video images of EMIEW2 in dialogue with humans are analyzed in advance to study the body movements that accompany responses. In actual conversation, EMIEW2 captures the movement of the enquirer with an internal camera and identifies movements such as nodding or tilting of the head to the side. Technology was developed to determine the enquirer's level of comprehension by comparing the enquirer's actual reaction with the reaction expected for EMIEW2's reply. Understanding the enquirer's level of comprehension in relation to the nature of the reply makes even more human-like conversation possible.
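The comparison of actual and expected reactions can be sketched as follows. This is only an illustration under stated assumptions, not the method used in EMIEW2: it assumes a hypothetical upstream head-pose estimator that supplies per-frame pitch and roll angles in degrees, and it uses hand-picked thresholds rather than patterns learnt from pre-analyzed video. The function names `classify_gesture` and `estimate_comprehension` and the three comprehension labels are invented for this sketch.

```python
from typing import Sequence


def classify_gesture(pitch_deg: Sequence[float],
                     roll_deg: Sequence[float],
                     nod_threshold: float = 10.0,
                     tilt_threshold: float = 12.0) -> str:
    """Classify a short window of head-pose angles as "nod", "tilt" or "none".

    pitch_deg: per-frame head pitch (negative means looking down), assumed input.
    roll_deg:  per-frame head roll (sideways tilt), assumed input.
    """
    if not pitch_deg or not roll_deg:
        return "none"

    # Nod: the pitch dips well below its starting value, then comes back up.
    baseline = pitch_deg[0]
    dipped = False
    for p in pitch_deg:
        if p - baseline <= -nod_threshold:
            dipped = True
        elif dipped and abs(p - baseline) < nod_threshold / 2:
            return "nod"

    # Tilt: the head stays rolled to one side for most of the window.
    tilted_frames = sum(1 for r in roll_deg if abs(r) >= tilt_threshold)
    if tilted_frames > len(roll_deg) / 2:
        return "tilt"
    return "none"


def estimate_comprehension(expected: str, observed: str) -> str:
    """Compare the observed gesture with the one expected for the robot's reply."""
    if observed == expected:
        return "understood"
    if observed == "tilt":
        return "confused"   # a sideways head tilt is treated as a sign of doubt
    return "uncertain"      # no clear signal either way


# Example: the robot has just given an explanation and expects a nod.
pitch = [0, -4, -12, -14, -6, -1, 0]
roll = [0, 1, 0, -1, 0, 1, 0]
level = estimate_comprehension("nod", classify_gesture(pitch, roll))
```

In a dialogue loop, a result of `"confused"` could trigger a simpler or more detailed re-explanation, while `"understood"` lets the conversation move on.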

Employing these two technologies, EMIEW2 is able to respond with the optimal answer to freely posed questions by recognizing the subject and attributes, and further respond accordingly by watching body movement, facilitating smoother conversation.
  Hitachi will continue to promote developments towards improving the practicality of human symbiotic service robots supporting humans, including interactive communication technology.

  Details of the technology described in (1) above (selection of the optimal response from several words included in an enquiry) will be presented at a joint meeting of the special interest groups on Natural Language Processing (216th Research Meeting) and Spoken Language Processing (101st Research Meeting), both of the Information Processing Society of Japan, to be held at the Tokyo Institute of Technology on May 22 and 23, 2014.

Notes

*1
Deep Learning: A machine learning method that uses neural networks, models based on the mechanism of nerve cells. A neural network consists of three layers: an input layer, an intermediate layer and an output layer. In Deep Learning, the number of intermediate layers is increased to express more complex models than previously possible, achieving higher recognition rates in fields such as voice and image recognition.

About Hitachi, Ltd.

Hitachi, Ltd. (TSE: 6501), headquartered in Tokyo, Japan, delivers innovations that answer society's challenges with our talented team and proven experience in global markets. The company's consolidated revenues for fiscal 2013 (ended March 31, 2014) totaled 9,616 billion yen ($93.4 billion). Hitachi is focusing more than ever on the Social Innovation Business, which includes infrastructure systems, information & telecommunication systems, power systems, construction machinery, high functional materials & components, automotive systems, health care and others.

For more information on Hitachi, please visit the company's website at http://www.hitachi.com.
