
Hitachi


Predicting Incidence Rate of Diabetes Mellitus from Health Checkup Data

— Presentation at IJCAI 2016 - Workshop on Knowledge Discovery in Healthcare Data —

August 4, 2016

Report from Presenter

In developed countries, national medical costs are rising and are becoming increasingly unsustainable relative to GDP. Health insurance organizations in Japan provide health guidance services to insured persons to help prevent lifestyle-related diseases. To improve the cost-effectiveness of these services, it is very helpful to select the people for whom health guidance is likely to be most effective. This requires a quantitative prediction of the possible reduction in medical costs, based on people's past and current health conditions.


Fig. 1

The International Joint Conference on Artificial Intelligence (IJCAI) is one of the best-known international conferences in the field of artificial intelligence research, and has been held biennially in odd-numbered years since 1969. IJCAI 2016 was held in New York, USA, from July 7th to 15th. Because of a record number of attendees, IJCAI will be held annually from this year on. In addition, a workshop on knowledge discovery in healthcare data was held for the first time.

At IJCAI 2016 - Workshop on Knowledge Discovery in Healthcare Data, we presented our research on a prediction model for the incidence rate of diabetes mellitus. The title was "Predicting Incidence Rate of Diabetes Mellitus from Health Checkup Data".

The lack of a proper way to evaluate our prediction technique had bothered us for a long time. The difficulty is that the prediction target is not exactly the same thing as what we can observe in real data: the true incidence rate is a hidden variable. To solve this evaluation problem, we proposed a sounder, more validation-oriented evaluation method, inspired by the way a physical theory is verified. We suppose that our predicted incidence probabilities are perfectly correct, and regard the real data as samples obtained from observational experiments. Then, after sorting and grouping the records, we can quantitatively judge the quality of the predictions by checking whether the observed incidence rate in each group is consistent with the predicted probability. Using the proposed evaluation method, we presented a numerical experiment showing that our Bayesian network prediction model achieves a 48.2% improvement in prediction accuracy over a conventional regression model. Expanding the scope to all lifestyle-related diseases is future work.
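The sort-and-group idea described above can be illustrated with a minimal Python sketch. This is not the implementation from the presented paper. The function names, the choice to sort by predicted probability, the equal-size groups, and the size-weighted absolute gap used as a summary score are all assumptions made for illustration.

```python
from statistics import mean


def grouped_calibration(predicted, observed, n_groups=10):
    """Compare predicted probabilities with observed incidence, group by group.

    predicted: predicted incidence probability per record (floats in [0, 1]).
    observed:  1 if the disease actually occurred for that record, else 0.
    Returns one (mean_predicted, observed_rate, group_size) tuple per group.
    Equal-size grouping is an assumption of this sketch.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not 1 <= n_groups <= len(predicted):
        raise ValueError("n_groups must be between 1 and the number of records")

    # Sort records by predicted probability so that each group contains
    # records with similar predictions.
    records = sorted(zip(predicted, observed), key=lambda r: r[0])
    n = len(records)

    groups = []
    for g in range(n_groups):
        # Integer boundaries give groups whose sizes differ by at most one.
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        chunk = records[start:end]
        mean_pred = mean(p for p, _ in chunk)
        obs_rate = mean(y for _, y in chunk)
        groups.append((mean_pred, obs_rate, len(chunk)))
    return groups


def mean_absolute_gap(groups):
    """Size-weighted average of |mean predicted - observed rate| (hypothetical score)."""
    total = sum(size for _, _, size in groups)
    return sum(abs(p - o) * size for p, o, size in groups) / total
```

If the predictions were perfectly correct and the groups were large, the observed rate in each group would be close to the mean predicted probability, so a smaller gap would suggest a better-calibrated model. Two models (for example a Bayesian network and a regression model) could be compared by running both sets of predictions through `grouped_calibration` on the same records.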
