AIT Asian Institute of Technology

Ensemble learning approach for diabetes diagnostic support system

Author: Jain, Rishi
Call Number: AIT Diss. no. IM-24-01
Subject(s): Diabetes--Forecasting; Medical applications; Generative adversarial networks (Computer networks); Deep learning (Machine learning)

Note: A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Engineering in Information Management
Publisher: Asian Institute of Technology
Abstract: Recent advancements in machine learning and deep learning have demonstrated significant potential in the prediction and diagnosis of critical diseases, particularly Diabetes Mellitus. This condition, characterized by a range of microvascular and macrovascular complications, requires continuous management, making early detection essential for effective prevention. This study aims to enhance diabetes prediction by developing a user-friendly application that not only predicts Diabetes Mellitus but also provides a corresponding probability score. The research also identifies the prominent factors to consider when dealing with diabetes.
The research employed a methodology based on the UCI dataset, comprising data collection, preprocessing, model selection, hyperparameter tuning, blending, and, ultimately, the development of an application. A soft voting classifier was implemented, combining the strengths of Extra Trees (ET), Random Forest (RF), and Multilayer Perceptron (MLP) classifiers. This ensemble approach yielded high accuracy and an Area Under the Curve (AUC) better than the benchmark results on the UCI dataset. To validate the model's efficacy, it was applied to both a real-world dataset and the CDC diabetes health indicator dataset, demonstrating superior performance in predicting diabetic, pre-diabetic, and non-diabetic cases.
Additionally, the research explored the influence of demographic factors such as gender and age on diabetes, offering a deeper understanding of how these variables affect the onset and progression of the disease. By analyzing a real-world dataset provided by a renowned diabetologist and Dr. Reddy's Lab, the study developed an ensemble learning model that categorizes individuals into three groups: diabetic, non-diabetic, or pre-diabetic.
The study also included a brief review of Generative Adversarial Networks (GANs) in healthcare, recognizing their potential to generate realistic synthetic data that can enhance the robustness of predictive models.
In conclusion, this study advances the field of diabetes prediction through ensemble learning techniques and the development of a practical application. The study concludes that gender, BMI, mental health, and physical health are significant factors to consider when dealing with diabetes. In addition, HbA1c value, total cholesterol, and triglycerides are the prominent clinical factors. The analysis of how diabetes affects different genders and age groups would help medical practitioners provide tailored diagnoses to patients. The ensemble model significantly enhanced diabetes prediction, not only by categorizing patients as diabetic, non-diabetic, or pre-diabetic but also by providing a likelihood score, which can help medical practitioners and patients understand the patient's diabetes status.
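The abstract describes a soft voting classifier that combines ET, RF, and MLP and reports a likelihood score alongside the predicted class. As a minimal sketch of what soft voting means, and not the dissertation's actual implementation, the snippet below averages per-model class probabilities and picks the class with the highest mean. The function name `soft_vote`, the class ordering, the optional weights, and the example probabilities are all assumptions made up for illustration; none of them come from the dissertation.

```python
from typing import List, Optional, Tuple

# Class labels follow the three groups named in the abstract.
# The ordering is an assumption chosen for this illustration.
CLASSES = ["non-diabetic", "pre-diabetic", "diabetic"]


def soft_vote(
    model_probs: List[List[float]],
    weights: Optional[List[float]] = None,
) -> Tuple[int, List[float]]:
    """Average class probabilities across models (soft voting).

    model_probs holds one probability vector per model, each with one
    entry per class. Returns the index of the winning class and the
    averaged probability vector, which serves as the likelihood score.
    """
    if not model_probs:
        raise ValueError("at least one model is required")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same classes")
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weight per model
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")

    total = sum(weights)
    averaged = [
        sum(w * p[k] for w, p in zip(weights, model_probs)) / total
        for k in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


if __name__ == "__main__":
    # Hypothetical probabilities for one patient from three models
    # standing in for ET, RF, and MLP (invented numbers).
    et = [0.20, 0.30, 0.50]
    rf = [0.10, 0.50, 0.40]
    mlp = [0.25, 0.15, 0.60]
    idx, avg = soft_vote([et, rf, mlp])
    print(f"{CLASSES[idx]} (score {avg[idx]:.2f})")
```

In this invented example the RF stand-in favors pre-diabetic on its own, but the averaged probabilities favor diabetic, which is the behavior that distinguishes soft voting from majority (hard) voting. In scikit-learn, the same idea is available through `VotingClassifier` with `voting="soft"`; whether the dissertation used that class is not stated in the abstract.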
Year: 2024
Type: Dissertation
School: School of Engineering and Technology
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Information Management (IM)
Chairperson(s): Tripathi, Nitin Kumar; Pant, Millie (Co-chairperson)
Examination Committee(s): Chutiporn Anutariya; Chaklam Silpasuwanchai
Scholarship Donor(s): Indian Institute of Technology Roorkee (IITR), India; AIT Scholarship
Degree: Thesis (Ph.D.) - Asian Institute of Technology, 2024

