Ensemble learning approach for diabetes diagnostic support system | |
| Author | Jain, Rishi |
| Call Number | AIT Diss. no.IM-24-01 |
| Subject(s) | Diabetes--Forecasting; Medical applications; Generative adversarial networks (Computer networks); Deep learning (Machine learning) |
| Note | A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Engineering in Information Management |
| Publisher | Asian Institute of Technology |
| Abstract | Recent advances in machine learning and deep learning have shown significant potential for predicting and diagnosing critical diseases, particularly Diabetes Mellitus. This condition, characterized by a range of microvascular and macrovascular complications, requires continuous management, making early detection essential for effective prevention. This study aims to enhance diabetes prediction by developing a user-friendly application that not only predicts Diabetes Mellitus but also provides a corresponding probability score. The research also identifies the prominent factors to consider when dealing with diabetes. The research employed a methodology based on the UCI dataset, comprising data collection, preprocessing, model selection, hyperparameter tuning, blending, and, ultimately, the development of an application. A soft voting classifier was implemented, combining the strengths of Extra Trees (ET), Random Forest (RF), and Multilayer Perceptron (MLP) classifiers. This ensemble approach resulted in high accuracy and an Area Under the Curve (AUC) better than the benchmark results on the UCI dataset. To validate the model's efficacy, it was applied to both a real-world dataset and the CDC diabetes health indicator dataset, demonstrating superior performance in predicting diabetic, pre-diabetic, and non-diabetic cases. Additionally, the research explored the influence of demographic factors such as gender and age on diabetes, offering a deeper understanding of how these variables affect the onset and progression of the disease. By analyzing a real-world dataset provided by a renowned diabetologist and Dr. Reddy's Lab, the study developed an ensemble learning model that categorizes individuals into three groups: diabetic, non-diabetic, or pre-diabetic.
The study also included a brief review of Generative Adversarial Networks (GANs) in healthcare, recognizing their potential to generate realistic synthetic data that can enhance the robustness of predictive models. In conclusion, this study advances the field of diabetes prediction through ensemble learning techniques and the development of a practical application. The study concludes that gender, BMI, mental health, and physical health are significant factors to consider when dealing with diabetes. In addition, HbA1c value, total cholesterol, and triglycerides are the prominent clinical factors. The analysis of the impact of diabetes across genders and age groups would help medical practitioners provide tailored diagnoses to patients. The ensemble model developed significantly enhanced diabetes prediction, not only by categorizing patients as diabetic, non-diabetic, or pre-diabetic but also by providing a likelihood score that can help medical practitioners and patients understand their diabetes status. |
| Year | 2024 |
| Type | Dissertation |
| School | School of Engineering and Technology |
| Department | Department of Information and Communications Technologies (DICT) |
| Academic Program/FoS | Information Management (IM) |
| Chairperson(s) | Tripathi, Nitin Kumar; Pant, Millie (Co-chairperson) |
| Examination Committee(s) | Chutiporn Anutariya; Chaklam Silpasuwanchai |
| Scholarship Donor(s) | Indian Institute of Technology Roorkee (IITR), India; AIT Scholarship |
| Degree | Thesis (Ph.D.) - Asian Institute of Technology, 2024 |
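
The abstract describes a soft voting classifier that combines Extra Trees, Random Forest, and Multilayer Perceptron models and reports a likelihood score alongside the predicted class. A minimal sketch of the soft voting step alone is given below. It is not the dissertation's implementation: the function name `soft_vote`, the equal default weights, the label order, and the probability values are all illustrative assumptions, and no model training is shown.

```python
from typing import Dict, List, Optional, Tuple

# Assumed label order for the three groups named in the abstract.
LABELS = ["non-diabetic", "pre-diabetic", "diabetic"]


def soft_vote(
    model_probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> Tuple[str, float, List[float]]:
    """Average per-class probabilities across models and pick the top class.

    model_probs maps a model name to its class-probability vector,
    ordered as in LABELS. weights is optional; equal weights are assumed
    when it is omitted (hypothetical choice, not taken from the source).
    """
    names = list(model_probs)
    if not names:
        raise ValueError("at least one model is required")
    for name in names:
        if len(model_probs[name]) != len(LABELS):
            raise ValueError(f"{name} must give {len(LABELS)} probabilities")
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)

    # Weighted mean of the probability assigned to each class.
    averaged = [
        sum(weights[name] * model_probs[name][k] for name in names) / total
        for k in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=lambda k: averaged[k])
    # The averaged probability of the winning class serves as the score.
    return LABELS[best], averaged[best], averaged


# Made-up probabilities for one patient, purely for illustration.
example = {
    "extra_trees": [0.2, 0.3, 0.5],
    "random_forest": [0.1, 0.3, 0.6],
    "mlp": [0.3, 0.4, 0.3],
}
label, score, averaged = soft_vote(example)
print(label, round(score, 3))
```

In this sketch the MLP alone would favor "pre-diabetic", but averaging the three probability vectors lets the two tree-based models outweigh it. This is how soft voting differs from taking a single model's output or a simple majority of hard labels.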