TY  - JOUR
AU  - Wisaeng, Kittipol
AU  - Sriboonlue, Pankom
AU  - Muangmeesri, Benchalak
PY  - 2026
TI  - Improving Diabetes Risk Prediction Using Ensemble Boosting and SMOTE-Based Class Balancing
JF  - Journal of Computer Science
VL  - 22
IS  - 1
DO  - 10.3844/jcssp.2026.61.74
UR  - https://thescipub.com/abstract/jcssp.2026.61.74
AB  - Accurate diabetes prediction is vital for early intervention, optimized resource allocation, and minimizing long-term complications. This study presents a comparative evaluation of traditional and advanced machine learning models for diabetes classification using a structured clinical dataset. Seven baseline algorithms were assessed against five advanced ensemble methods: CatBoost, LightGBM, XGBoost, Voting Ensemble, and Stacking Ensemble. To improve algorithm learning, the Synthetic Minority Over-sampling Technique (SMOTE) and feature normalization were employed. The algorithm’s effectiveness was carefully evaluated using accuracy, precision, recall, and the F1 score. Results show that advanced models substantially outperformed traditional ones, with CatBoost achieving the highest F1 score of 0.7625. Feature importance analysis identified glucose, BMI, and age as the most influential indicators, consistent with clinical evidence. These findings demonstrate the potential of ensemble learning and boosting strategies for building interpretable, scalable, and effective diagnostic support tools in healthcare settings.