Resolving Class Imbalance in Medical Classification: Technique Comparison and Performance Evaluation
Subject areas: Machine learning
Abdallah Maiti 1, Mohamed Hanini 2, Abdallah Abarda 3
1 - Laboratory of Computing, Networks, Mobility and Modelling (IR2M) FST, Hassan First University of Settat, Morocco
2 - Laboratory of Computing, Networks, Mobility and Modelling (IR2M) FST, Hassan First University of Settat, Morocco
3 - Laboratory LM2CE, Faculty of Economic Sciences and Management, Hassan First University of Settat, Morocco
Keywords: Data Imbalance, Techniques for Resolving Data Class Imbalance, Oversampling, Cost-Sensitive Learning, Convolutional Neural Networks, Classification, Model Performance, Medical Diagnostics
Abstract:
Class imbalance is common in medical diagnostic data. It can reduce the accuracy of classification models and undermine the validity of their results. This paper compares several techniques for correcting class imbalance in medical datasets and evaluates their impact on machine learning performance.
We trained a convolutional neural network (CNN) model on an imbalanced dataset, then tested correction techniques such as sampling and cost-sensitive learning. Finally, we evaluated the model's performance using recall, precision, accuracy and F1 score.
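As an illustration of why accuracy alone is not enough on imbalanced data, the four metrics named above can be computed by hand as in the minimal sketch below. The toy labels, the function names `per_class_metrics` and `accuracy`, and the "always predict the majority class" model are hypothetical and are not taken from the paper's experiments.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F1) with `positive` treated as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 18 majority-class samples (0) and 2 minority-class samples (1).
y_true = [0] * 18 + [1] * 2
# A degenerate model that always predicts the majority class.
y_pred = [0] * 20

acc = accuracy(y_true, y_pred)
minority = per_class_metrics(y_true, y_pred, positive=1)
print("accuracy:", acc)
print("minority precision, recall, F1:", minority)
```

In this toy case the accuracy is high even though no minority sample is detected, which is why recall, precision and F1 on the minority class are reported alongside accuracy.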
The results show that the correction techniques significantly improved the performance of the classification model. Cost-sensitive learning gave the best results, particularly for the detection of minority classes. By increasing the weight of classification errors on minority classes, this method improved the detection of critical cases. These results underline the importance of addressing data imbalance to improve the performance of classification models in the medical field. Methods such as cost-sensitive learning not only improve model performance but also support more reliable decisions, which is essential for more accurate diagnoses and better quality of care.
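The idea of weighting minority-class errors more heavily can be sketched as a class-weighted cross-entropy loss. The abstract does not state which weighting scheme was used, so the inverse-frequency weights below are an assumption (a common heuristic), and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed heuristic: weight_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); errors on heavier classes cost more."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical toy data: 18 majority-class samples (0), 2 minority-class samples (1).
labels = [0] * 18 + [1] * 2
# A model that assigns probability 0.9 to class 0 for every sample.
probs = [[0.9, 0.1]] * 20

weights = inverse_frequency_weights(labels)
uniform = {0: 1.0, 1: 1.0}

plain_loss = weighted_cross_entropy(probs, labels, uniform)
cost_sensitive_loss = weighted_cross_entropy(probs, labels, weights)
print("class weights:", weights)
print("unweighted loss:", plain_loss)
print("cost-sensitive loss:", cost_sensitive_loss)
```

With these weights the same majority-biased predictions incur a larger loss, so a model trained on the weighted objective is pushed to reduce its errors on the minority class.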