Multi-Class Fault Detection under Class-Imbalance in Wireless Sensor Network Using Random Undersampling and Extra Trees
DOI:
https://doi.org/10.35194/mji.v17i2.5959
Keywords:
Class imbalance, Extra-Tree, Fault detection, Multi-class classification, Random Undersampling
Abstract
Wireless Sensor Networks (WSNs) are widely used in monitoring applications, including environmental observation, smart infrastructure, and Internet of Things (IoT) systems. Despite their widespread adoption, WSNs are highly susceptible to data errors caused by sensor degradation, hardware malfunctions, environmental disturbances, and communication issues. If not properly handled, these faults can significantly reduce data reliability and lead to incorrect system decisions. This study proposes a multi-class data-fault detection approach for WSNs under imbalanced data conditions by integrating Random Undersampling (RUS) with the Extra-Trees classification algorithm. The proposed framework addresses the class imbalance commonly found in sensor fault datasets while improving detection performance across multiple fault types. Experiments were conducted on a WSN dataset containing temperature and humidity measurements, in which three fault types (bias, drift, and spike) were analyzed alongside normal sensor data. The results show that Random Undersampling substantially improves classification performance. Without RUS, the Extra-Trees classifier achieved an accuracy of 48% and failed to detect spike faults. After applying RUS, classification accuracy increased to 91%, with balanced precision, recall, and F1-score values across all classes. These findings indicate that combining Random Undersampling with Extra-Trees provides an effective and reliable solution for multi-class data-fault detection in WSN environments.
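The resampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_undersample`, the class counts, and the toy temperature and humidity readings are all hypothetical, and it assumes that RUS here means reducing every class to the size of the rarest one.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class keeps as many as the rarest class."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    # The rarest class sets the per-class quota.
    quota = min(len(indices) for indices in indices_by_class.values())

    # Draw `quota` indices without replacement from each class.
    kept = []
    for indices in indices_by_class.values():
        kept.extend(rng.sample(indices, quota))
    kept.sort()  # keep the original sample order

    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: made-up class counts and (temperature, humidity) pairs.
toy_counts = {"normal": 60, "bias": 20, "drift": 12, "spike": 4}
labels = [name for name, count in toy_counts.items() for _ in range(count)]
data_rng = random.Random(42)
samples = [(data_rng.uniform(15, 35), data_rng.uniform(30, 90)) for _ in labels]

balanced_samples, balanced_labels = random_undersample(samples, labels)

print("before:", dict(Counter(labels)))
print("after: ", dict(Counter(balanced_labels)))  # every class is left with 4 samples
```

In a full pipeline the balanced set would then be passed to an Extra-Trees classifier, for example scikit-learn's `ExtraTreesClassifier`. A common precaution, assumed here rather than taken from the paper, is to undersample only the training split so that the test split keeps its original class distribution.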
Published
2025-12-31
Section
Articles