Machine Learning-Based Classification of Family Planning Participant Status Using Random Forest and the CRISP-DM Framework

Authors

  • Irmawati Irmawati, Universitas Bina Sarana Informatika
  • Syaifur Rahmatullah, Universitas Nusa Mandiri
  • Mohammad Syamsul Azis, Information Systems Study Program, Universitas Bina Sarana Informatika

DOI:

https://doi.org/10.35194/mji.v18i1.6498

Keywords:

Family Planning, Classification, Random Forest, CRISP-DM, Machine Learning, Class Imbalance

Abstract

The Family Planning (FP) program requires accurate information to support evidence-based decision-making and improve the quality of reproductive health services. Classification of FP participant status can assist health authorities in identifying participant patterns and monitoring program implementation. Previous research using the Support Vector Machine (SVM) algorithm on the same dataset achieved an accuracy of 56.20%, indicating that improvements in classification performance are still required. This study proposes the Random Forest algorithm within the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework to classify FP participant status. The dataset consists of 1,402 FP participant records obtained from SATPEL PPKB, Cilebar District, Karawang Regency. Data preprocessing included data transformation, one-hot encoding for categorical predictor variables, label encoding for the target variable, and hold-out validation with an 80:20 train-test split using stratified sampling. The predictor variables were registration month, wife's birth year, wife's age, and contraceptive method, while the target variable was FP participant status (New, Change Method, and Repeat). Model performance was evaluated using accuracy, precision, recall, F1-score, confusion matrix, classification report, and feature importance analysis. The Random Forest model achieved an accuracy of 59.43%, with weighted precision, recall, and F1-score each at 59.00%. However, the macro-average precision, recall, and F1-score were 45.00%, 44.00%, and 44.00%, respectively, indicating performance differences across classes caused by class imbalance. The model achieved the highest F1-score for the New class (0.63), followed by the Repeat class (0.59), whereas the Change Method class obtained the lowest F1-score (0.11). Feature importance analysis identified wife's birth year and wife's age as the most influential predictor variables.
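The gap between the weighted and macro averages can be checked from the per-class scores alone. The sketch below is an arithmetic illustration, not the authors' code: it recomputes the macro-average F1-score from the three per-class F1-scores quoted in the abstract. The helper names `macro_average` and `weighted_average` are hypothetical, and the weighted average is defined but not evaluated on the study's figures because the per-class record counts are not given here.

```python
# Per-class F1-scores as quoted in the abstract.
f1_per_class = {"New": 0.63, "Repeat": 0.59, "Change Method": 0.11}


def macro_average(scores):
    """Unweighted mean over classes: every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_average(scores, support):
    """Mean weighted by the number of test records (support) in each class.

    The class supports are not reported in the abstract, so this helper is
    shown only to contrast the two definitions.
    """
    total = sum(support.values())
    return sum(scores[c] * support[c] for c in scores) / total


macro_f1 = macro_average(f1_per_class)
print(f"macro F1 = {macro_f1:.2f}")  # (0.63 + 0.59 + 0.11) / 3 -> 0.44
```

The result agrees with the reported macro-average F1-score of 44.00%. Because the macro average gives the Change Method class the same weight as the two larger classes, its F1-score of 0.11 pulls the figure well below the weighted value of 59.00%.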
Compared with the previous SVM-based model, Random Forest provided a modest improvement in accuracy (59.43% versus 56.20%, a gain of 3.23 percentage points) and enhanced model interpretability through feature importance analysis. Nevertheless, the low macro-average performance indicates that further research should investigate class-balancing techniques and hyperparameter optimization to improve classification performance, particularly for minority classes.

Published

2026-06-30

How to Cite

Irmawati, I., Rahmatullah, S., & Azis, M. S. (2026). Machine Learning-Based Classification of Family Planning Participant Status Using Random Forest and the CRISP-DM Framework. Media Jurnal Informatika, 18(1), 202–215. https://doi.org/10.35194/mji.v18i1.6498
