Evaluating Machine Learning Models Across Feature Extraction and Data Balancing Scenarios for Coretax Sentiment Analysis
DOI:
https://doi.org/10.35194/mji.v17i2.5968
Keywords:
Coretax, Data Balancing, Feature Extraction, Machine Learning, Sentiment Analysis
Abstract
The implementation of the Core Tax Administration System (Coretax) by the Indonesian Directorate General of Taxes has generated diverse public responses on social media, particularly on platform X, making sentiment analysis a relevant approach to assessing public perception of this policy. This study evaluates the performance of machine learning classifiers across different feature extraction and data balancing scenarios. Three classifiers, namely Multinomial Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression, were evaluated under four experimental scenarios that combine two feature extraction methods, Term Frequency–Inverse Document Frequency (TF-IDF) and Bag of Words (BoW), with the original and balanced data distributions. A dataset of more than 50,000 Coretax-related posts collected from platform X was preprocessed and automatically labeled into positive, negative, and neutral sentiment classes using a pretrained IndoBERT sentiment model. A brief manual inspection of a random subset indicated moderate agreement between automatic and manual labels, which highlights potential label noise while supporting the use of automatic labeling for comparative analysis. The results show that performance is shaped by the combined effects of representation and data distribution rather than by algorithm choice alone. Logistic Regression consistently achieved the most stable and competitive performance across all scenarios, with accuracy ranging from approximately 0.80 to 0.83 and macro F1-scores of around 0.72–0.73. TF-IDF generally provided more stable performance, while data balancing improved prediction fairness for minority sentiment classes at the cost of a slight decrease in overall accuracy. These findings indicate that Logistic Regression is the most robust model for Coretax sentiment analysis across varying feature extraction and data balancing conditions, and they provide practical insights into the influence of data representation and distribution on sentiment classification performance.
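The trade-off reported in the abstract, where balancing helps minority classes while overall accuracy dips slightly, follows from how accuracy and macro F1 are defined. The sketch below is a minimal illustration and not the study's evaluation code: the label lists are invented toy data, the two "models" are hypothetical prediction lists, and the function names are assumptions made for this example.

```python
# Toy illustration of accuracy vs. macro F1 on an imbalanced 3-class problem.
# All labels and predictions below are invented; they are NOT results from the study.

LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of posts whose predicted label matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred, label):
    """One-vs-rest F1 for a single class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no correct hits for this class, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(f1_per_class(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical imbalanced test set: 14 negative, 4 neutral, 2 positive posts.
y_true = ["negative"] * 14 + ["neutral"] * 4 + ["positive"] * 2

# Hypothetical model A: leans toward the majority class, never predicts "positive".
pred_a = ["negative"] * 14 + ["neutral"] * 2 + ["negative"] * 2 + ["negative"] * 2

# Hypothetical model B: recovers more minority posts but mislabels some negatives.
pred_b = (
    ["negative"] * 10 + ["neutral"] * 2 + ["positive"] * 2  # true negatives
    + ["neutral"] * 3 + ["negative"] * 1                    # true neutrals
    + ["positive"] * 2                                      # true positives
)

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(f"model {name}: accuracy={accuracy(y_true, pred):.3f}, "
          f"macro F1={macro_f1(y_true, pred):.3f}")
```

In this toy setup, model A has the higher accuracy because it gets every majority-class post right, while model B has the higher macro F1 because each class contributes equally to the average. This is the same pattern the abstract describes for balanced versus original data.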
Published
2025-12-31
Section
Articles