Naive Bayes Classification for Software Defect Prediction
DOI:
https://doi.org/10.24090/tids.v1i1.12192
Keywords:
Software flaw prediction, software defect prediction, Naïve Bayes
Abstract
Software defects are an inevitable part of software development and substantially affect the reliability and performance of software applications. This research addresses the need to improve the prediction and monitoring of software defects during development. With a focus on system stability and the prevention of software malfunctions, the study underscores the importance of proactive measures, including robust software testing, routine maintenance, and continuous system monitoring. The central problem is that defect prediction during the development phase is not yet efficient enough. To address it, the study applies the Naive Bayes classification method. Tests conducted on the complete dataset show that Naive Bayes classifies with a high accuracy of 98%. These findings suggest that the method has strong potential as a tool for predicting and preventing software defects throughout the development process. In addition, a linear regression analysis yields a model with an intercept of -0.09359968 and a coefficient of 0.00761893. The results have significant implications for applying Naive Bayes to software bug prediction, particularly when implemented in Python with Google Colab. Adopting this method can help mitigate risk and improve software quality during development.
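To make the classification step concrete, the following is a minimal sketch of a Gaussian Naive Bayes classifier written with the Python standard library only. It is not the study's implementation, which is not shown here. The function names `fit_gaussian_nb` and `predict`, the two features (lines of code and cyclomatic complexity), and the eight data rows are all hypothetical and chosen purely for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate per-class priors and per-feature mean/variance."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / n + 1e-9
            stats.append((mean, var))
        model[label] = (n / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one module."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            # Log of the Gaussian density; features are treated as independent.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-made data (NOT the study's dataset):
# each row is (lines of code, cyclomatic complexity).
X = [(120, 4), (90, 3), (150, 5), (110, 4),
     (900, 30), (750, 26), (1100, 35), (820, 28)]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = defective, 0 = clean

model = fit_gaussian_nb(X, y)
predictions = [predict(model, row) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"accuracy on the toy data: {accuracy:.2f}")
```

In this sketch accuracy is computed on the same rows used for fitting, so it only confirms that the code runs on the toy data; it is unrelated to the 98% figure reported above. A new module could be scored with a call such as `predict(model, (100, 4))`.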
License
Copyright (c) 2024 Edwin Hari Agus Prastyo, Muhammad Ainul Yaqin, Suhartono, M. Faisal, Reza Augusta Jannatul Firdaus

This work is licensed under a Creative Commons Attribution 4.0 International License.