The Application of Data Mining for Predicting Academic Performance Using K-means Clustering and Naïve Bayes Classification

Authors

  • Zainab Mohammed Ali, Computer Science Department, College of Science, University of Diyala, Diyala, Iraq
  • Noor Hasan Hassoon, Computer Science Department, College of Education of Pure Science, University of Diyala, Diyala, Iraq
  • Wasan Saad Ahmed, Computer Science Department, College of Science, University of Diyala, Diyala, Iraq
  • Hazim Noman Abed, Computer Science Department, College of Science, University of Diyala, Diyala, Iraq

DOI:

https://doi.org/10.61841/d4c7rb94

Keywords:

Educational Data Mining, Academic Performance, Educational Management, Naïve Bayes, Classification

Abstract

Data mining is a multidisciplinary analytical process that aims to extract and discover useful knowledge from data. Higher education management is paying increasing attention to knowledge discovery about academic performance across different courses. Education quality is judged by the level of student success, as well as by the institution's accreditation and its ability to retain students. Several factors affect academic performance and, in turn, the quality of education. The Naïve Bayes classifier is perhaps the most widely applied probabilistic classifier that can be used for data exploration. This paper applies the Naïve Bayes classifier to educational data mining to help enhance the quality of the instructional system in higher education. It does so by mining student evaluation data on instructor performance to study the main attributes that may affect educational performance in different courses. The paper uses a k-means clustering algorithm to determine the ideal cluster centers, which serve as the cluster centroids. The Naïve Bayes classification algorithm is then applied to the academic evaluation data to generate rules, which are studied and evaluated to predict educational performance. The proposed system helps identify dropouts and supports educational management with appropriate advising or counseling, so that informed decisions can be made when reviewing and restructuring curricula. It also aims to enhance the academic experience of instructors, which would ultimately improve the quality of the institution's educational environment.
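
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the clustering and classification stages are connected, so the sketch assumes that k-means cluster ids on numeric evaluation scores serve as class labels for a categorical Naïve Bayes classifier. The toy records, the attribute names (attendance, difficulty), the "low"/"high" labels, and the function names kmeans, train_nb and predict_nb are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centers drawn from the data
    labels = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's center
        if new_centroids == centroids:
            break  # centers stopped moving
        centroids = new_centroids
    return centroids, labels

def train_nb(rows, labels, alpha=1.0):
    """Count class and per-attribute value frequencies (categorical Naive Bayes)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> distinct values seen
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, value_counts, values, alpha

def predict_nb(model, row):
    """Return the class with the highest log-posterior under Laplace smoothing."""
    class_counts, value_counts, values, alpha = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)  # log prior
        for j, v in enumerate(row):
            num = value_counts[(j, y)][v] + alpha
            den = n_y + alpha * len(values[j])
            score += math.log(num / den)  # log likelihood of attribute j
        if score > best_score:
            best, best_score = y, score
    return best

# Hypothetical toy data: two numeric evaluation scores per record ...
scores = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0), (4.8, 5.0), (5.0, 4.7), (4.9, 4.9)]
# ... and two categorical attributes per record (attendance, difficulty).
attrs = [("low", "hard"), ("low", "hard"), ("low", "easy"),
         ("high", "easy"), ("high", "easy"), ("high", "hard")]

centroids, cluster_ids = kmeans(scores, k=2)

# Assumption: name the clusters by their centroid magnitude.
order = sorted(range(2), key=lambda c: sum(centroids[c]))
names = {order[0]: "low", order[1]: "high"}
levels = [names[c] for c in cluster_ids]

model = train_nb(attrs, levels)
print(predict_nb(model, ("high", "easy")))  # prints "high" for this toy data
```

The cluster ids themselves are arbitrary, so the sketch orders the clusters by centroid magnitude before naming them. A library implementation would be the usual choice for real data.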

Published

31.05.2020

How to Cite

Mohammed Ali, Z., Hasan Hassoon, N., Saad Ahmed, W., & Noman Abed, H. (2020). The Application of Data Mining for Predicting Academic Performance Using K-means Clustering and Naïve Bayes Classification. International Journal of Psychosocial Rehabilitation, 24(3), 2143-2151. https://doi.org/10.61841/d4c7rb94