The Application of Data Mining for Predicting Academic Performance Using K-means Clustering and Naïve Bayes Classification
DOI:
https://doi.org/10.61841/d4c7rb94
Keywords:
Educational Data Mining, Academic Performance, Educational Management, Naïve Bayes, Classification
Abstract
Data mining is a multidisciplinary analytical process that focuses on extracting and discovering useful knowledge from data. Higher education management is paying close attention to knowledge discovery about academic performance across different courses. The quality of education is judged by the level of student success, and also by the institution's ability, alongside its accreditation, to retain its students, because several factors affect academic performance and, in turn, the quality of education. The Naïve Bayes classifier is perhaps the most widely applied probabilistic classifier for data exploration. This paper applies the Naïve Bayes classifier to educational data mining to help improve the quality of the instructional system in higher education. It does so by mining student evaluation data on instructor performance to study the main attributes that may affect educational performance in different courses. The paper uses a k-means clustering algorithm to determine the ideal cluster centers, which serve as the cluster centroids. The Naïve Bayes classification algorithm is then applied to the academic evaluation data to generate rules, which are studied and evaluated to predict educational performance. The proposed system helps identify dropouts and supports educational management with appropriate advising or counseling, so that informed decisions can be made when reviewing and restructuring the curricula. It also aims to enhance the academic experience of instructors, which would ultimately improve the quality of the institution's educational environment.
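The two-stage idea described in the abstract (k-means to find cluster centroids, then Naïve Bayes to predict performance) can be illustrated with a minimal sketch. It is not the paper's implementation. The ratings data, the farthest-point initialization, the use of cluster membership as the class label, the Gaussian form of Naïve Bayes, and all function names below are assumptions made for illustration only.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with a deterministic farthest-point start (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:
            break  # assignments stable, so centroids are final
        assign = new_assign
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign

def fit_gaussian_nb(points, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    model = {}
    for c in set(labels):
        rows = [p for p, y in zip(points, labels) if y == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        # The small constant keeps variances positive for constant features.
        variances = [sum((x - m) ** 2 for x in col) / len(rows) + 1e-6
                     for col, m in zip(zip(*rows), means)]
        model[c] = (len(rows) / len(points), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances))
    return max(model, key=log_posterior)

# Hypothetical course-evaluation scores (1-5) on three questions per record.
ratings = [[5, 4, 5], [4, 5, 4], [5, 5, 5], [4, 4, 5],
           [1, 2, 1], [2, 1, 2], [1, 1, 2], [2, 2, 1]]

centroids, assign = kmeans(ratings, k=2)
# Name the cluster with the larger centroid "high"; the other is "low".
high = max(range(2), key=lambda c: sum(centroids[c]))
labels = ["high" if a == high else "low" for a in assign]
model = fit_gaussian_nb(ratings, labels)
print(predict(model, [5, 5, 4]))
```

Using the clusters as class labels is one plausible way to connect the two algorithms. A real study would more likely train the classifier on recorded outcomes and evaluate it on held-out data.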
Copyright (c) 2020 AUTHOR

This work is licensed under a Creative Commons Attribution 4.0 International License.