Outlier Detection of Transaction Data Using DBSCAN Algorithm
DOI: https://doi.org/10.61841/fvzwt261
Keywords: Data Mining, Outlier Detection, Euclidean Distance, Clustering, DBSCAN
Abstract
Supermarkets are one channel through which companies market their products, offering a wide range of product types from different producers. Consumers often prefer supermarkets to traditional markets because of promotions, for example products offered at half their normal price. During such promotions consumers tend to buy more than usual, so the supermarket's stock can fall drastically, and the supermarket must anticipate this to avoid a stock shortage in the warehouse. Data mining offers various techniques for this purpose, one of which is outlier detection. Outlier detection is needed to separate abnormal transactions (candidate anomalies) from normal transactions, which helps the supermarket anticipate running out of stock. Outlier detection is the process of searching a dataset for outliers and is one of the first steps toward a coherent analysis of the data. Its main objective is to detect data whose properties differ from those of most other data, that is, the anomalies found in multidimensional datasets. One effective algorithm for detecting outliers is DBSCAN. This study therefore applies DBSCAN-based outlier detection, with the aim of helping supermarkets anticipate running out of stock. When all 1862 products were processed, no product was classified as an outlier, whereas when only the first 100 products were processed, 4 products were classified as outliers: the products with IDs 80069449, 80015728, 82024920, and 80021527.
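To make the approach concrete, the following is a minimal sketch of DBSCAN-based outlier detection with Euclidean distance, written in plain Python. It is not the authors' implementation: the feature pairs, the `eps` and `min_pts` values, and the function names `region_query` and `dbscan` are all assumptions chosen for illustration, since the abstract does not state which features or parameters were used. Points that DBSCAN leaves unassigned (label -1, "noise") are treated as outliers.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: join, but do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Hypothetical per-product features, e.g. (units sold, number of transactions).
# These values are invented for illustration and are not the study's data.
products = [
    (10, 12), (11, 13), (12, 11), (10, 11),   # one dense group
    (50, 52), (51, 50), (52, 51), (49, 50),   # a second dense group
    (100, 5),                                 # an isolated product
]

labels = dbscan(products, eps=3.0, min_pts=3)
outliers = [p for p, label in zip(products, labels) if label == NOISE]
print(labels)
print(outliers)
```

Whether a point counts as noise depends on the density of its neighbourhood relative to `eps` and `min_pts`, and therefore also on which other points are in the dataset. This is one plausible reason why running the algorithm on different subsets of the same product data can yield a different number of outliers.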
License
Copyright (c) 2020 AUTHOR
This work is licensed under a Creative Commons Attribution 4.0 International License.