Anomaly Detection and Improvement of Clusters using Enhanced K-Means Algorithm

Source: arXiv
Abstract

This paper introduces a unified approach to cluster refinement and anomaly detection in datasets. We propose a novel algorithm that iteratively reduces the intra-cluster variance of N clusters until a global minimum is reached, yielding tighter clusters than the standard k-means algorithm. We evaluate the method using intrinsic measures for unsupervised learning, including the silhouette coefficient, Calinski-Harabasz index, and Davies-Bouldin index, and extend it to anomaly detection by identifying points whose assignment causes a significant variance increase. External validation on synthetic data and the UCI Breast Cancer and UCI Wine Quality datasets employs the Jaccard similarity score, V-measure, and F1 score. Results show variance reductions of 18.7% and 88.1% on the synthetic and Wine Quality datasets, respectively, along with accuracy and F1 score improvements of 22.5% and 20.8% on the Wine Quality dataset.
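The abstract does not say how a "significant variance increase" is measured, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes that intra-cluster variance is the mean squared Euclidean distance to the centroid, and that a point is anomalous when removing it lowers its cluster's variance by more than a chosen fraction. The function names and the `threshold` parameter are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(fmean(axis) for axis in zip(*points))


def cluster_variance(points):
    """Mean squared Euclidean distance of the points to their centroid."""
    if not points:
        return 0.0
    c = centroid(points)
    return fmean(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def flag_anomalies(points, labels, threshold=0.5):
    """Return sorted indices of points whose removal reduces their cluster's
    variance by more than `threshold` (a fraction of the original variance).

    Assumption: this leave-one-out criterion and the default threshold are
    illustrative choices, not values taken from the paper.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    flagged = []
    for members in clusters.values():
        if len(members) < 3:
            continue  # too few points for a meaningful comparison
        base = cluster_variance([points[i] for i in members])
        if base == 0.0:
            continue  # all points coincide, nothing to flag
        for i in members:
            rest = [points[j] for j in members if j != i]
            drop = (base - cluster_variance(rest)) / base
            if drop > threshold:
                flagged.append(i)
    return sorted(flagged)


# Usage: two tight clusters, with one stray point assigned to cluster 0.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10),
          (20, 20), (20, 21), (21, 20), (21, 21)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]
print(flag_anomalies(points, labels))  # → [4]
```

The leave-one-out loop costs quadratic time per cluster, which is fine for illustration but would need an incremental variance update on larger data.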

Vardhan Shorewala, Shivam Shorewala

Computing technology, computer technology

Vardhan Shorewala, Shivam Shorewala. Anomaly Detection and Improvement of Clusters using Enhanced K-Means Algorithm [EB/OL]. (2025-05-30) [2025-07-16]. https://arxiv.org/abs/2505.24365.