An Improved DBSCAN Clustering Algorithm Based on Data Field
DBSCAN is a typical density-based clustering algorithm. It can discover clusters of arbitrary shape, but its results depend on the user-chosen parameters Eps and MinPts. On datasets whose clusters differ widely in density, it may return the wrong number of clusters or mislabel part of the data as noise. To address this, this paper proposes an improved DBSCAN clustering algorithm based on the data field, exploiting the data field's ability to describe the data distribution and reflect the relationships among data points. The algorithm introduces the concept of average potential difference and dynamically determines the Eps and average potential difference of each cluster during the clustering process, which lets it obtain better clustering results on some datasets with large density differences. Experiments show that the proposed algorithm outperforms DBSCAN on metrics such as AC and RI.
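To make the notion of a data field concrete, here is a minimal Python sketch of how a potential value could be computed for each point. The abstract does not state the potential function, so the Gaussian-like form below (commonly used in the data field literature), the impact factor `sigma`, the equal point masses, and the function name `potential` are all assumptions for illustration. The paper's average potential difference and per-cluster Eps selection are not reproduced here.

```python
import math


def potential(points, sigma=1.0, masses=None):
    """Return an assumed data-field potential for every point.

    Assumed form: phi(x) = sum_j m_j * exp(-(||x - x_j|| / sigma) ** 2).
    Both the kernel and the default equal masses are illustrative choices,
    not taken from the paper.
    """
    n = len(points)
    if masses is None:
        masses = [1.0 / n] * n  # assumption: every point has equal mass
    result = []
    for p in points:
        total = 0.0
        for q, m in zip(points, masses):
            dist = math.dist(p, q)  # Euclidean distance
            total += m * math.exp(-((dist / sigma) ** 2))
        result.append(total)
    return result


# Three nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
phi = potential(points, sigma=1.0)
```

Under these assumptions, the three points near the origin receive a higher potential than the isolated point at (5, 5), which is the kind of density information a data-field-based method can draw on when clusters differ in density.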
Yang Jing (杨静), Liang Jiye (梁吉业), Gao Jiawei (高嘉伟)
Computing technology; computer technology
DBSCAN; data field; clustering
Yang Jing, Liang Jiye, Gao Jiawei. An Improved DBSCAN Clustering Algorithm Based on Data Field [EB/OL]. (2012-05-25) [2025-08-03]. http://www.paper.edu.cn/releasepaper/content/201205-435.