A Survey of Noise Data Handling in the K-means Algorithm
The Influence of Noise Data in the K-means Algorithm
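In the era of big data, data is a fiercely contested and valuable resource, and extracting useful, valuable information from it is a central topic of the Internet age. Clustering is one of the common tasks in this area, and the clustering algorithm, which is the core of clustering work, can be used to discover the grouping structure present in an entire dataset. Among such algorithms, K-means is widely used because its principle is simple and its time efficiency is good. However, K-means also faces many problems, such as the influence of noise data. Taking the influence of noise data on K-means as its starting point, this paper summarizes recent academic research on the influence of noise data in K-means and on improvements that address it, and analyzes both that influence and the strengths and weaknesses of each proposed improvement, with the aim of giving a clear direction for future work on noise data handling in K-means.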
In the age of big data, data is a very important resource. How to take full advantage of big data is a hot research direction, known as 'data mining'. Clustering is a common task in data mining, and K-means is a popular clustering algorithm because of its simplicity and efficiency. This paper summarizes recent research on the influence of noise data in the K-means clustering algorithm. It then describes that influence and analyzes the optimizations of K-means proposed to reduce it, so as to indicate the direction for solving this problem in the future.
张成文、张皓
Computing technology, computer technology
computer software and theory; K-means clustering algorithm; noise data; cluster center
computer software and theory; K-means clustering algorithm; cluster center
张成文,张皓.关于对K-means算法中噪音数据处理的综述[EB/OL].(2017-11-20)[2025-08-18].http://www.paper.edu.cn/releasepaper/content/201711-114.