Stream data clustering algorithm based on differential sampling
Traditional clustering algorithms suffer from high time complexity, large storage requirements, and low accuracy when applied to stream data. To address these problems, this paper proposes a stream data clustering algorithm based on differential sampling. First, the stream data is sampled with the differential sampling method, and the sample points are used to construct a kernel matrix. Then the kernel fuzzy C-means clustering algorithm clusters the points in the kernel matrix, yielding a labeled sample kernel matrix. Finally, the labeled sample kernel matrix is used to partition the points in the data stream. In addition, a fading cluster mechanism updates the sample kernel matrix in real time. Experimental results show that, compared with traditional clustering algorithms, the proposed algorithm achieves lower time complexity while clustering in real time and producing satisfactory clustering results.
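The middle two steps of the pipeline (clustering the sample kernel matrix, then using the labeled samples to partition incoming stream points) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the differential sampling rule or the fading mechanism, so the sample is taken as given and no decay is applied. The Gaussian kernel, the kernel-space prototype formulation of fuzzy C-means, and all names and parameters (`gaussian_kernel`, `kernel_fcm`, `assign_stream_point`, `sigma`, `m`) are assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_matrix(points, sigma=1.0):
    """Kernel matrix over the sampled points."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


def _weights(U, m):
    """Normalised u^m weights: each prototype is a weighted mix of samples."""
    W = []
    for row in U:
        powered = [u ** m for u in row]
        total = sum(powered)
        W.append([p / total for p in powered])
    return W


def _self_terms(W, K):
    """Inner product of each prototype with itself in feature space."""
    n = len(K)
    return [sum(w[j] * w[l] * K[j][l] for j in range(n) for l in range(n))
            for w in W]


def _memberships(dists, m):
    """Fuzzy C-means membership update from squared distances."""
    inv = [(1.0 / max(d, 1e-12)) ** (1.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def kernel_fcm(K, c, m=2.0, iters=50, seed=0):
    """Kernel fuzzy C-means on a kernel matrix; returns U[cluster][sample]."""
    rng = random.Random(seed)
    n = len(K)
    U = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):  # make each sample's memberships sum to 1
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    for _ in range(iters):
        W = _weights(U, m)
        vv = _self_terms(W, K)
        for k in range(n):
            d = [K[k][k] - 2.0 * sum(W[i][j] * K[j][k] for j in range(n)) + vv[i]
                 for i in range(c)]
            col = _memberships(d, m)
            for i in range(c):
                U[i][k] = col[i]
    return U


def assign_stream_point(x, samples, U, K, m=2.0, sigma=1.0):
    """Assign a new stream point to the cluster with the highest membership."""
    W = _weights(U, m)
    vv = _self_terms(W, K)
    kx = [gaussian_kernel(x, s, sigma) for s in samples]
    d = [gaussian_kernel(x, x, sigma)
         - 2.0 * sum(w[j] * kx[j] for j in range(len(samples))) + vv[i]
         for i, w in enumerate(W)]
    u = _memberships(d, m)
    return max(range(len(u)), key=u.__getitem__)


# Hypothetical sample: two small groups of 2-D points.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.2),
           (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.3)]
K = kernel_matrix(samples)
U = kernel_fcm(K, c=2)
labels = [max(range(2), key=lambda i: U[i][k]) for k in range(len(samples))]
```

Because a new point only needs its kernel values against the stored samples, the per-point cost depends on the sample size rather than on the length of the stream, which is consistent with the lower time complexity the abstract reports.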
Sun Mengran, Qiu Yunfei
Computing technology, computer technology
differential sampling; fading cluster mechanism; kernel fuzzy C-means; stream data; time complexity
Sun Mengran, Qiu Yunfei. Stream data clustering algorithm based on differential sampling [EB/OL]. (2018-04-12) [2025-08-03]. https://chinaxiv.org/abs/201804.01439.