An MST-based Semi-supervised Clustering Method of Optimizing Dissimilarity Measure Adaptively
This paper proposes an MST-based semi-supervised clustering method that adaptively optimizes the dissimilarity measure. It targets a small labelled sample set and a large unlabelled data set that share the same (or a similar) distribution in a hybrid-attribute space. The method first applies a decision-tree method to the labelled sample to obtain "regular cluster regions". It then adaptively optimizes the dissimilarity measure defined on the hybrid-attribute space under the principle that clusters of different classes should lie far apart and clusters of the same class should lie close together. Finally, the optimized dissimilarity measure is applied in an MST-based clustering algorithm to obtain more effective clustering results. Simulation experiments on several UCI data sets show that the method improves clustering quality on some of them. To extend the method and uncover its practical value, the paper closes with two directions for future research.
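The final step, MST-based clustering under a weighted dissimilarity measure, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the form of `dissimilarity` (weighted absolute difference for numeric attributes, weighted 0/1 mismatch for categorical ones), the hand-set `weights`, and the rule of cutting the k-1 longest MST edges are all assumptions, since the abstract does not specify them. In the method described above, the weights would come from the adaptive optimization step rather than being fixed by hand.

```python
def dissimilarity(a, b, weights):
    """Weighted dissimilarity for mixed records (assumed form):
    |x - y| for numeric attributes, 0/1 mismatch for categorical ones."""
    total = 0.0
    for x, y, w in zip(a, b, weights):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += w * abs(x - y)
        else:
            total += w * (0.0 if x == y else 1.0)
    return total


def mst_edges(points, weights):
    """Prim's algorithm on the complete graph; returns (d, i, j) edges."""
    n = len(points)
    # best[j] = (cheapest known distance from the tree to j, tree endpoint)
    best = {j: (dissimilarity(points[0], points[j], weights), 0)
            for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((d, i, j))
        # Relax the remaining vertices against the newly added vertex j.
        for k in best:
            dk = dissimilarity(points[j], points[k], weights)
            if dk < best[k][0]:
                best[k] = (dk, j)
    return edges


def mst_clusters(points, weights, k):
    """Drop the k-1 longest MST edges and label the connected components."""
    edges = sorted(mst_edges(points, weights))
    kept = edges[: len(edges) - (k - 1)]
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, i, j in kept:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


# Hypothetical toy data: one numeric and one categorical attribute.
points = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.1, "b")]
labels = mst_clusters(points, weights=(1.0, 1.0), k=2)
```

Raising the weight of an attribute that separates the labelled classes well stretches the MST edges between classes relative to those within a class, which is the effect the optimized measure is meant to have on the edge-cutting step.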
陈新泉
Computing technology, computer technology
Pattern Recognition; Dissimilarity Measure; Adaptive Optimization; Semi-supervised Clustering; Hybrid Attributes
陈新泉.一种自适应优化相异性度量的基于MST的半监督聚类方法[EB/OL].(2011-04-18)[2025-08-02].http://www.paper.edu.cn/releasepaper/content/201104-403.