National Preprint Platform

Approximation Algorithms for K-Modes Clustering

In this paper, we study clustering with respect to the k-modes objective function, a natural formulation of clustering for categorical data. One of the main contributions of this paper is to establish the connection between k-modes and k-median, i.e., the optimum of k-median is at most twice the optimum of k-modes for the same categorical data clustering problem. Based on this observation, we derive a deterministic algorithm that achieves an approximation factor of 2. Furthermore, we prove that the distance measure in k-modes defines a metric. Hence, we are able to extend existing approximation algorithms for metric k-median to k-modes. Empirical results verify the superiority of our method.
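The relationship between the two objectives can be checked by brute force on a tiny data set. The sketch below is illustrative only and is not the algorithm from the paper. It assumes the k-modes distance is the simple matching distance (the number of attributes on which two records differ) and that k-median centers are restricted to the data points themselves. The function names and the sample records are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def matching_distance(x, y):
    """Simple matching distance: number of attributes where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def mode_of(cluster):
    """Attribute-wise most frequent value of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def kmodes_optimum(data, k):
    """Exact k-modes optimum by enumerating every assignment (tiny inputs only)."""
    best = float("inf")
    for labels in product(range(k), repeat=len(data)):
        cost = 0
        for c in range(k):
            cluster = [x for x, lab in zip(data, labels) if lab == c]
            if cluster:
                m = mode_of(cluster)  # the mode minimizes the cost within a cluster
                cost += sum(matching_distance(x, m) for x in cluster)
        best = min(best, cost)
    return best


def kmedian_optimum(data, k):
    """Exact k-median optimum with centers chosen among the data points."""
    best = float("inf")
    for centers in combinations(data, k):
        cost = sum(min(matching_distance(x, c) for c in centers) for x in data)
        best = min(best, cost)
    return best


# Hypothetical categorical records with three attributes each.
data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("c", "z", "r"),
]

modes_opt = kmodes_optimum(data, 2)
median_opt = kmedian_optimum(data, 2)
print(modes_opt, median_opt, median_opt <= 2 * modes_opt)
```

Both functions are exponential in the input size, so they only serve to check the bound on small examples. They are not a practical clustering method.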

何增友

Computing technology, computer technology

Clustering; Categorical Data; K-Means; K-Modes; K-Median; Data Mining

何增友. Approximation Algorithms for K-Modes Clustering[EB/OL]. (2006-03-31)[2025-08-02]. http://www.paper.edu.cn/releasepaper/content/200603-567.
