
K-Means Panel Data Clustering in the Presence of Small Groups


Source: arXiv
Abstract

We consider panel data models with group structure. We study the asymptotic behavior of the least-squares estimators and of information criteria for the number of groups, allowing for the presence of small groups whose relative size is asymptotically negligible. Our contributions are threefold. First, we derive sufficient conditions under which the least-squares estimators are consistent and asymptotically normal. One of these conditions implies that the smaller the groups, the longer the sample period required. Second, we show that information criteria for the number of groups proposed in earlier works can be inconsistent or perform poorly in the presence of small groups. Third, we propose modified information criteria (MIC) designed to perform well in the presence of small groups. A Monte Carlo simulation confirms their good performance in finite samples. An empirical application illustrates that K-means clustering paired with the proposed MIC allows one to discover small groups without producing too many groups. This makes it possible to characterize small groups and to distinguish them from the large groups within a parsimonious group structure.
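To make the setting concrete, the following is a minimal sketch of least-squares (K-means) grouping on a simulated panel that contains one small group, with the number of groups chosen by a generic BIC-type criterion. Everything here is an assumption made for illustration: the model has group-specific time effects only (no covariates), the design (two groups of 100 units, one group of 5, 50 periods) is invented, and the helper names `farthest_first`, `kmeans_panel`, and `bic_type_ic` are hypothetical. In particular, the penalty below is a textbook-style BIC penalty, not the MIC proposed in the paper, whose form the abstract does not state.

```python
import numpy as np


def farthest_first(y, n_groups):
    """Deterministic initial centers: start at row 0, then repeatedly add
    the row farthest from the centers chosen so far."""
    idx = [0]
    d = ((y - y[0]) ** 2).sum(axis=1)
    for _ in range(n_groups - 1):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, ((y - y[nxt]) ** 2).sum(axis=1))
    return y[idx].copy()


def kmeans_panel(y, n_groups, max_iter=100):
    """Lloyd's algorithm on the rows of an N x T panel.
    Returns (labels, sum of squared residuals)."""
    centers = farthest_first(y, n_groups)
    for _ in range(max_iter):
        # Squared distance from every unit to every group's time profile.
        dist = ((y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for g in range(n_groups):
            if np.any(labels == g):  # keep the old center if a group empties
                new_centers[g] = y[labels == g].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    dist = ((y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), dist.min(axis=1).sum()


def bic_type_ic(ssr, n_units, n_periods, n_groups):
    """Generic BIC-type criterion (illustrative assumption, NOT the paper's MIC):
    log average squared residual plus a penalty on the G*T group time effects."""
    nt = n_units * n_periods
    return np.log(ssr / nt) + n_groups * n_periods * np.log(nt) / nt


# Simulated panel (assumed design): two large groups and one small group.
rng = np.random.default_rng(0)
T = 50
sizes = [100, 100, 5]
levels = [0.0, 3.0, -3.0]
y = np.vstack([lvl + 0.5 * rng.standard_normal((n, T))
               for n, lvl in zip(sizes, levels)])

results = {G: kmeans_panel(y, G) for G in range(1, 6)}
ic = {G: bic_type_ic(ssr, y.shape[0], y.shape[1], G)
      for G, (_, ssr) in results.items()}
chosen = min(ic, key=ic.get)
labels, _ = results[chosen]
group_sizes = sorted(np.bincount(labels).tolist())
print(chosen, group_sizes)
```

Because the groups in this toy design are far apart relative to the noise, the generic criterion should have no difficulty isolating the five-unit group. The abstract's point is that such criteria can fail in less favorable designs, for instance when small groups are harder to separate or the sample period is short, which is the case the proposed MIC is meant to handle.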

Mikihito Nishi

Subject: Computing technology, computer technology

Mikihito Nishi. K-Means Panel Data Clustering in the Presence of Small Groups [EB/OL]. (2025-08-21) [2025-09-02]. https://arxiv.org/abs/2508.15408
