Clustering Tails in High Dimension
One way to combat the scarcity of tail observations in extreme value analysis is to pool information from multiple datasets that share similar tail properties, for instance a common extreme value index. For a multivariate dataset, this means grouping the dimensions into clusters first, before applying any pooling technique. This paper addresses the problem of clustering the variables of a high-dimensional dataset according to their extreme value indices. We propose an iterative clustering procedure that sequentially partitions the variables into groups, ordered from the heaviest-tailed to the lightest-tailed distributions. At each step, our method identifies and extracts the group of variables that share the highest extreme value index among those remaining. This approach differs fundamentally from conventional approaches, such as two-step methods that cluster on pre-estimated extreme value indices. We establish the consistency of the proposed algorithm and demonstrate its finite-sample performance in a simulation study and a real data application.
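To make the setting concrete, the following is a minimal sketch of the conventional two-step baseline that the abstract contrasts against, not the procedure proposed in the paper. It assumes the Hill estimator as the index estimator and a fixed tolerance `tol` for deciding which variables share the current maximum; the function names, `k`, `tol`, and the simulated Pareto data are all hypothetical choices made for illustration.

```python
import numpy as np


def hill_estimator(x, k):
    """Hill estimate of the extreme value index from the k largest values of x."""
    x = np.sort(np.asarray(x, dtype=float))
    top = x[-k:]            # the k largest observations
    threshold = x[-k - 1]   # the (k+1)-th largest observation
    return float(np.mean(np.log(top) - np.log(threshold)))


def peel_heaviest_first(data, k, tol):
    """Two-step baseline (assumed for illustration, not the paper's algorithm):
    estimate every column's index once, then repeatedly extract the columns
    whose estimate lies within `tol` of the largest remaining estimate."""
    n_cols = data.shape[1]
    gammas = np.array([hill_estimator(data[:, j], k) for j in range(n_cols)])
    remaining = list(range(n_cols))
    clusters = []
    while remaining:
        top = max(gammas[j] for j in remaining)
        group = [j for j in remaining if top - gammas[j] <= tol]
        clusters.append(group)
        remaining = [j for j in remaining if j not in group]
    return clusters, gammas


# Simulated example: three columns with index 1.0 and three with index 0.3.
rng = np.random.default_rng(0)
true_gammas = np.array([1.0, 1.0, 1.0, 0.3, 0.3, 0.3])
u = 1.0 - rng.random((5000, true_gammas.size))   # uniform on (0, 1]
data = u ** (-true_gammas)                        # exact Pareto tails, values >= 1

clusters, gammas = peel_heaviest_first(data, k=500, tol=0.3)
print(clusters)
print(np.round(gammas, 2))
```

The groups come out ordered from heaviest to lightest tail, which mirrors the ordering described above. The sketch depends entirely on the quality of the pre-estimated indices and on the ad hoc tolerance, which is the kind of two-step dependence the abstract says the proposed method avoids.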
Marco Oesting, Chen Zhou, Liujun Chen
Mathematics
Marco Oesting, Chen Zhou, Liujun Chen. Clustering Tails in High Dimension [EB/OL]. (2025-06-24) [2025-07-19]. https://arxiv.org/abs/2506.19414.