National Preprint Platform

Limiting Eigen-Structure of Spiked Sample Covariance Matrices under Missing Observations

Haotian Cheng¹, Huiqin Li², Yanqing Yin², Zhixiang Zhang³

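Author information

  • 1. College of Mathematics and Statistics, Chongqing University, Chongqing 400000
  • 2. School of Statistics and Data Science, Nanjing Audit University, Nanjing 210000
  • 3. Department of Mathematics, University of Macau, Macau 999078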

Abstract

High-dimensional Principal Component Analysis (PCA) has become an essential tool in modern data analysis, offering dimensionality reduction and feature extraction. However, the presence of missing data introduces significant challenges, distorting the performance of PCA and complicating statistical inference. In this paper, we study the asymptotic behavior of PCA under a spiked population model with missing observations, leveraging recent advances in random matrix theory. We demonstrate that while the spiked sample eigenvalues exhibit asymptotic normality, the limiting parameters differ substantially from those in the complete data case, reflecting the non-trivial influence of the missing data mechanism. As an application of our results, we propose a test to evaluate the independent structure of a spiked population.

Key words

high-dimensional data / sample covariance matrix / spiked eigenvalues and eigenvectors / missing data / central limit theorem

Cite this article

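Cheng Haotian, Li Huiqin, Yin Yanqing, Zhang Zhixiang. Limiting Eigen-Structure of Spiked Sample Covariance Matrices under Missing Observations [EB/OL]. (2026-04-01)[2026-04-04]. http://www.paper.edu.cn/releasepaper/content/202604-11.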

Subject classification

Mathematics

First published: 2026-04-01