
Nonlinear Sparse Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data

Source: arXiv
Abstract

Motivation: Biomedical studies increasingly produce multi-view high-dimensional datasets (e.g., multi-omics) that demand integrative analysis. Existing canonical correlation analysis (CCA) and generalized CCA methods address at most two of the following three key aspects simultaneously: (i) nonlinear dependence, (ii) sparsity for variable selection, and (iii) generalization to more than two data views. There is a pressing need for CCA methods that integrate all three aspects to effectively analyze multi-view high-dimensional data. Results: We propose three nonlinear, sparse, generalized CCA methods, HSIC-SGCCA, SA-KGCCA, and TS-KGCCA, for variable selection in multi-view high-dimensional data. These methods extend existing SCCA-HSIC, SA-KCCA, and TS-KCCA from two-view to multi-view settings. While SA-KGCCA and TS-KGCCA yield multi-convex optimization problems solved via block coordinate descent, HSIC-SGCCA introduces a necessary unit-variance constraint previously ignored in SCCA-HSIC, resulting in a nonconvex, non-multiconvex problem. We efficiently address this challenge by integrating the block prox-linear method with the linearized alternating direction method of multipliers. Simulations and TCGA-BRCA data analysis demonstrate that HSIC-SGCCA outperforms competing methods in multi-view variable selection.
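The abstract does not spell out the dependence measure behind HSIC-SGCCA, so the following is only a minimal sketch of the standard (biased) empirical Hilbert-Schmidt Independence Criterion, HSIC = tr(KHLH) / (n - 1)^2, which is the kind of quantity that lets HSIC-based CCA methods capture nonlinear dependence. The Gaussian kernel, the median-distance bandwidth, and the names `gaussian_gram` and `hsic` are assumptions made for illustration; this is not the authors' implementation and it omits the sparsity and unit-variance constraints.

```python
import numpy as np

def gaussian_gram(x, sigma=None):
    """Gaussian-kernel Gram matrix; bandwidth defaults to the median pairwise distance (an assumption)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    if sigma is None:
        sigma = np.sqrt(np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    K = gaussian_gram(x)
    L = gaussian_gram(y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# Toy data: y_dep depends on x only through x^2, so the dependence is nonlinear.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=300)
y_dep = x ** 2 + 0.05 * rng.standard_normal(300)
y_ind = rng.uniform(-1.0, 1.0, size=300)

hsic_dep = hsic(x, y_dep)
hsic_ind = hsic(x, y_ind)
print("HSIC, nonlinear dependence:", hsic_dep)
print("HSIC, independent pair:", hsic_ind)
```

In this toy setting the dependent pair should score noticeably higher than the independent pair, even though the relationship between `x` and `y_dep` is not linear.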

Hai Shu, Ziqi Chen, Gen Li, Rong Wu

Biological science research methods and techniques; Mathematics

Hai Shu, Ziqi Chen, Gen Li, Rong Wu. Nonlinear Sparse Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data [EB/OL]. (2025-02-25) [2025-08-02]. https://arxiv.org/abs/2502.18756.
