Supervised Integrative Biclustering with applications to Alzheimer's Disease
Multiple types or views of data (e.g. genetics, proteomics) measured on the same set of individuals are now commonly generated in biomedical studies. A particular interest is the detection of sample subgroups (e.g. subtypes of disease) characterized by specific groups of variables. Biclustering methods are well suited to this problem since they group samples and variables simultaneously. However, most existing biclustering methods cannot guarantee that the detected sample clusters are clinically meaningful and related to a clinical outcome, because they identify biclusters and associate sample clusters with a clinical outcome as separate, independent steps. Additionally, when integrating data from different views, these methods have been developed for continuous data and do not allow for a mixture of data distributions. We propose a new formulation of a biclustering and prediction method for multi-view data from different distributions that enhances our ability to identify clinically meaningful biclusters by incorporating a clinical outcome. Sample clusters are defined based on an adaptively chosen subset of variables and their association with a clinical outcome. We use extensive simulations to showcase the effectiveness of our proposed method in comparison to existing methods. Real-world applications using lipidomics, imaging, and cognitive data on Alzheimer's disease (AD) identified biclusters with significant cognitive differences that other methods missed. The distinct lipid categories and brain regions characterizing the biclusters suggest potential new insights into the pathology of AD.
Kaifeng Yang, Thierry Chekouo, Sandra E. Safo
Medical research methods; biological science research methods and techniques; neurology and psychiatry
Kaifeng Yang, Thierry Chekouo, Sandra E. Safo. Supervised Integrative Biclustering with applications to Alzheimer's Disease [EB/OL]. (2025-05-07) [2025-06-03]. https://arxiv.org/abs/2505.04830.