Information Enhanced Model Selection for Gaussian Graphical Model with Application to Metabolomic Data
Abstract: In light of the low signal-to-noise nature of many large biological data sets, we propose a novel method to learn the structure of association networks using Gaussian graphical models combined with prior knowledge. Our strategy has two parts. In the first part, we propose a model selection criterion called the structural Bayesian information criterion (SBIC), in which the prior structure is modeled and incorporated into the Bayesian information criterion (BIC). We show that the popular extended BIC (EBIC) is a special case of SBIC. In the second part, we propose a two-step algorithm to construct the candidate model pool. The algorithm is data-driven, and the prior structure is embedded into the candidate models automatically. Theoretical investigation shows that, under some mild conditions, SBIC is a consistent model selection criterion for high-dimensional Gaussian graphical models. Simulation studies validate the superiority of the proposed algorithm over existing ones and show its robustness to model misspecification. Application to relative concentration data from infant feces, collected from subjects enrolled in a large molecular epidemiological cohort study, validates that metabolic pathway involvement is a statistically significant factor for the conditional dependence between metabolites. Furthermore, new relationships among metabolites are discovered that cannot be identified by conventional methods of pathway analysis. Some of them have been widely recognized in the biological literature.
Hoen Anne G., McRitchie Susan, Viles Weston D., Nguyen Quang P., Gui Jiang, Pathmasiri Wimal, Karagas Margaret R., Zhou Jie, Dade Erika, Madan Juliette C.
Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College and Department of Epidemiology, Geisel School of Medicine, Dartmouth College; Nutrition Research Institute, University of North Carolina; Department of Mathematics and Statistics, University of Southern Maine; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College and Department of Epidemiology, Geisel School of Medicine, Dartmouth College; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College; Nutrition Research Institute, University of North Carolina; Department of Epidemiology, Geisel School of Medicine, Dartmouth College; Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College; Department of Epidemiology, Geisel School of Medicine, Dartmouth College; Department of Epidemiology, Geisel School of Medicine, Dartmouth College
Biological science research methods and techniques; Biochemistry; Biophysics
Hoen Anne G., McRitchie Susan, Viles Weston D., Nguyen Quang P., Gui Jiang, Pathmasiri Wimal, Karagas Margaret R., Zhou Jie, Dade Erika, Madan Juliette C. Information Enhanced Model Selection for Gaussian Graphical Model with Application to Metabolomic Data [EB/OL]. (2025-03-28) [2025-05-17]. https://www.biorxiv.org/content/10.1101/815423.