
Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts


Source: bioRxiv
Abstract

With very large sample sizes, population-based cohorts and biobanks provide an exciting opportunity to identify genetic components of complex traits. To analyze rare variants, gene- or region-based multiple-variant aggregate tests are commonly used to increase association test power. However, because of their substantial computational cost, existing region-based rare variant tests cannot analyze hundreds of thousands of samples while accounting for confounders such as population stratification and sample relatedness. Here we propose a scalable generalized mixed model region-based association test that can handle large sample sizes and account for unbalanced case-control ratios for binary traits. This method, SAIGE-GENE, uses state-of-the-art optimization strategies to reduce computational and memory cost, and hence is applicable to exome-wide and genome-wide region-based analysis of hundreds of thousands of samples. Through the analysis of the HUNT study of 69,716 Norwegian samples and the UK Biobank data of 408,910 White British samples, we show that SAIGE-GENE can efficiently analyze large sample data (N > 400,000) with well-controlled type I error rates.

Lee Seunggeun, Zhou Wei, Fritsche Lars G., LeFaive Jonathon, Bi Wenjian, Gabrielsen Maiken E., Abecasis Goncalo R., Willer Cristen J., Nielsen Jonas B., Gagliano Taliun Sarah A., Hveem Kristian, Zhao Zhangchen, Daly Mark J., Neale Benjamin M.

Affiliations (listed in author order):
Lee Seunggeun: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Zhou Wei: Center for Statistical Genetics, University of Michigan School of Public Health; Analytic and Translational Genetics Unit, Massachusetts General Hospital; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT
Fritsche Lars G.: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
LeFaive Jonathon: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Bi Wenjian: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Gabrielsen Maiken E.: K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology
Abecasis Goncalo R.: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Willer Cristen J.: Department of Internal Medicine, Division of Cardiology, University of Michigan Medical School; Department of Computational Medicine and Bioinformatics, University of Michigan; Department of Human Genetics, University of Michigan Medical School
Nielsen Jonas B.: Department of Internal Medicine, Division of Cardiology, University of Michigan Medical School
Gagliano Taliun Sarah A.: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Hveem Kristian: K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and Nursing, NTNU, Norwegian University of Science and Technology; HUNT Research Centre, Department of Public Health and Nursing, Norwegian University of Science and Technology
Zhao Zhangchen: Center for Statistical Genetics, University of Michigan School of Public Health; Department of Biostatistics, University of Michigan School of Public Health
Daly Mark J.: Analytic and Translational Genetics Unit, Massachusetts General Hospital; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT
Neale Benjamin M.: Analytic and Translational Genetics Unit, Massachusetts General Hospital; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT

DOI: 10.1101/583278

Subjects: biological science research methods and techniques; basic medicine; genetics

Lee Seunggeun, Zhou Wei, Fritsche Lars G., LeFaive Jonathon, Bi Wenjian, Gabrielsen Maiken E., Abecasis Goncalo R., Willer Cristen J., Nielsen Jonas B., Gagliano Taliun Sarah A., Hveem Kristian, Zhao Zhangchen, Daly Mark J., Neale Benjamin M. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts[EB/OL]. (2025-03-28)[2025-06-04]. https://www.biorxiv.org/content/10.1101/583278.
