
Robust, flexible, and scalable tests for Hardy-Weinberg Equilibrium across diverse ancestries


Source: bioRxiv
English abstract

Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ² test and the exact test) have long been used as a metric for evaluating genotype quality, because technical artifacts that lead to incorrect genotype calls can often be identified as deviations from HWE. However, in datasets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
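The effect of population structure on a standard HWE test can be illustrated with a small sketch. This is not RUTH's method; it is only the textbook 1-degree-of-freedom χ² test applied to hard genotype counts, and the two subpopulations and their counts are hypothetical values chosen for illustration. Each subpopulation is exactly in HWE on its own, yet pooling them produces a strong heterozygote deficit and an extreme p-value.

```python
import math


def hwe_chisq(n_aa, n_ab, n_bb):
    """Textbook 1-df chi-square HWE test on genotype counts.

    Returns (statistic, p_value). Monomorphic sites are rejected because
    their expected counts are zero and the statistic is undefined.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic site: HWE chi-square is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical subpopulations, each exactly in HWE (allele A at 0.9 and 0.1).
pop1 = (810, 180, 10)
pop2 = (10, 180, 810)
pooled = tuple(a + b for a, b in zip(pop1, pop2))

stat1, p1 = hwe_chisq(*pop1)
stat_pooled, p_pooled = hwe_chisq(*pooled)
print(f"pop1:   chi2={stat1:.3g}, p={p1:.3g}")
print(f"pooled: chi2={stat_pooled:.3g}, p={p_pooled:.3g}")
```

In the pooled sample the allele frequency is 0.5, so HWE predicts 1000 heterozygotes out of 2000 individuals, while only 360 are observed. The test therefore rejects HWE even though no genotype is wrong, which is the kind of false positive the abstract attributes to ignoring ancestry.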

Montgomery Courtney G., Musani Solomon, Peloso Gina M., Qiao Dandi, Smith Nicholas L., Kooperberg Charles, Manichaikul Ani W., Mathias Rasika A., Montasser May E., de Andrade Mariza, Barnes Kathleen C., Scott Laura J., Smith Albert V., Boehnke Michael, Kang Hyun Min, Blangero John, Burchard Esteban G., Cupples L. Adrienne, Eng Celeste, Guo Xiuqing, Barnard John, Kwong Alan M., Shoemaker M. Benjamin, Conomos Matthew P., Smith Jennifer A., Tiwari Hemant K., Kelly Tanika N., Cade Brian E., Palmer Nicholette D., Roden Dan M., Lubitz Steven A., Mak Angel C. Y., Kim Wonji, Abecasis Gonçalo R., Lasky-Su Jessica, LeFaive Jonathon, Chen Han, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Irvin Marguerite Ryan, Ellinor Patrick T., Gao Yan, Reiner Alexander P., TOPMed Analysis Working Group, Weiss Scott T., Boerwinkle Eric, Chasman Daniel I., Weeks Daniel E., Blackwell Thomas W.

Sarcoidosis Research Unit, Genes and Human Disease Research Program, and Quantitative Analysis Core, Oklahoma Medical Research Foundation
Jackson Heart Study, University of Mississippi Medical Center
Department of Biostatistics, Boston University School of Public Health
Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School
Department of Epidemiology, University of Washington; Kaiser Permanente Washington Health Research Institute, Kaiser Permanente Washington; Seattle Epidemiologic Research and Information Center, Office of Research and Development, Department of Veterans Affairs
Fred Hutchinson Cancer Research Center
Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia
GeneSTAR Research Program and Division of Allergy and Clinical Immunology, Department of Medicine, Johns Hopkins University
Division of Endocrinology, Diabetes and Nutrition, Department of Medicine, University of Maryland School of Medicine
Mayo Clinic
Department of Medicine, Anschutz Medical Campus, University of Colorado
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Department of Human Genetics and South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley School of Medicine
Department of Bioengineering and Therapeutic Sciences, University of California San Francisco; Department of Medicine, University of California San Francisco
Department of Biostatistics, Boston University School of Public Health; Framingham Heart Study
Department of Medicine, University of California San Francisco
The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute at Harbor-UCLA Medical Center
Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Department of Medicine, Vanderbilt University Medical Center
Department of Biostatistics, University of Washington
Department of Epidemiology, School of Public Health, University of Michigan
Department of Biostatistics, School of Public Health, University of Alabama at Birmingham
Department of Epidemiology, Tulane University
Division of Sleep and Circadian Disorders, Brigham and Women's Hospital; Division of Sleep Medicine, Harvard Medical School
Department of Biochemistry, Wake Forest School of Medicine
Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center
Cardiovascular Research Center, Massachusetts General Hospital; Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard
Department of Medicine, University of California San Francisco
Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School
Department of Biostatistics and Center for Statistical Genetics, University of Michigan
Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston; Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston
Department of Epidemiology, School of Public Health, University of Alabama at Birmingham
Cardiovascular Research Center, Massachusetts General Hospital; Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard
Department of Physiology and Biophysics, University of Mississippi Medical Center
Fred Hutchinson Cancer Research Center
Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School
Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston; Human Genome Sequencing Center, Baylor College of Medicine
Division of Preventive Medicine, Brigham and Women's Hospital
Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh
Department of Biostatistics and Center for Statistical Genetics, University of Michigan

DOI: 10.1101/2020.06.23.167759

Genetics; Biological science research methods and techniques; Basic medicine

population structure; principal components analysis; next-generation sequencing; genotype likelihoods

Montgomery Courtney G., Musani Solomon, Peloso Gina M., Qiao Dandi, Smith Nicholas L., Kooperberg Charles, Manichaikul Ani W., Mathias Rasika A., Montasser May E., de Andrade Mariza, Barnes Kathleen C., Scott Laura J., Smith Albert V., Boehnke Michael, Kang Hyun Min, Blangero John, Burchard Esteban G., Cupples L. Adrienne, Eng Celeste, Guo Xiuqing, Barnard John, Kwong Alan M., Shoemaker M. Benjamin, Conomos Matthew P., Smith Jennifer A., Tiwari Hemant K., Kelly Tanika N., Cade Brian E., Palmer Nicholette D., Roden Dan M., Lubitz Steven A., Mak Angel C. Y., Kim Wonji, Abecasis Gonçalo R., Lasky-Su Jessica, LeFaive Jonathon, Chen Han, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Irvin Marguerite Ryan, Ellinor Patrick T., Gao Yan, Reiner Alexander P., TOPMed Analysis Working Group, Weiss Scott T., Boerwinkle Eric, Chasman Daniel I., Weeks Daniel E., Blackwell Thomas W. Robust, flexible, and scalable tests for Hardy-Weinberg Equilibrium across diverse ancestries[EB/OL]. (2025-03-28)[2025-06-13]. https://www.biorxiv.org/content/10.1101/2020.06.23.167759.
