The Practical Haplotype Graph, a platform for storing and using pangenomes for imputation
Abstract

Motivation: Pangenomes provide novel insights for population and quantitative genetics, genomics, and breeding that are not available from studying a single reference genome. Instead, a species is better represented by a pangenome, or collection of genomes. Unfortunately, managing and using pangenomes for genomically diverse species is computationally and practically challenging. We developed a trellis graph representation anchored to the reference genome that represents most pangenomes well and can be used to impute complete genomes from low-density sequence or variant data.

Results: The Practical Haplotype Graph (PHG) is a pangenome pipeline, database (PostgreSQL and SQLite), data model (Java, Kotlin, or R), and Breeding API (BrAPI) web service. The PHG has already been able to accurately represent diversity in four major crops, including maize, one of the most genomically diverse species, with up to 1000-fold data compression. Using simulated data, we show that, even at 0.1X coverage, with appropriate reads and sequence alignment, imputation results in extremely accurate haplotype reconstruction. The PHG is a platform and environment for the understanding and application of genomic diversity.

Availability: All resources listed here are freely available. The PHG Docker image used to generate the simulation results is available at https://hub.docker.com/ as maizegenetics/phg:0.0.27. PHG source code is at https://bitbucket.org/bucklerlab/practicalhaplotypegraph/src/master/. The code used for the analysis of simulated data is at https://bitbucket.org/bucklerlab/phg-manuscript/src/master/. The PHG database of NAM parent haplotypes is in the CyVerse data store (https://de.cyverse.org/de/) and named /iplant/home/shared/panzea/panGenome/PHG_db_maize/phg_v5Assemblies_20200608.db.

Contact: pjb39@cornell.edu
Casstevens T, Monier B, Song B, Bradbury PJ, Buckler ES, Miller ZR, Jensen SE, Romay MC, Johnson LC
Institute for Genomic Diversity, Cornell University; United States Department of Agriculture-Agricultural Research Service, Robert W. Holley Center; Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University
Genetics; Biological science research methods and techniques; Crops
Casstevens T, Monier B, Song B, Bradbury PJ, Buckler ES, Miller ZR, Jensen SE, Romay MC, Johnson LC. The Practical Haplotype Graph, a platform for storing and using pangenomes for imputation [EB/OL]. (2025-03-28) [2025-05-07]. https://www.biorxiv.org/content/10.1101/2021.08.27.457652.