Accurate chromosome-scale haplotype-resolved assembly of human genomes
Haplotype-resolved, or phased, sequence assembly provides a complete picture of genomes and their complex genetic variation. However, current phased assembly algorithms either fail to generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method that leverages long accurate reads and long-range conformation data from single individuals to generate chromosome-scale phased assemblies within a day. Applied to three public human genomes, PGP1, HG002 and NA12878, our method produced haplotype-resolved assemblies with contig NG50 up to 25 Mb and phased ~99.5% of heterozygous sites with 98–99% accuracy, outperforming other approaches in both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for discovering structural variants (SVs), including thousands of new transposon insertions, and for analyzing highly polymorphic and medically important regions such as HLA and KIR. Our improved method will enable high-quality precision medicine and facilitate new studies of individual haplotype variation and population diversity.
Maguire Jared, Peluso Paul, Chou Mike, Ghurye Jay, Cheng Haoyu, Heller David, Zook Justin M., Marschall Tobias, Sedlazeck Fritz J., Aach John, Fungtammasan Arkarachai, Mahmoud Medhat, Hatas Emily, Garg Shilpa, Carroll Andrew, Moemke Tobias, Li Heng, Chin Chen-Shan, Church George M., Mac Stephen, Schmitt Anthony, Zhou Xiang
Dovetail Genomics; Pacific Biosciences; Department of Genetics, Harvard Medical School; Dovetail Genomics; Department of Data Sciences, Dana-Farber Cancer Institute / Department of Biomedical Informatics, Harvard Medical School; Max Planck Institute for Molecular Genetics; Material Measurement Laboratory, National Institute of Standards and Technology; Max Planck Institute for Informatics / Saarland University; Human Genome Sequencing Center, Baylor College of Medicine; Department of Genetics, Harvard Medical School; DNAnexus; Human Genome Sequencing Center, Baylor College of Medicine; Pacific Biosciences; Department of Genetics, Harvard Medical School / Department of Data Sciences, Dana-Farber Cancer Institute / Department of Biomedical Informatics, Harvard Medical School; Google; Saarland University; Department of Data Sciences, Dana-Farber Cancer Institute / Department of Biomedical Informatics, Harvard Medical School; DNAnexus; Department of Genetics, Harvard Medical School; Arima Genomics; Arima Genomics; Arima Genomics
Basic medicine; Genetics; Biological science research methods and techniques
Maguire Jared, Peluso Paul, Chou Mike, Ghurye Jay, Cheng Haoyu, Heller David, Zook Justin M., Marschall Tobias, Sedlazeck Fritz J., Aach John, Fungtammasan Arkarachai, Mahmoud Medhat, Hatas Emily, Garg Shilpa, Carroll Andrew, Moemke Tobias, Li Heng, Chin Chen-Shan, Church George M., Mac Stephen, Schmitt Anthony, Zhou Xiang. Accurate chromosome-scale haplotype-resolved assembly of human genomes [EB/OL]. (2025-03-28) [2025-08-02]. https://www.biorxiv.org/content/10.1101/810341.