BlastFrost: Fast querying of 100,000s of bacterial genomes in Bifrost graphs
Abstract BlastFrost is a highly efficient method for querying 100,000s of genome assemblies. It builds on Bifrost, a recently developed dynamic data structure for compacted and colored de Bruijn graphs from bacterial genomes. BlastFrost queries a Bifrost data structure for sequences of interest and extracts local subgraphs, thereby enabling the efficient identification of the presence or absence of individual genes or single nucleotide sequence variants. Here we describe the algorithms and implementation of BlastFrost. We also present two exemplar practical applications. In the first, we determined the presence of the individual genes of the SPI-2 Salmonella pathogenicity island in a collection of 926 representative genomes within minutes. In the second, we determined the existence of known single nucleotide polymorphisms associated with fluoroquinolone resistance in the genes gyrA, gyrB and parE among 190,209 Salmonella genomes. BlastFrost is available for download at https://github.com/nluhmann/BlastFrost.
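To make the idea of a presence/absence query concrete, the following is a minimal sketch and not BlastFrost's actual algorithm or the Bifrost API. It assumes a toy "colored" k-mer index held in a plain Python dictionary (each k-mer maps to the set of genomes containing it) instead of a compacted de Bruijn graph, considers the forward strand only, and calls a query present in a genome when a chosen fraction of its k-mers is found there. All function names, the example sequences and the `min_fraction` parameter are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, for simplicity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_index(genomes, k):
    """Map each k-mer to the set of genome names ("colors") containing it."""
    index = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def query_presence(index, genome_names, query, k, min_fraction=1.0):
    """Report, per genome, whether enough of the query's k-mers are present.

    min_fraction is an illustrative threshold, not a BlastFrost parameter.
    """
    query_kmers = list(kmers(query, k))
    if not query_kmers:
        # Query shorter than k: nothing can be matched.
        return {name: False for name in genome_names}
    hits = {name: 0 for name in genome_names}
    for kmer in query_kmers:
        # .get avoids inserting empty entries into the defaultdict.
        for name in index.get(kmer, ()):
            hits[name] += 1
    return {
        name: hits[name] / len(query_kmers) >= min_fraction
        for name in genome_names
    }


# Two toy "genomes" that differ by a single nucleotide.
genomes = {"g1": "ACGTACGTGA", "g2": "ACGTTCGTGA"}
index = build_colored_index(genomes, k=4)

exact = query_presence(index, genomes, "TACGTG", k=4)
relaxed = query_presence(index, genomes, "TACGTG", k=4, min_fraction=0.6)
```

With the strict threshold only `g1` contains all three query k-mers; relaxing the threshold to 0.6 also reports `g2`, which shares two of the three. A real colored de Bruijn graph stores the same k-mer-to-genome relation far more compactly, which is what makes queries over 100,000s of assemblies feasible.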
Luhmann Nina, Achtman Mark, Holley Guillaume
Warwick Medical School, University of Warwick; Warwick Medical School, University of Warwick; Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland
Research methods and techniques in biological sciences; Microbiology; Molecular biology
Luhmann Nina, Achtman Mark, Holley Guillaume. BlastFrost: Fast querying of 100,000s of bacterial genomes in Bifrost graphs [EB/OL]. (2025-03-28) [2025-05-31]. https://www.biorxiv.org/content/10.1101/2020.01.21.914168.