
GoldRush: A de novo long read genome assembler with linear time complexity

Source: bioRxiv
Abstract

Motivation: Current state-of-the-art long read de novo genome assemblers follow the Overlap Layout Consensus (OLC) paradigm, an O(n^2) algorithm in its naïve implementation. While the most time- and memory-intensive step of OLC, the all-vs-all sequencing read alignment process, was improved and reimplemented in modern long read assemblers, these tools still often require excessive computational memory when assembling a typical 50X human genome dataset.

Results: Here we present GoldRush, a de novo genome assembly algorithm with linear time complexity in the number of input long sequencing reads. We tested GoldRush on Oxford Nanopore Technologies datasets with different base error profiles describing the genomes of three human cell lines (NA24385, HG01243 and HG02055), Oryza sativa (rice), and Solanum lycopersicum (tomato). GoldRush achieved NGA50 lengths of 18.3-22.2 Mbp for the three human datasets, with two of the three assemblies having the fewest extensive misassemblies, and NGA50 lengths of 0.3 and 2.6 Mbp for the 373 Mbp and 824 Mbp genomes of rice and tomato, respectively. Further, GoldRush assembled all genomes within a day, using at most 54.5 GB of RAM. These results demonstrate that our algorithm and new assembly paradigm can be used to assemble large genomes de novo efficiently in compute memory space, with resulting assembly contiguity comparable to that of state-of-the-art OLC genome assemblers.
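To make the quadratic-versus-linear contrast in the abstract concrete, the short sketch below counts how many read pairs a naive all-vs-all comparison would examine and sets that against a pass that touches each read once. It is only an illustration of the growth rates. It does not reproduce GoldRush's algorithm, which the abstract does not describe, and the function names are hypothetical.

```python
def all_vs_all_pairs(n_reads: int) -> int:
    """Unordered read pairs compared by a naive all-vs-all overlap step: n(n-1)/2."""
    return n_reads * (n_reads - 1) // 2


def single_pass_steps(n_reads: int) -> int:
    """Work units for a hypothetical pass that visits each read exactly once."""
    return n_reads


if __name__ == "__main__":
    # Illustrative read counts only; these are not taken from the paper.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} reads: {all_vs_all_pairs(n):>13} pairs "
              f"vs {single_pass_steps(n):>7} single-pass steps")
```

A tenfold increase in reads multiplies the pair count roughly a hundredfold, while the single-pass count grows only tenfold. This is the scaling gap that motivates a linear-time approach.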

Nikolic Vladimir, Birol Inanc, Wong Johnathan, Zhang Emily, Warren René L, Sidhu Puneet, Coombe Lauren, Nip Ka Ming

10.1101/2022.10.25.513734

Biological science research methods and techniques; computing and computer technology; genetics

Nikolic Vladimir, Birol Inanc, Wong Johnathan, Zhang Emily, Warren René L, Sidhu Puneet, Coombe Lauren, Nip Ka Ming. GoldRush: A de novo long read genome assembler with linear time complexity [EB/OL]. (2025-03-28) [2025-05-14]. https://www.biorxiv.org/content/10.1101/2022.10.25.513734.
