The long and short of it: Benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies
Viral metagenomics has fuelled a rapid change in our understanding of global viral diversity and ecology. Long-read sequencing and hybrid approaches that combine long- and short-read technologies are now widely implemented in bacterial genomics and metagenomics. However, the use of long-read sequencing to investigate viral communities is still in its infancy. While Nanopore and PacBio technologies have been applied to viral metagenomics, it is not known to what extent the choice of technology affects the reconstruction of the viral community. We therefore constructed a mock phage community of previously sequenced phage genomes, sequenced it using Illumina, Nanopore, and PacBio technologies, and tested a number of different assembly approaches. When using a single sequencing technology, Illumina assemblies were the best at recovering phage genomes. Nanopore- and PacBio-only assemblies performed poorly in comparison to Illumina in both genome recovery and error rates, both of which varied with the assembler used. The best Nanopore assembly had SNPs and INDELs at frequencies ~4x and 120x higher, respectively, than those found in Illumina-only assemblies, while the best PacBio assemblies had SNPs and INDELs at frequencies ~3.5x and 12x higher, respectively. Despite high read coverage, long-read-only assemblies failed to recover a complete genome for any of the 15 phages, although downsampling of reads did increase the proportion of a genome that could be assembled into a single contig. Overall, the best approach was assembly with a combination of Illumina and Nanopore reads, which reduced error rates to levels comparable with short-read-only assemblies. The differences in genome recovery and error rates between technologies and assemblers had downstream impacts on gene prediction, viral prediction, and subsequent estimates of diversity within a sample. These findings will provide a starting point for others in the choice of reads and assembly algorithms for the analysis of viromes.
Michniewski Slavomir, Rihtman Branko, Redgwell Tamsin, Stekel Dov J, Hobman Jon, Scanlan Dave, Smith Darren, Brown Nathan, Chen Yin, Cook Ryan, Millard Andrew D, Nelson Andrew, Jones Michael A
Biological science research methods and techniques; microbiology; molecular biology
Michniewski Slavomir, Rihtman Branko, Redgwell Tamsin, Stekel Dov J, Hobman Jon, Scanlan Dave, Smith Darren, Brown Nathan, Chen Yin, Cook Ryan, Millard Andrew D, Nelson Andrew, Jones Michael A. The long and short of it: Benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies [EB/OL]. (2025-03-28) [2025-08-07]. https://www.biorxiv.org/content/10.1101/2023.02.12.527533.