metaDMG - A Fast and Accurate Ancient DNA Damage Toolkit for Metagenomic Data
Motivation: Under favourable conditions, DNA molecules can persist for hundreds of thousands of years. Such genetic remains are invaluable resources for studying past assemblages, populations, and even the evolution of species. However, DNA degrades over time and eventually decreases to ultra-low concentrations, which makes it highly prone to contamination by modern sources. Strict precautions are therefore necessary to ensure that DNA from modern sources does not appear in the final data and that the data are authenticated as ancient. The most generally accepted and widely applied authentication criterion in ancient DNA studies is to test for an elevated frequency of deaminated cytosine residues towards the termini of the molecules, known as DNA damage. To date, this test has primarily been applied to single organisms and, more recently, to read assemblies; however, these methods are not applicable for estimating DNA damage in ancient metagenomes with tens or even hundreds of thousands of species.

Methods: We present metaDMG, a novel framework and toolkit that allows for the estimation, quantification and visualization of postmortem damage for single reads, single genomes and even metagenomic environmental DNA by utilizing the taxonomic branching structure. It bypasses any need for initial classification, splitting of reads by individual organism, and realignment. We have implemented a Bayesian approach that combines a modified geometric damage profile with a beta-binomial model and fits the full model to the individual misincorporations at all taxonomic levels.

Results: We evaluated the performance using both simulated and published environmental DNA datasets and compared it to existing methods where relevant. We find metaDMG to be an order of magnitude faster than previous methods and more accurate, even for complex metagenomes. Our simulations show that metaDMG can estimate DNA damage at taxonomic levels with as few as 100 reads, that the estimated uncertainties decrease as the number of reads increases, and that the estimates become more significant as the number of C-to-T misincorporations increases.

Conclusion: metaDMG is a state-of-the-art program for aDNA damage estimation and allows for the computation of nucleotide misincorporation, GC-content, and DNA fragmentation for both simple and complex ancient genomic datasets, making it a complete package for ancient DNA damage authentication.
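The abstract names the two ingredients of the damage model (a modified geometric damage profile and a beta-binomial model) without giving their form. The following is a minimal sketch of how such a combination could look, not metaDMG's actual implementation. It assumes a profile of the form f(x) = A(1 - q)^(x - 1) + c, where x is the distance from the read end, and a beta-binomial likelihood parameterized by its mean and a concentration phi. All function names, parameter names, and the toy counts are hypothetical and chosen for illustration only.

```python
import math


def damage_profile(x, A, q, c):
    """Assumed expected C-to-T frequency at position x (1 = first base from the read end).

    A: damage amplitude at the first position, q: per-position decay rate,
    c: constant background misincorporation rate. Illustrative parameterization.
    """
    return A * (1.0 - q) ** (x - 1) + c


def log_beta(a, b):
    """Logarithm of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, mu, phi):
    """Log-probability of k successes in n trials under a beta-binomial
    with mean mu (0 < mu < 1) and concentration phi (> 0)."""
    alpha = mu * phi
    beta = (1.0 - mu) * phi
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def log_likelihood(k_counts, n_counts, A, q, c, phi):
    """Sum of per-position log-probabilities.

    k_counts[i]: observed C-to-T substitutions at position i + 1,
    n_counts[i]: reference cytosines observed at position i + 1.
    """
    total = 0.0
    for i, (k, n) in enumerate(zip(k_counts, n_counts)):
        mu = damage_profile(i + 1, A, q, c)
        total += beta_binomial_logpmf(k, n, mu, phi)
    return total


# Toy counts (made up for illustration): substitutions decay away from the read end.
k_obs = [30, 16, 9, 6, 4]
n_obs = [100, 100, 100, 100, 100]

ll_damaged = log_likelihood(k_obs, n_obs, A=0.3, q=0.5, c=0.01, phi=100.0)
ll_flat = log_likelihood(k_obs, n_obs, A=0.0, q=0.5, c=0.13, phi=100.0)
print(f"log-likelihood with damage: {ll_damaged:.2f}")
print(f"log-likelihood without damage: {ll_flat:.2f}")
```

In a Bayesian fit of this kind, priors would be placed on the parameters and the posterior explored or maximized; the sketch stops at the likelihood and simply compares a decaying profile against a flat one on the same counts.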
Michelsen Christian, Petersen Troels Christian, Korneliussen Thorfinn Sand, Fernandez-Guerra Antonio, Zhao Lei, Pedersen Mikkel Winther
Biological science research methods and techniques; Paleontology; Genetics
Michelsen Christian, Petersen Troels Christian, Korneliussen Thorfinn Sand, Fernandez-Guerra Antonio, Zhao Lei, Pedersen Mikkel Winther. metaDMG - A Fast and Accurate Ancient DNA Damage Toolkit for Metagenomic Data [EB/OL]. (2025-03-28) [2025-07-23]. https://www.biorxiv.org/content/10.1101/2022.12.06.519264.