The Naive Bayes Classifier++ for Metagenomic Taxonomic Classification -- Query Evaluation
This study examines the query performance of the NBC++ (Incremental Naive Bayes Classifier) program across variations in canonicality, $k$-mer size, databases, and input sample data size. NBC++ can successfully assess a wide range of superkingdoms using a small training database. We demonstrate that both NBC++ and Kraken2 are affected by database depth, with macro measures increasing with depth, but that the full diversity of life, especially viruses, remains a challenge for these classifiers. NBC++ spends less time training, but at the cost of long querying time. The major enhancements are support for canonical $k$-mer storage (with major storage savings); adaptable and optimized memory allocation, which speeds up query analysis and allows the classifier to be run on almost any system; and output of the log-likelihood values against each training genome, which provides users with valuable confidence information.
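The two ideas the abstract leans on, canonical $k$-mer counting and per-genome log-likelihood scores, can be illustrated with a minimal Python sketch. This is not the NBC++ implementation: the function names (`canonical`, `count_kmers`, `log_likelihood`), the add-one (Laplace) smoothing, and the toy sequences are all assumptions made for illustration only.

```python
from collections import Counter
from math import log

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(seq: str, k: int, use_canonical: bool = True) -> Counter:
    """Count k-mers in seq, skipping any window with a non-ACGT character."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer) if use_canonical else kmer] += 1
    return counts


def log_likelihood(read_counts: Counter, genome_counts: Counter, vocab_size: int) -> float:
    """Naive Bayes log-likelihood of a read given one training genome.

    Uses add-one smoothing (an assumption of this sketch) so that k-mers
    absent from the genome do not produce log(0).
    """
    total = sum(genome_counts.values())
    return sum(
        n * log((genome_counts[kmer] + 1) / (total + vocab_size))
        for kmer, n in read_counts.items()
    )


# Toy usage: score one read against two hypothetical training genomes.
k = 3
vocab_size = 4 ** k // 2  # number of canonical k-mers when k is odd
genomes = {"genome_a": "AAAAAAAA", "genome_b": "CCCCCCCC"}
trained = {name: count_kmers(seq, k) for name, seq in genomes.items()}
read = count_kmers("AAAA", k)
scores = {name: log_likelihood(read, counts, vocab_size) for name, counts in trained.items()}
best = max(scores, key=scores.get)
```

Storing only the canonical form merges each $k$-mer with its reverse complement, which roughly halves the number of distinct keys. Keeping the full `scores` dictionary, rather than only `best`, mirrors the idea of reporting a log-likelihood against every training genome so that the margin between candidates can be inspected.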
Rosen Gail L, Duan Haozhe Neil, Polikar Robi, Hearne Gavin
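Biological science research methods; biological science research techniques; computing technology; computer technology; microbiology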
Rosen Gail L, Duan Haozhe Neil, Polikar Robi, Hearne Gavin. The Naive Bayes Classifier++ for Metagenomic Taxonomic Classification -- Query Evaluation [EB/OL]. (2025-03-28) [2025-04-30]. https://www.biorxiv.org/content/10.1101/2024.06.25.600711.