Multiclass Disease Classification from Microbial Whole-Community Metagenomes using Graph Convolutional Neural Networks
The microbiome carries a wealth of information about its host's physiology and environment, making it a promising basis for non-invasive diagnostic tools. Here, we use 5643 aggregated, annotated whole-community metagenomes from 19 different diseases to implement the first multiclass microbiome disease classifier of this scale. We compare three machine learning models: random forests, deep neural nets, and a novel graph convolutional architecture that exploits the graph structure of phylogenetic trees as its input. We show that the graph convolutional model outperforms deep neural nets in terms of accuracy (75% average test-set accuracy), receiver operating characteristic (92.1% average AUC), and precision-recall (50% average AUPR). Additionally, the graph convolutional net's performance complements that of the random forest, achieving similar accuracy with a higher AUC but a lower AUPR. Lastly, we achieve over 90% average top-3 accuracy across all of our models. Together, these results indicate that there are predictive, disease-specific signatures across microbiomes that could potentially be used for diagnostic purposes.
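For readers unfamiliar with the top-3 accuracy figure quoted above, the following is a minimal sketch of how a top-k accuracy can be computed from per-class scores. It is not the authors' evaluation code: the function name `top_k_accuracy`, the tie-breaking rule, and the toy scores and labels are assumptions made purely for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 3,
) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes.

    `scores[i][c]` is the model's score for class `c` on sample `i`;
    `labels[i]` is the index of the true class. Hypothetical helper, for illustration only.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, true_class in zip(scores, labels):
        # Rank class indices by descending score; ties go to the lower index
        # because Python's sort is stable (an assumed tie-breaking rule).
        ranked = sorted(range(len(row)), key=lambda c: -row[c])
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 samples and 4 classes (made-up numbers).
example_scores = [
    [0.10, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.05, 0.10, 0.80],
]
example_labels = [1, 3, 2]

print(top_k_accuracy(example_scores, example_labels, k=1))
print(top_k_accuracy(example_scores, example_labels, k=3))
```

With k=1 this reduces to ordinary accuracy; a larger k credits the model whenever the true disease appears in its short list of candidates, which is why top-3 accuracy can be much higher than top-1 accuracy in a 19-class problem.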
Khan Saad, Kelly Libusha
Department of Systems & Computational Biology; Department of Systems & Computational Biology || Department of Microbiology & Immunology, Albert Einstein College of Medicine
Medical research methods; Biological science research methods and techniques; Microbiology
Microbiome; Machine learning; Metagenomics
Khan Saad, Kelly Libusha. Multiclass Disease Classification from Microbial Whole-Community Metagenomes using Graph Convolutional Neural Networks [EB/OL]. (2025-03-28) [2025-05-17]. https://www.biorxiv.org/content/10.1101/726901.