CrowdGO: machine learning and semantic similarity guided consensus Gene Ontology annotation
Abstract: Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. We present CrowdGO, a consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures, CrowdGO performance matches that of the community’s best performing methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations. CrowdGO is implemented in Python3, and is freely available from https://gitlab.com/mreijnders/CrowdGO, along with a Snakemake workflow and pre-trained models. Contact: maarten.reijnders@unil.ch (MJMFR), robert.waterhouse@unil.ch (RMW).
Waterhouse Robert M., Reijnders Maarten J.M.F.
Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics
Biological science research methods and techniques; computing and computer technology; molecular biology
Waterhouse Robert M., Reijnders Maarten J.M.F. CrowdGO: machine learning and semantic similarity guided consensus Gene Ontology annotation [EB/OL]. (2025-03-28) [2025-04-27]. https://www.biorxiv.org/content/10.1101/731596.