PSAURON: a tool for assessing protein annotation across a broad range of species

Source: bioRxiv
Abstract

Evaluating the accuracy of protein-coding sequences in genome annotations is a challenging problem for which there is no broadly applicable solution. In this manuscript we introduce PSAURON (Protein Sequence Assessment Using a Reference ORF Network), a novel software tool developed to assess the quality of protein-coding gene annotations. Utilizing a machine learning model trained on a diverse dataset from over 1000 plant and animal genomes, PSAURON assigns a score to a coding DNA or protein sequence that reflects the likelihood that the sequence is a genuine protein-coding region. PSAURON scores can be used for genome-wide protein annotation assessment as well as for the rapid identification of potentially spurious annotated proteins. Validation against established benchmarks demonstrates PSAURON's effectiveness and its correlation with recognized measures of protein quality, highlighting its potential use as a general-purpose method to evaluate gene annotation. PSAURON is open source and freely available at https://github.com/salzberg-lab/PSAURON.

One-Sentence Summary: PSAURON is a machine learning-based tool for rapid assessment of protein-coding gene annotation.

Steven L. Salzberg, Markus J. Sommer, Aleksey V. Zimin

Department of Biomedical Engineering, Johns Hopkins University; Center for Computational Biology, Johns Hopkins University; Department of Computer Science, Johns Hopkins University; Department of Biostatistics, Johns Hopkins University

DOI: 10.1101/2024.05.15.594385

Subjects: biological science research methods and techniques; bioengineering; molecular biology

Salzberg Steven L., Sommer Markus J., Zimin Aleksey V. PSAURON: a tool for assessing protein annotation across a broad range of species[EB/OL]. (2025-03-28)[2025-06-05]. https://www.biorxiv.org/content/10.1101/2024.05.15.594385.
