ODNA: Identification of Organellar DNA by Machine Learning

Source: bioRxiv
Abstract

Motivation: Identifying organellar DNA, such as mitochondrial or plastid sequences, within a whole-genome assembly remains challenging and requires biological background knowledge. To address this, we developed ODNA, which combines genome annotation and machine learning. Results: ODNA is a software tool that classifies the sequences of a genome assembly as organellar DNA using machine learning on top of a pre-defined genome annotation workflow. We trained our model on 829,769 DNA sequences from 405 genome assemblies and achieved very high predictive performance (e.g., an MCC of 0.61) on independent validation data, significantly outperforming existing approaches. Availability: ODNA is freely accessible as a web service at https://odna.mathematik.uni-marburg.de and can also be run in a Docker container. The source code is available at https://gitlab.com/mosga/odna and the processed data at Zenodo.
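The MCC quoted in the abstract is the Matthews correlation coefficient, a score between -1 and 1 computed from all four cells of a binary confusion matrix, which makes it informative when one class is much rarer than the other. The sketch below only illustrates the standard formula; it is not taken from the ODNA source code, the function name `matthews_corrcoef` is chosen here for illustration, and the confusion-matrix counts are hypothetical values, not results from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/fn: organellar sequences predicted as organellar / as nuclear.
    tn/fp: nuclear sequences predicted as nuclear / as organellar.
    Returns 0.0 when any marginal sum is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not data from the paper).
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)       # every call correct
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)      # no better than random
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)      # every call wrong
```

With these inputs the perfect classifier scores 1.0, the random one 0.0, and the fully inverted one -1.0, so a value of 0.61 sits well above chance but below perfect agreement.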

Nguyen Minh Kien, Heider Dominik, Lowack Nick, Martin Roman

10.1101/2023.01.10.523051

Biological science research methods and techniques; computing and computer technology; molecular biology

Nguyen Minh Kien, Heider Dominik, Lowack Nick, Martin Roman. ODNA: Identification of Organellar DNA by Machine Learning[EB/OL]. (2025-03-28)[2025-05-17]. https://www.biorxiv.org/content/10.1101/2023.01.10.523051.
