HiG2Vec: Hierarchical Representations of Gene Ontology and Genes in the Poincaré Ball

Source: bioRxiv
Abstract

Knowledge in the Gene Ontology (GO) and Gene Ontology Annotation (GOA) can be put to use in applications such as deep learning primarily through vector representations of GO terms and genes. Previous studies have represented GO terms and genes or gene products with Word2Vec-based methods, which embed entities as numeric vectors in Euclidean space, in order to measure their semantic similarity. However, embedding large graph-structured data in Euclidean space loses information about latent hierarchies, so these methods cannot capture the semantics of GO and GOA optimally. In this paper, we propose hierarchical representations of GO and genes (HiG2Vec), which apply Poincaré embedding, a method specialized for representing hierarchies, in a two-step procedure: GO embedding followed by gene embedding. Through experiments, we show that our model represents the hierarchical structure better than other approaches and predicts interactions between genes or gene products as well as or better than previous studies. The results indicate that HiG2Vec is superior to other methods both in capturing GO and gene semantics and in data utilization, and that it can be robustly applied to various kinds of biological knowledge.

Availability: https://github.com/JaesikKim/HiG2Vec
Contact: kasohn@ajou.ac.kr, Dokyoon.Kim@pennmedicine.upenn.edu
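To illustrate why the Poincaré ball suits hierarchies, the following minimal sketch computes the standard geodesic distance on the unit Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is not taken from the HiG2Vec repository; the function name `poincare_distance` and the sample points `root`, `leaf_a`, and `leaf_b` are hypothetical and chosen only for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Hypothetical points: a general term near the origin and two specific
# terms near the boundary, in different directions.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

d_root_leaf = poincare_distance(root, leaf_a)
d_leaf_leaf = poincare_distance(leaf_a, leaf_b)
```

In this sketch the two points near the boundary are much farther from each other than either is from the origin, even though their Euclidean separation is modest. Distances grow rapidly toward the boundary, which leaves room for the many leaves of a tree-like structure.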

Sohn Kyung-Ah, Kim Dokyoon, Kim Jaesik

Department of Computer Engineering, Ajou University; Department of Artificial Intelligence, Ajou University
Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania
Department of Computer Engineering, Ajou University; Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania; Institute for Biomedical Informatics, University of Pennsylvania

DOI: 10.1101/2020.07.14.195750

Genetics; Molecular Biology; Biological Science Research Methods and Techniques

Sohn Kyung-Ah, Kim Dokyoon, Kim Jaesik. HiG2Vec: Hierarchical Representations of Gene Ontology and Genes in the Poincaré Ball[EB/OL]. (2025-03-28)[2025-05-04]. https://www.biorxiv.org/content/10.1101/2020.07.14.195750.