Expectation pooling: An effective and interpretable pooling method for predicting DNA-protein binding

Source: bioRxiv
Abstract

Motivation: Convolutional neural networks (CNNs) have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. While previous studies have built a connection between CNNs and probabilistic models, simple CNN models cannot achieve sufficient accuracy on this problem. Recently, some neural network methods have improved performance by using complex networks whose results cannot be directly interpreted. However, it remains difficult to combine probabilistic models and CNNs effectively to improve DNA-protein binding predictions.

Results: In this paper, we present a novel global pooling method, expectation pooling, for predicting DNA-protein binding. Our pooling method stems naturally from the EM algorithm, and its benefits can be interpreted both statistically and via deep learning theory. Through experiments, we demonstrate that our pooling method improves the prediction performance of DNA-protein binding models. Our interpretable pooling method combines probabilistic ideas with global pooling by taking the expectations of inputs without increasing the number of parameters. We also analyze the hyperparameters in our method and propose optional structures to help fit different datasets. We explore how to effectively utilize these novel pooling methods and show that combining statistical methods with deep learning is highly beneficial, which is promising and meaningful for future studies in this field.

Contact: dengmh@pku.edu.cn, gaog@mail.cbi.pku.edu.cn

Supplementary information: All code is publicly available at https://github.com/gao-lab/ePooling
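The abstract describes the method only as "taking the expectations of inputs" without adding parameters. One possible reading, sketched below, is a softmax-weighted average over positions of a convolutional filter's output. This is an illustrative assumption, not the authors' implementation (which is in the repository linked above): the function name `expectation_pool`, the sharpness parameter `m`, and the exponential weighting are all hypothetical choices made for the sketch.

```python
import math


def expectation_pool(activations, m=1.0):
    """Softmax-weighted average of a 1-D list of activations.

    Assumption (not taken from the paper): position i gets weight
    proportional to exp(m * x_i), where `m` is a hypothetical sharpness
    setting. m = 0 reduces to global average pooling; a large positive m
    approaches global max pooling. No trainable parameters are involved.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    peak = max(activations)
    # Shift by the peak before exponentiating for numerical stability.
    weights = [math.exp(m * (x - peak)) for x in activations]
    total = sum(weights)
    # Expectation of the activations under the normalized weights.
    return sum(w * x for w, x in zip(weights, activations)) / total


scores = [0.1, 0.2, 3.0, 0.3]  # made-up filter responses along a sequence
print(expectation_pool(scores, m=0.0))   # behaves like average pooling
print(expectation_pool(scores, m=50.0))  # behaves like max pooling
```

Under this reading, the pooled value always lies between the smallest and largest activation, and the normalized weights indicate which sequence positions contributed most, which is one way such a layer could remain interpretable.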

Ding Yang, Gao Ge, Deng Minghua, Tu Xinming, Luo Xiao

Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University
Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University
School of Mathematical Sciences, Peking University; Center for Quantitative Biology, Peking University
Biomedical Pioneering Innovation Center (BIOPIC), Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), and the State Key Laboratory of Protein and Plant Gene Research at School of Life Sciences, Peking University
School of Mathematical Sciences, Peking University

DOI: 10.1101/658427

Biological science research methods and techniques; computing and computer technology; molecular biology

Ding Yang, Gao Ge, Deng Minghua, Tu Xinming, Luo Xiao. Expectation pooling: An effective and interpretable pooling method for predicting DNA-protein binding [EB/OL]. (2025-03-28) [2025-06-04]. https://www.biorxiv.org/content/10.1101/658427.