
Prediction of RNA-interacting residues in a protein using CNN and evolutionary profile


Source: bioRxiv
Abstract

This paper describes Pprint2, an improved version of Pprint, a method developed for predicting RNA-interacting residues in a protein. The training and validation datasets used in this study comprise 545 and 161 non-redundant RNA-binding proteins, respectively. All models were trained on the training dataset and evaluated on the validation dataset. Preliminary analysis reveals that positively charged amino acids such as H, R, and K are more prominent among the RNA-interacting residues. Initially, machine learning based models were developed using the binary profile and achieved a maximum area under the curve (AUC) of 0.68 on the validation dataset. The performance of this model improved significantly, from AUC 0.68 to 0.76, when the evolutionary profile was used instead of the binary profile. The performance of the evolutionary profile based model improved further, from AUC 0.76 to 0.82, when a convolutional neural network was used to develop the model. Our final model, based on a convolutional neural network using evolutionary information, achieved an AUC of 0.82 with an MCC of 0.49 on the validation dataset. Our best model outperforms existing methods when evaluated on the validation dataset. A user-friendly standalone software package and web server named “Pprint2” have been developed for predicting RNA-interacting residues (https://webs.iiitd.edu.in/raghava/pprint2 and https://github.com/raghavagps/pprint2).

Key Points
Machine learning based models were developed using different profiles
PSSM profile of a protein was created to extract evolutionary information
PSSM profiles of proteins were generated using PSI-BLAST
Convolutional neural network based model was developed using PSSM profile
Webserver, Python- and Perl-based standalone package, and GitHub repository are available

Author’s Biography
Sumeet Patiyal is currently working toward a Ph.D. in Computational Biology in the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Anjali Dhall is currently working toward a Ph.D. in Computational Biology in the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.
Khushboo Bajaj is currently working toward an MTech in Computer Science and Engineering in the Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, New Delhi, India.
Harshita Sahu is currently working toward an MTech in Computer Science and Engineering in the Department of Computer Science and Engineering, Indraprastha Institute of Information Technology, New Delhi, India.
Gajendra P. S. Raghava is currently working as Professor and Head of the Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India.

Dhall Anjali, Bajaj Khushboo, Sahu Harshita, Raghava Gajendra P.S., Patiyal Sumeet

Department of Computational Biology, Indraprastha Institute of Information Technology
Department of Computer Science and Engineering, Indraprastha Institute of Information Technology
Department of Computer Science and Engineering, Indraprastha Institute of Information Technology
Department of Computational Biology, Indraprastha Institute of Information Technology
Department of Computational Biology, Indraprastha Institute of Information Technology

10.1101/2022.06.03.494705

Molecular biology; Biological science research methods and techniques; Computing and computer technology

RNA-interacting residues; Binary profile; Evolutionary profile; Convolutional neural network; Machine learning techniques

Dhall Anjali, Bajaj Khushboo, Sahu Harshita, Raghava Gajendra P.S., Patiyal Sumeet. Prediction of RNA-interacting residues in a protein using CNN and evolutionary profile[EB/OL]. (2025-03-28)[2025-04-27]. https://www.biorxiv.org/content/10.1101/2022.06.03.494705.
