Zero-Shot Learning with Subsequence Reordering Pretraining for Compound-Protein Interaction
Given the vastness of chemical space and the ongoing emergence of previously uncharacterized proteins, zero-shot compound-protein interaction (CPI) prediction better reflects the practical challenges and requirements of real-world drug development. Although existing methods perform adequately on certain CPI tasks, they still face two challenges: (1) Representation learning from local or complete protein sequences often overlooks the complex interdependencies between subsequences, which are essential for predicting spatial structures and binding properties. (2) Dependence on large-scale or scarce multimodal protein datasets demands significant training data and computational resources, limiting scalability and efficiency. To address these challenges, we propose a novel approach that pretrains protein representations for CPI prediction tasks using subsequence reordering, explicitly capturing the dependencies between protein subsequences. Furthermore, we apply length-variable protein augmentation to maintain strong pretraining performance on small training datasets. To evaluate the model's effectiveness and zero-shot learning ability, we combine it with various baseline methods. The results demonstrate that our approach improves the baseline models' performance on the CPI task, especially in the challenging zero-shot scenario. Compared to existing pretraining models, our model demonstrates superior performance, particularly in data-scarce scenarios. Our implementation is available at https://github.com/Hoch-Zhang/PSRP-CPI.
Hongzhi Zhang, Zhonglie Liu, Kun Meng, Jiameng Chen, Jia Wu, Bo Du, Di Lin, Yan Che, Wenbin Hu
Molecular Biology; Biochemistry; Biophysics
Hongzhi Zhang, Zhonglie Liu, Kun Meng, Jiameng Chen, Jia Wu, Bo Du, Di Lin, Yan Che, Wenbin Hu. Zero-Shot Learning with Subsequence Reordering Pretraining for Compound-Protein Interaction [EB/OL]. (2025-07-28) [2025-08-10]. https://arxiv.org/abs/2507.20925.