国家预印本平台 (National Preprint Platform)

Computational QSAR models with high-dimensional descriptor selection improve antitumor activity design of ARC-111 analogues

Abstract

ARC-111 has potent topoisomerase I-targeting activity and pronounced antitumor activity. To support the design of more effective ARC-111 analogues, we analyzed the quantitative structure-activity relationship (QSAR) of 22 ARC-111 analogues assessed in P388 tumor cells. First, support vector regression (SVR) models were constructed and optimized using literature descriptors (the low-dimensional descriptor space) and the worst descriptor elimination multi-round (WDEM) method. In the independent test, the optimized SVR model generalized better than multiple linear regression (MLR) and stepwise linear regression (SLR), indicating that the nonlinear WDEM method removes redundant descriptors more effectively and that the optimized SVR is the more powerful modeling technique. Second, to identify more accessible and effective descriptors, modeling descriptors with clear meanings were selected from the large set of descriptors calculated by the software PCLIENT. Through the high-dimensional descriptor selection nonlinear (HDSN) method and the WDEM method, seven independent-variable combinations with tens of descriptors were selected out of 2,923 descriptors. The seven corresponding SVR models outperformed MLR and SLR in the independent test, and the evaluation measures supported the strong predictive power of the new models. In an interpretability analysis of the SVR model, the regression significance of the model and the importance of each individual descriptor were evaluated with F-tests. This work offers a useful theoretical basis for understanding the mechanism of action and provides design parameters for ARC-111 analogues with enhanced antitumor activity.
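The abstract names the WDEM method but does not describe its steps. One plausible reading of "worst descriptor elimination" is a backward elimination loop: in each round, drop the descriptor whose removal lowers the cross-validated error the most, and stop when no removal helps. The sketch below illustrates that idea only and is not the authors' procedure. To stay dependency-free it swaps in a k-nearest-neighbour regressor for SVR. All function names are hypothetical, and the data are synthetic (22 rows, only to mirror the size of the data set).

```python
import random

def knn_predict(train_X, train_y, x, k=3):
    """Stand-in nonlinear regressor (NOT SVR): mean target of the k nearest rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(t for _, t in nearest) / len(nearest)

def loo_mse(X, y, cols, predict=knn_predict):
    """Leave-one-out mean squared error using only the descriptor columns in cols."""
    errors = []
    for i in range(len(X)):
        train_X = [[row[c] for c in cols] for j, row in enumerate(X) if j != i]
        train_y = [t for j, t in enumerate(y) if j != i]
        x = [X[i][c] for c in cols]
        errors.append((predict(train_X, train_y, x) - y[i]) ** 2)
    return sum(errors) / len(errors)

def eliminate_worst_descriptors(X, y, predict=knn_predict):
    """Assumed elimination loop: drop one descriptor per round while LOO error improves."""
    cols = list(range(len(X[0])))
    best = loo_mse(X, y, cols, predict)
    while len(cols) > 1:
        # Score every subset obtained by removing exactly one descriptor.
        trials = [(loo_mse(X, y, [c for c in cols if c != d], predict), d) for d in cols]
        score, worst = min(trials)
        if score >= best:  # no single removal helps, so stop
            break
        cols.remove(worst)
        best = score
    return cols, best

# Synthetic example: only descriptors 0 and 1 influence the target.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(22)]
y = [row[0] ** 2 + row[1] for row in X]

full_error = loo_mse(X, y, list(range(5)))
selected, selected_error = eliminate_worst_descriptors(X, y)
print("kept descriptors:", selected)
print("LOO MSE, all descriptors:", full_error)
print("LOO MSE, kept descriptors:", selected_error)
```

Because the loop takes the regressor as the `predict` argument, a wrapper around an SVR implementation could be passed in instead of `knn_predict`. Real descriptors would normally be standardized first, since both distance-based and kernel-based models are scale-sensitive.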

Dai Zhijun (代志军), Chen Yuan (陈渊), Zhou Wei (周玮), Yuan Zheming (袁哲明)

Pharmacy; Basic Medicine; Biological Science Research Methods and Techniques

Keywords: ARC-111 analogues; P388 tumor cells; quantitative structure-activity relationship (QSAR); support vector regression (SVR); descriptor selection

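Dai Zhijun, Chen Yuan, Zhou Wei, Yuan Zheming. Computational QSAR models with high-dimensional descriptor selection improve antitumor activity design of ARC-111 analogues [EB/OL]. (2012-03-02) [2025-07-17]. http://www.paper.edu.cn/releasepaper/content/201203-94.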
