Protein folding classes -- High-dimensional geometry of amino acid composition space revisited
In this study, the distributions of protein structure classes (or folding types) of experimentally determined structures from a legacy dataset and from the comprehensive SCOP database are modeled precisely with geometric constructs such as convex polytopes in high-dimensional amino acid composition space. This is a follow-up to a previous non-statistical, geometry-motivated modeling of protein classes with ellipsoidal models, which are superseded here in three important respects: (1) as a paradigm shift, a descriptive 'distribution model' of experimental data is decoupled from, and serves as the basis for, a possible future predictive 'domain model' generalizable to proteins in the same class whose 3D structures have yet to be determined experimentally; (2) the geometric and analytic characteristics of class distributions are obtained via exact computational geometry calculations; and (3) the full data from a comprehensive database are included in such calculations, eschewing training set selection and its biases. In contrast to statistical and machine-learning approaches, the analytical, non-statistical geometry models of protein class distributions demonstrated in this study furnish complete and precise information on their size and relative disposition in the high-dimensional space (vis-à-vis any overlaps leading to ambiguity and limits in classification). Intended principally as an accurate summary description of the complex relationships between amino acid composition and protein classes, and suitable as a basis for predictive modeling where permissible, the results suggest that penultimately they may be useful adjuncts for validating sequence-based protein structure predictions and may contribute to a theoretical and fundamental understanding of secondary structure formation and protein folding, demonstrating the role of high-dimensional amino acid composition space in protein studies.
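As a minimal illustration of what a point in amino acid composition space looks like, the sketch below maps a sequence to its 20-component vector of residue fractions. It is not the author's code: the function name `composition`, the residue ordering in `AMINO_ACIDS`, and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter

# The 20 standard residues as one-letter codes; this ordering is an
# arbitrary choice for illustration, not one taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the 20-component amino acid composition vector of a sequence.

    Each component is the fraction of residues of one type, so the
    components are non-negative and sum to 1.
    """
    seq = sequence.upper()
    # Count only standard residues; other characters are skipped.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from any dataset in the study.
vec = composition("MKTAYIAKQR")
```

Because the fractions sum to 1, every protein maps to a point on a simplex within the 20-dimensional space, and geometric models of a class, such as the convex polytopes mentioned above, enclose sets of such points.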
Boryeu Mao
Biochemistry; Biophysics
Boryeu Mao. Protein folding classes -- High-dimensional geometry of amino acid composition space revisited [EB/OL]. (2025-06-02) [2025-06-19]. https://arxiv.org/abs/2506.01857.