Aligning biological sequences by exploiting residue conservation and coevolution

Source: bioRxiv
Abstract

Aligning biological sequences is among the most important problems in computational sequence analysis: it allows evolutionary relationships between sequences to be detected and biomolecular structure and function to be predicted. Alignment is typically addressed with profile models, which capture position-specific features such as conservation but assume that different positions evolve independently. RNA sequences are an exception: there, the coevolution of paired bases in the secondary structure is taken into account. In recent years it has become well established that coevolution is also essential in proteins for maintaining three-dimensional structure and function; modeling approaches based on inverse statistical physics can capture the coevolution signal and are now widely used to predict protein structure, protein-protein interactions, and mutational landscapes. Here, we present DCAlign, an efficient approach based on an approximate message-passing strategy. It overcomes the limitations of profile models by including general second-order interactions among positions, and is therefore universally applicable to protein- and RNA-sequence alignment. The potential of our algorithm is carefully explored using well-controlled simulated data, as well as real protein and RNA sequences.
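
To make the contrast between profile models and second-order interactions concrete, here is a minimal sketch. It assumes the standard Potts-model energy commonly used in direct coupling analysis and is not the DCAlign implementation. The three-letter alphabet, the function names `profile_energy` and `potts_energy`, and all parameter values are hypothetical and chosen only for illustration.

```python
# Toy alphabet (assumption): a gap symbol plus two residue types.
ALPHABET = "-AC"


def profile_energy(seq, fields):
    """Site-independent energy: minus the sum of per-position scores h_i(a_i)."""
    return -sum(fields[i][ALPHABET.index(a)] for i, a in enumerate(seq))


def potts_energy(seq, fields, couplings):
    """Profile energy minus pairwise terms J_ij(a_i, a_j) for coupled positions."""
    idx = [ALPHABET.index(a) for a in seq]
    pair = sum(J[idx[i]][idx[j]] for (i, j), J in couplings.items())
    return profile_energy(seq, fields) - pair


# Two positions with no conservation signal (all fields are zero) ...
fields = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# ... but a coupling that rewards complementary symbols, loosely
# analogous to base pairing in an RNA secondary structure.
couplings = {(0, 1): [[0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]]}

for seq in ("AC", "AA"):
    print(seq, profile_energy(seq, fields), potts_energy(seq, fields, couplings))
```

In this toy setting the profile energy cannot distinguish "AC" from "AA", because neither position is conserved on its own, while the pairwise term gives the complementary pair "AC" a lower energy. This is the kind of coevolution signal that a site-independent model ignores.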

Pagnani Andrea, Zamponi Francesco, Muntoni Anna Paola, Weigt Martin

Department of Applied Science and Technology (DISAT), Politecnico di Torino; Italian Institute for Genomic Medicine, IRCCS Candiolo; INFN, Sezione di Torino
Laboratoire de Physique de l'École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris
Department of Applied Science and Technology (DISAT), Politecnico di Torino; Laboratoire de Physique de l'École Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris; Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB
Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB

10.1101/2020.05.18.101295

Molecular biology; biophysics; research methods and techniques in the biological sciences

Pagnani Andrea, Zamponi Francesco, Muntoni Anna Paola, Weigt Martin. Aligning biological sequences by exploiting residue conservation and coevolution[EB/OL]. (2025-03-28)[2025-04-27]. https://www.biorxiv.org/content/10.1101/2020.05.18.101295.
