Learning the sequence code of protein expression in human immune cells
Accurate protein expression in human immune cells is essential for appropriate cellular function. The mechanisms that define protein abundance are complex and act at the transcriptional, post-transcriptional and post-translational levels. Here, we present SONAR, a machine learning pipeline that learns the endogenous sequence code that defines protein abundance in human cells. SONAR uses thousands of sequence features (SFs) to predict up to 63% of protein abundance independently of promoter or enhancer information. SONAR uncovered cell type-specific and activation-dependent usage of SFs. The knowledge captured by SONAR provides a map of biologically active SFs, which can be leveraged to manipulate the amplitude, timing, and cell type-specificity of protein expression. SONAR informed the design of enhancer sequences to boost T cell receptor expression and to potentiate T cell function. Beyond providing fundamental insights into the regulation of protein expression, our study thus offers novel means to improve therapeutic and biotechnology applications.
Jurgens Anouk, Bresser Kaspar, Bradaric Antonia, Nicolet Benoit P, Guislain Aurelie, Wolkers Monika
Basic medicine; biological science research methods and techniques; molecular biology
Jurgens Anouk, Bresser Kaspar, Bradaric Antonia, Nicolet Benoit P, Guislain Aurelie, Wolkers Monika. Learning the sequence code of protein expression in human immune cells [EB/OL]. (2025-03-28) [2025-06-22]. https://www.biorxiv.org/content/10.1101/2023.09.01.555843