
Enhancing TCR-Peptide Interaction Prediction with Pretrained Language Models and Molecular Representations

Source: arXiv
Abstract

Understanding the binding specificity between T-cell receptors (TCRs) and peptide-major histocompatibility complexes (pMHCs) is central to immunotherapy and vaccine development. However, current predictive models struggle with generalization, especially in data-scarce settings and when faced with novel epitopes. We present LANTERN (Large lAnguage model-powered TCR-Enhanced Recognition Network), a deep learning framework that combines large-scale protein language models with chemical representations of peptides. By encoding TCR β-chain sequences using ESM-1b and transforming peptide sequences into SMILES strings processed by MolFormer, LANTERN captures rich biological and chemical features critical for TCR-peptide recognition. Through extensive benchmarking against existing models such as ChemBERTa, TITAN, and NetTCR, LANTERN demonstrates superior performance, particularly in zero-shot and few-shot learning scenarios. Our model also benefits from a robust negative sampling strategy and shows significant clustering improvements via embedding analysis. These results highlight the potential of LANTERN to advance TCR-pMHC binding prediction and support the development of personalized immunotherapies.
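
The abstract describes a two-branch architecture: ESM-1b embeds the TCR β-chain, the peptide is converted to a SMILES string and embedded with MolFormer, and the two representations are fused for binding prediction. The Python sketch below illustrates that pipeline under stated assumptions; it is not the authors' code, and the specific checkpoints (esm1b_t33_650M_UR50S via fair-esm, ibm/MoLFormer-XL-both-10pct on Hugging Face), the mean-pooling, and the MLP fusion head are illustrative choices rather than the paper's exact setup.

```python
import torch
import esm                                   # pip install fair-esm
from rdkit import Chem                       # pip install rdkit
from transformers import AutoTokenizer, AutoModel

# TCR beta-chain branch: per-residue ESM-1b embeddings, mean-pooled.
esm_model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
esm_model.eval()

tcr_beta = "CASSIRSSYEQYF"                   # example CDR3-beta sequence
_, _, tokens = batch_converter([("tcr", tcr_beta)])
with torch.no_grad():
    out = esm_model(tokens, repr_layers=[33])
# Drop BOS/EOS positions, then average over residues -> 1280-d vector.
tcr_emb = out["representations"][33][0, 1:len(tcr_beta) + 1].mean(dim=0)

# Peptide branch: amino-acid sequence -> SMILES (RDKit) -> MolFormer embedding.
peptide = "GILGFVFTL"                        # example epitope
smiles = Chem.MolToSmiles(Chem.MolFromSequence(peptide))

# Checkpoint name and mean-pooling are assumptions, not the paper's exact setup.
mf_name = "ibm/MoLFormer-XL-both-10pct"
mf_tok = AutoTokenizer.from_pretrained(mf_name, trust_remote_code=True)
mf_model = AutoModel.from_pretrained(mf_name, trust_remote_code=True)
with torch.no_grad():
    pep_out = mf_model(**mf_tok(smiles, return_tensors="pt"))
pep_emb = pep_out.last_hidden_state.mean(dim=1).squeeze(0)

# Fusion: concatenate the two embeddings and score binding with a small MLP head
# (untrained here; shown only to make the two-branch wiring concrete).
head = torch.nn.Sequential(
    torch.nn.Linear(tcr_emb.numel() + pep_emb.numel(), 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 1),
)
binding_logit = head(torch.cat([tcr_emb, pep_emb]))
print(float(binding_logit))
```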

Cong Qi, Hanzhang Fang, Siqi Jiang, Tianxing Hu, Wei Zhi

Subjects: Biological Science Research Methods; Biological Science Research Techniques; Molecular Biology

Cong Qi, Hanzhang Fang, Siqi Jiang, Tianxing Hu, Wei Zhi. Enhancing TCR-Peptide Interaction Prediction with Pretrained Language Models and Molecular Representations [EB/OL]. (2025-04-22) [2025-07-16]. https://arxiv.org/abs/2505.01433
