
TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving


Source: arXiv
English Abstract

In recent years, diffusion models have shown their potential across diverse domains, from vision generation to language modeling. Transferring their capabilities to modern autonomous driving systems has also emerged as a promising direction. In this work, we propose TransDiffuser, an encoder-decoder based generative trajectory planning model for end-to-end autonomous driving. The encoded scene information serves as the multi-modal conditional input of the denoising decoder. To tackle the mode collapse dilemma in generating high-quality, diverse trajectories, we introduce a simple yet effective multi-modal representation decorrelation optimization mechanism during training. TransDiffuser achieves a PDMS of 94.85 on the NAVSIM benchmark, surpassing previous state-of-the-art methods without relying on any anchor-based prior trajectories.
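The abstract does not specify how the representation decorrelation term is computed. The PyTorch sketch below is only an assumption of one plausible form, penalizing the off-diagonal entries of the correlation matrix of the encoded multi-modal features; the function name, shapes, and weighting are hypothetical and not taken from the paper.

import torch

def decorrelation_loss(features: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # features: (batch, dim) encoded multi-modal representation.
    # Hypothetical illustration; the paper's exact objective may differ.
    z = (features - features.mean(dim=0)) / (features.std(dim=0) + eps)  # standardize each dimension
    n, d = z.shape
    corr = (z.T @ z) / (n - 1)                      # empirical correlation matrix
    off_diag = corr - torch.diag(torch.diag(corr))  # keep only cross-dimension terms
    return (off_diag ** 2).sum() / (d * (d - 1))    # mean squared off-diagonal correlation

In training, such a term would typically be added with a small weight to the diffusion denoising loss, encouraging the conditional features to span more independent directions and thus support more diverse trajectory generation.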

Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, XianPeng Lang, Sheng Sun

Highway Transportation Engineering

Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, XianPeng Lang, Sheng Sun. TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving [EB/OL]. (2025-05-14) [2025-06-21]. https://arxiv.org/abs/2505.09315.
