
TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving


Source: arXiv
English Abstract

In recent years, diffusion models have shown their potential across diverse domains, from vision generation to language modeling. Transferring their capabilities to modern autonomous driving systems has also emerged as a promising direction. In this work, we propose TransDiffuser, an encoder-decoder based generative trajectory planning model for end-to-end autonomous driving. The encoded scene information serves as the multi-modal conditional input of the denoising decoder. To tackle the mode collapse dilemma in generating high-quality, diverse trajectories, we introduce a simple yet effective multi-modal representation decorrelation optimization mechanism during training. TransDiffuser achieves a PDMS of 94.85 on the NAVSIM benchmark, surpassing previous state-of-the-art methods without relying on any anchor-based prior trajectories.
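The abstract does not specify how the representation decorrelation term is computed. The PyTorch sketch below is only an assumption of one plausible form, penalizing the off-diagonal entries of the correlation matrix of the encoded multi-modal features; the function name, shapes, and weighting are hypothetical and not taken from the paper.

import torch

def decorrelation_loss(features: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # features: (batch, dim) encoded multi-modal representation.
    # Hypothetical illustration; the paper's exact objective may differ.
    z = (features - features.mean(dim=0)) / (features.std(dim=0) + eps)  # standardize each dimension
    n, d = z.shape
    corr = (z.T @ z) / (n - 1)                      # empirical correlation matrix
    off_diag = corr - torch.diag(torch.diag(corr))  # keep only cross-dimension terms
    return (off_diag ** 2).sum() / (d * (d - 1))    # mean squared off-diagonal correlation

In training, such a term would typically be added with a small weight to the diffusion denoising loss, encouraging the conditional features to span more independent directions and thus support more diverse trajectory generation.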

Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, XianPeng Lang, Sheng Sun

Highway Transportation Engineering

Xuefeng Jiang, Yuan Ma, Pengxiang Li, Leimeng Xu, Xin Wen, Kun Zhan, Zhongpu Xia, Peng Jia, XianPeng Lang, Sheng Sun. TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving [EB/OL]. (2025-05-14) [2025-06-21]. https://arxiv.org/abs/2505.09315.
