Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models

Abstract

While full-duplex speech agents promise natural, low-latency human-machine interaction by concurrently processing input and output speech, overlap management remains under-evaluated. We introduce Full-Duplex-Bench v1.5, a modular, fully automated benchmark that simulates four overlap scenarios: user interruption, listener backchannel, side conversation, and ambient speech. Our framework supports both open-source and commercial models and offers a comprehensive, extensible metric suite (categorical dialogue behaviors, stop and response latency, prosodic adaptation, and perceived speech quality) that can be tailored to application-specific criteria. Benchmarking five state-of-the-art agents reveals two principal strategies, repair-first rapid yielding and continuity-first sustained flow, and highlights scenario-dependent performance trends. The open-source design enables seamless extension with new audio assets, languages, and deployment contexts, empowering practitioners to customize and accelerate the evaluation of robust full-duplex speech systems.

Authors: Guan-Ting Lin, Shih-Yun Shan Kuan, Qirui Wang, Jiachen Lian, Tingle Li, Hung-yi Lee

Subject classification: Communications; Wireless Communications; Computing Technology and Computer Technology

Recommended citation: Guan-Ting Lin, Shih-Yun Shan Kuan, Qirui Wang, Jiachen Lian, Tingle Li, Hung-yi Lee. Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models [EB/OL]. (2025-07-30) [2025-08-07]. https://arxiv.org/abs/2507.23159.
