MOVER: Combining Multiple Meeting Recognition Systems

Abstract

In this paper, we propose Meeting recognizer Output Voting Error Reduction (MOVER), a novel system combination method for meeting recognition tasks. Although methods exist for combining the outputs of diarization systems (e.g., DOVER) or of automatic speech recognition (ASR) systems (e.g., ROVER), MOVER is the first approach that can combine the outputs of meeting recognition systems that differ in both diarization and ASR. MOVER combines hypotheses with different time intervals and speaker labels through a five-stage process whose stages include speaker alignment, segment grouping, and word and timing combination. Experimental results on the CHiME-8 DASR task and the multi-channel track of the NOTSOFAR-1 task demonstrate that MOVER can successfully combine multiple meeting recognition systems with diverse diarization and recognition outputs, achieving relative tcpWER improvements of 9.55% and 8.51%, respectively, over the state-of-the-art systems on the two tasks.
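
The abstract does not describe how the individual stages operate. As a purely illustrative sketch of the word-level voting idea behind ROVER-style combination (not MOVER's actual algorithm), the Python snippet below assumes that the hypotheses from each system have already been aligned into the same number of word slots for one speaker and one segment group, with an empty string marking a slot where a system produced no word. The function name `vote_aligned_words` and the `null_token` parameter are hypothetical and do not come from the paper.

```python
from collections import Counter


def vote_aligned_words(aligned_hypotheses, null_token=""):
    """Majority-vote over word slots that are ASSUMED to be pre-aligned.

    aligned_hypotheses: list of word lists, one per system, all of equal
    length. null_token marks "no word in this slot" (hypothetical convention).
    """
    if not aligned_hypotheses:
        return []
    n_slots = len(aligned_hypotheses[0])
    if any(len(hyp) != n_slots for hyp in aligned_hypotheses):
        raise ValueError("all hypotheses must have the same number of slots")

    combined = []
    for slot in zip(*aligned_hypotheses):
        # Pick the most frequent entry in this slot; on a tie, Counter
        # returns the entry seen first, i.e. the earliest-listed system.
        word, _count = Counter(slot).most_common(1)[0]
        if word != null_token:
            combined.append(word)
    return combined


# Three hypothetical system outputs for the same aligned segment.
hypotheses = [
    ["the", "meeting", "starts", "", "now"],
    ["the", "meeting", "start", "at", "now"],
    ["a", "meeting", "starts", "", "now"],
]
result = vote_aligned_words(hypotheses)
print(result)
```

The hard part of combining full meeting recognition systems, as the abstract notes, is that their outputs differ in speaker labels and time intervals, so reaching an aligned form like the one assumed here is precisely what the earlier stages (speaker alignment and segment grouping) would have to provide.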

Authors: Naoyuki Kamo, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani

Subject classification: Linguistics; Computing Technology and Computer Technology

Recommended citation: Naoyuki Kamo, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani. MOVER: Combining Multiple Meeting Recognition Systems [EB/OL]. (2025-08-07) [2025-08-18]. https://arxiv.org/abs/2508.05055.
