Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge

Source: arXiv
Abstract

This paper presents the system developed for the MISP 2025 Challenge. For the diarization system, we propose a hybrid approach that combines a WavLM-based end-to-end segmentation method with a traditional multi-module clustering technique, adaptively selecting the appropriate model to handle varying degrees of overlapping speech. For the automatic speech recognition (ASR) system, we propose an ASR-aware observation addition method that compensates for the performance limitations of Guided Source Separation (GSS) under low signal-to-noise ratio conditions. Finally, we integrate the speaker diarization and ASR systems in a cascaded architecture to address Track 3. Our system achieved a character error rate (CER) of 9.48% on Track 2 and a concatenated minimum-permutation character error rate (cpCER) of 11.56% on Track 3, securing first place in both tracks and demonstrating the effectiveness of the proposed methods in real-world meeting scenarios.
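The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: CER is the character-level edit distance divided by the reference length, and cpCER concatenates each speaker's utterances and takes the speaker permutation that minimizes the total edit distance. The function names (`edit_distance`, `cer`, `cpcer`) are hypothetical, text normalization is omitted, and this is not the challenge's official scoring script.

```python
from itertools import permutations


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between ref and hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a reference character
                curr[j - 1] + 1,         # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over reference length."""
    return edit_distance(ref, hyp) / len(ref)


def cpcer(ref_by_speaker, hyp_by_speaker) -> float:
    """Concatenated minimum-permutation CER (illustrative definition).

    Each argument is a list with one entry per speaker; each entry is
    that speaker's list of utterance strings in time order.
    """
    refs = ["".join(utts) for utts in ref_by_speaker]
    hyps = ["".join(utts) for utts in hyp_by_speaker]
    # Pad with empty streams so both sides have the same speaker count.
    n = max(len(refs), len(hyps))
    refs += [""] * (n - len(refs))
    hyps += [""] * (n - len(hyps))
    total_ref_chars = sum(len(r) for r in refs)
    # Brute force over speaker assignments; fine for a handful of speakers.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_chars
```

For example, `cpcer([["今天", "开会"], ["好的"]], [["好的"], ["今天", "开回"]])` scores the hypothesis against the best speaker mapping, so a correct transcript with swapped speaker labels is not penalized for the swap itself. The brute-force permutation search grows factorially with the number of speakers; a scoring tool for larger meetings would typically use an assignment solver instead.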

Shangkun Huang, Yuxuan Du, Jingwen Yang, Dejun Zhang, Xupeng Jia, Jing Deng, Jintao Kang, Rong Zheng

Subjects: Communications; Computing Technology, Computer Technology

Shangkun Huang, Yuxuan Du, Jingwen Yang, Dejun Zhang, Xupeng Jia, Jing Deng, Jintao Kang, Rong Zheng. Overlap-Adaptive Hybrid Speaker Diarization and ASR-Aware Observation Addition for MISP 2025 Challenge [EB/OL]. (2025-05-28) [2025-06-07]. https://arxiv.org/abs/2505.22013.
