SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms

Abstract

Speech pre-processing techniques such as denoising, de-reverberation, and separation are commonly employed as front-ends for various downstream speech processing tasks. However, these methods can sometimes be inadequate, leaving residual noise or introducing new artifacts. Such deficiencies are typically not captured by metrics like SI-SNR but are noticeable to human listeners. To address this, we introduce SpeechRefiner, a post-processing tool that uses Conditional Flow Matching (CFM) to improve the perceptual quality of speech. In this study, we benchmark SpeechRefiner against recent task-specific refinement methods and evaluate its performance within our internal processing pipeline, which integrates multiple front-end algorithms. Experiments show that SpeechRefiner generalizes well across diverse impairment sources and significantly enhances the perceptual quality of speech. Audio demos can be found at https://speechrefiner.github.io/SpeechRefiner/.
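For readers unfamiliar with the SI-SNR metric mentioned in the abstract, the following is a minimal sketch of its standard textbook definition (scale-invariant signal-to-noise ratio, in dB). It is not taken from the paper: the function name `si_snr`, the `eps` stabilizer, and the use of plain Python lists are assumptions made for illustration only.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals.

    Illustrative sketch of the common definition; `eps` is an assumed
    small constant that avoids division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean from both signals.
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]

    # Project the estimate onto the reference to get the target component.
    dot = sum(a * b for a, b in zip(e, r))
    ref_energy = sum(b * b for b in r)
    scale = dot / (ref_energy + eps)
    target = [scale * b for b in r]

    # Everything not explained by the reference is treated as noise.
    noise = [a - t for a, t in zip(e, target)]

    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))
```

Because the score depends only on the energy ratio between the projected target and the residual, it is insensitive to the overall gain of the estimate and does not distinguish how the residual sounds, which is consistent with the abstract's remark that some audible artifacts are not reflected in it.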

Authors: Sirui Li, Shuai Wang, Zhijun Liu, Zhongjie Jiang, Yannan Wang, Haizhou Li

Subject classification: Communications

Recommended citation: Sirui Li, Shuai Wang, Zhijun Liu, Zhongjie Jiang, Yannan Wang, Haizhou Li. SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms [EB/OL]. (2025-06-16) [2025-07-16]. https://arxiv.org/abs/2506.13709.
