Few-step Adversarial Schrödinger Bridge for Generative Speech Enhancement

Source: arXiv
Abstract

Deep generative models have recently been employed for speech enhancement to generate perceptually valid clean speech on large-scale datasets. Several diffusion models have been proposed, and more recently, a tractable Schrödinger Bridge has been introduced to transport between the clean and noisy speech distributions. However, these models rely on an iterative reverse process and typically require a large number of sampling steps (more than 50). Our investigation reveals that the performance of baseline models degrades significantly when the number of sampling steps is reduced, particularly under low-SNR conditions. We propose integrating the Schrödinger Bridge with GANs to mitigate this issue, achieving high-quality outputs on full-band datasets while substantially reducing the number of sampling steps required. Experimental results demonstrate that the proposed model outperforms existing baselines, even with a single inference step, in both denoising and dereverberation tasks.

Seungu Han, Sungho Lee, Juheon Lee, Kyogu Lee

Subject: Computing technology, computer technology

Seungu Han, Sungho Lee, Juheon Lee, Kyogu Lee. Few-step Adversarial Schrödinger Bridge for Generative Speech Enhancement [EB/OL]. (2025-06-02) [2025-07-16]. https://arxiv.org/abs/2506.01460.