A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement

Source: arXiv
Abstract

This paper proposes a model that integrates sub-band processing and deep filtering to fully exploit information from the target time-frequency (TF) bin and its surrounding TF bins for single-channel speech enhancement. The sub-band module captures surrounding frequency bin information at the input, while the deep filtering module applies filtering at the output to both the target TF bin and its surrounding TF bins. To further improve the model performance, we decouple deep filtering into temporal and frequency components and introduce a two-stage framework, reducing the complexity of filter coefficient prediction at each stage. Additionally, we propose the TAConv module to strengthen convolutional feature extraction. Experimental results demonstrate that the proposed hierarchical deep filtering network (HDF-Net) effectively utilizes surrounding TF bin information and outperforms other advanced systems while using fewer resources.
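To make the idea of decoupled deep filtering concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the usual deep filtering formulation, in which each output TF bin is a complex-weighted sum of neighboring input bins, and applies a causal temporal filter followed by a frequency filter. The function names, tap counts, zero-padding at the edges, and the stage order are all assumptions made for illustration; in HDF-Net the coefficients would be predicted by a network, whereas here they are simply passed in as arrays.

```python
import numpy as np


def temporal_deep_filter(spec, coefs):
    """Filter each TF bin using the current and past frames (causal).

    spec:  complex array of shape (T, F), the noisy spectrogram.
    coefs: complex array of shape (N, T, F); coefs[i] weights frame t - i.
    Frames before the start of the signal are treated as zeros (assumption).
    """
    n_frames = spec.shape[0]
    out = np.zeros_like(spec)
    for i in range(coefs.shape[0]):
        shifted = np.zeros_like(spec)
        shifted[i:] = spec[:n_frames - i]  # shifted[t] == spec[t - i]
        out += coefs[i] * shifted
    return out


def frequency_deep_filter(spec, coefs):
    """Filter each TF bin using neighboring frequency bins in the same frame.

    spec:  complex array of shape (T, F).
    coefs: complex array of shape (2M + 1, T, F); coefs[j] weights bin f + j - M.
    Bins outside the spectrum are treated as zeros (assumption).
    """
    n_taps = coefs.shape[0]
    half = n_taps // 2
    n_freq = spec.shape[1]
    padded = np.pad(spec, ((0, 0), (half, half)))
    out = np.zeros_like(spec)
    for j in range(n_taps):
        out += coefs[j] * padded[:, j:j + n_freq]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.standard_normal((10, 8)) + 1j * rng.standard_normal((10, 8))
    # Hypothetical coefficients; a real system would predict these per stage.
    c_time = rng.standard_normal((3, 10, 8)) + 1j * rng.standard_normal((3, 10, 8))
    c_freq = rng.standard_normal((3, 10, 8)) + 1j * rng.standard_normal((3, 10, 8))
    stage1 = temporal_deep_filter(noisy, c_time)
    stage2 = frequency_deep_filter(stage1, c_freq)
    print(stage2.shape)
```

Splitting the filter this way means each stage predicts N or 2M + 1 coefficients per bin rather than a joint N × (2M + 1) set, which is one way to read the abstract's claim that the decoupling reduces the complexity of coefficient prediction at each stage.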

Shenghui Lu, Hukai Huang, Jinanglong Yao, Kaidi Wang, Qingyang Hong, Lin Li

Communications; Wireless Communications

Shenghui Lu, Hukai Huang, Jinanglong Yao, Kaidi Wang, Qingyang Hong, Lin Li. A Two-Stage Hierarchical Deep Filtering Framework for Real-Time Speech Enhancement [EB/OL]. (2025-06-01) [2025-06-23]. https://arxiv.org/abs/2506.01023.
