An Investigation on Combining Geometry and Consistency Constraints into Phase Estimation for Speech Enhancement
We propose a novel iterative phase estimation framework, termed the multi-source Griffin-Lim algorithm (MSGLA), for speech enhancement (SE) under additive noise conditions. The core idea is to leverage the ad-hoc consistency constraint of complex-valued short-time Fourier transform (STFT) spectrograms to resolve the sign ambiguity commonly encountered in geometry-based phase estimation. Furthermore, we introduce a variant of the geometric constraint framework based on the laws of sines and cosines, formulating a new phase reconstruction algorithm that uses noise phase estimates. We first validate the proposed technique through a series of oracle experiments, demonstrating its effectiveness under ideal conditions. We then evaluate its performance on the VB-DMD and WSJ0-CHiME3 data sets, and show that the proposed MSGLA variants match or slightly outperform existing algorithms, including direct phase estimation and DNN-based sign prediction, especially in terms of background noise suppression.
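To make the sign ambiguity concrete, the following is a minimal sketch of the standard law-of-cosines relation for an additive model Y = S + N in a single STFT bin. It is not the MSGLA procedure itself; the function name `geometric_phase_candidates` and the use of oracle speech and noise magnitudes are assumptions made purely for illustration. Given |Y|, |S|, and |N|, the geometry fixes only the absolute value of the phase difference between S and Y, so two mirror-image candidates remain.

```python
import numpy as np

def geometric_phase_candidates(Y, mag_s, mag_n, eps=1e-12):
    """Return both clean-speech phase candidates permitted by the law of cosines.

    Y      : complex noisy STFT coefficients (Y = S + N)
    mag_s  : speech magnitudes |S| (oracle values here, an assumption)
    mag_n  : noise magnitudes |N| (oracle values here, an assumption)
    """
    mag_y = np.abs(Y)
    # Triangle formed by Y, S, N: |N|^2 = |Y|^2 + |S|^2 - 2|Y||S| cos(delta),
    # where delta is the phase difference between S and Y.
    cos_delta = (mag_y**2 + mag_s**2 - mag_n**2) / (2.0 * mag_y * mag_s + eps)
    delta = np.arccos(np.clip(cos_delta, -1.0, 1.0))
    theta_y = np.angle(Y)
    # The magnitudes determine |delta| only, hence the two candidates.
    return theta_y + delta, theta_y - delta

# Synthetic bins: random complex speech plus weaker random complex noise.
rng = np.random.default_rng(0)
S = rng.standard_normal(8) + 1j * rng.standard_normal(8)
N = 0.5 * (rng.standard_normal(8) + 1j * rng.standard_normal(8))
Y = S + N

plus, minus = geometric_phase_candidates(Y, np.abs(S), np.abs(N))
S_plus = np.abs(S) * np.exp(1j * plus)
S_minus = np.abs(S) * np.exp(1j * minus)

# In every bin, one of the two candidates reproduces the true S;
# which one is correct cannot be told from the magnitudes alone.
err = np.minimum(np.abs(S_plus - S), np.abs(S_minus - S))
```

In this sketch the correct sign is picked by comparing against the true S, which is unavailable in practice. The abstract's proposal is to use the STFT consistency constraint to make that choice instead.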
Chun-Wei Ho, Pin-Jui Ku, Hao Yen, Sabato Marco Siniscalchi, Yu Tsao, Chin-Hui Lee
Communications; Wireless Communications
Chun-Wei Ho, Pin-Jui Ku, Hao Yen, Sabato Marco Siniscalchi, Yu Tsao, Chin-Hui Lee. An Investigation on Combining Geometry and Consistency Constraints into Phase Estimation for Speech Enhancement [EB/OL]. (2025-07-02) [2025-07-16]. https://arxiv.org/abs/2507.02192.