A Survey of Deep Learning for Complex Speech Spectrograms

Source: arXiv
Abstract

Recent advancements in deep learning have significantly impacted the field of speech signal processing, particularly in the analysis and manipulation of complex spectrograms. This survey provides a comprehensive overview of the state-of-the-art techniques leveraging deep neural networks for processing complex spectrograms, which encapsulate both magnitude and phase information. We begin by introducing complex spectrograms and their associated features for various speech processing tasks. Next, we explore the key components and architectures of complex-valued neural networks, which are specifically designed to handle complex-valued data and have been applied for complex spectrogram processing. We then discuss various training strategies and loss functions tailored for training neural networks to process and model complex spectrograms. The survey further examines key applications, including phase retrieval, speech enhancement, and speech separation, where deep learning has achieved significant progress by leveraging complex spectrograms or their derived feature representations. Additionally, we examine the intersection of complex spectrograms with generative models. This survey aims to serve as a valuable resource for researchers and practitioners in the field of speech signal processing and complex-valued neural networks.

Yuying Xie, Zheng-Hua Tan

Computing technology, computer technology

Yuying Xie, Zheng-Hua Tan. A Survey of Deep Learning for Complex Speech Spectrograms [EB/OL]. (2025-05-13) [2025-06-16]. https://arxiv.org/abs/2505.08694.
