Audio Event Detection Based on Deep Learning
Neural network methods are widely used in audio event detection and tagging. In the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge, an authoritative international competition, most systems take either the time-domain audio signal or the log-mel spectrogram of the audio as input and achieve excellent results. This paper presents the 2D-Wave and 2D-Wave-LogMel systems. Drawing on the strong learning capacity of neural networks, they take the time-domain signal as input and learn a corresponding frequency-domain representation, then combine it with the log-mel spectrogram to obtain a richer representation of the audio signal as input. On the FSD50K dataset they outperform the baseline system.
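The abstract does not give the architecture in detail, so the following is only a minimal sketch of the general idea: a strided 1-D filterbank applied to the raw waveform produces a spectrogram-like 2-D map, which is stacked with a log-mel spectrogram to form a two-channel input. All function names, the frame and hop sizes, and the use of random kernels in place of trained weights are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_mel(x, sr, n_fft, hop, n_mels):
    """Fixed front end: log-mel spectrogram, shape (n_mels, n_frames)."""
    frames = frame_signal(x, n_fft, hop) * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10).T

def wave_frontend(x, kernels, hop):
    """Stand-in for a learned front end: strided 1-D filterbank on the waveform.

    kernels has shape (n_filters, kernel_len); in a real system these would be
    trained weights, here they are whatever the caller passes in (assumption).
    """
    frames = frame_signal(x, kernels.shape[1], hop)
    return np.log(np.abs(frames @ kernels.T) + 1e-10).T

def two_channel_input(x, sr, kernels, hop=256):
    """Stack the waveform-derived map and the log-mel map as two channels."""
    n_filters, kernel_len = kernels.shape
    learned = wave_frontend(x, kernels, hop)
    fixed = log_mel(x, sr, n_fft=kernel_len, hop=hop, n_mels=n_filters)
    return np.stack([learned, fixed], axis=0)  # (2, n_filters, n_frames)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    waveform = rng.standard_normal(16000)      # one second of noise at 16 kHz
    kernels = rng.standard_normal((64, 512))   # placeholder for trained filters
    print(two_channel_input(waveform, 16000, kernels).shape)
```

Matching the kernel length to the FFT size and the filter count to the number of mel bands keeps the two maps the same shape, so they can be stacked directly and passed to a 2-D convolutional classifier. With the settings above the result has shape (2, 64, 61).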
Liu Gang, Hong Xiaofeng
Applications of Electronic Technology
Keywords: audio event detection; neural networks; DCASE; FSD50K
Liu Gang, Hong Xiaofeng. Audio Event Detection Based on Deep Learning [EB/OL]. (2020-12-30) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/202012-121.