National Preprint Platform

Audio Event Detection Based on Deep Learning


Neural network methods are widely used in audio event detection and tagging tasks. In the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge, an internationally recognized competition, most systems take either the time-domain audio signal or the log-mel spectrogram of the audio as input and achieve excellent results. This paper presents the 2D-Wave and 2D-Wave-LogMel systems. Relying on the strong learning capacity of neural networks, the systems take the time-domain signal as input and learn a corresponding frequency-domain representation, then combine it with the log-mel spectrogram to obtain a richer representation of the audio signal as input. On the FSD50K dataset, they outperform the baseline system.
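
As a rough illustration of the kind of input the abstract describes, the Python sketch below stacks a representation computed directly from the waveform with a log-mel spectrogram of the same frames. It is not the authors' implementation. The function names, the frame length, hop size, and number of bins are all hypothetical, and the randomly initialised filters only stand in for weights that a trained network would learn.

```python
import cmath
import math
import random


def frame_signal(wave, frame_len, hop):
    """Split a waveform into overlapping frames (no padding)."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    return [wave[i * hop:i * hop + frame_len] for i in range(n_frames)]


def power_spectrum(frame):
    """Hann-windowed power spectrum via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(windowed))
        spec.append(abs(acc) ** 2)
    return spec


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def log_mel(frames, bank, eps=1e-10):
    """Log-mel spectrogram: one row per frame, one column per mel band."""
    rows = []
    for frame in frames:
        spec = power_spectrum(frame)
        rows.append([math.log(sum(w * p for w, p in zip(band, spec)) + eps)
                     for band in bank])
    return rows


def wave_frontend(frames, kernels, eps=1e-10):
    """Assumed stand-in for a learned front end: one filter per output bin,
    applied to each raw frame, followed by log of the squared response."""
    return [[math.log(sum(w * x for w, x in zip(kernel, frame)) ** 2 + eps)
             for kernel in kernels]
            for frame in frames]


def two_channel_input(wave, sample_rate=16000, frame_len=64, hop=32,
                      n_bins=8, seed=0):
    """Stack the waveform-derived map and the log-mel map as two channels."""
    rng = random.Random(seed)
    frames = frame_signal(wave, frame_len, hop)
    # Random weights here; in a real system these would be trained.
    kernels = [[rng.gauss(0.0, 0.1) for _ in range(frame_len)]
               for _ in range(n_bins)]
    bank = mel_filterbank(n_bins, frame_len, sample_rate)
    return [wave_frontend(frames, kernels), log_mel(frames, bank)]


# A 1 kHz tone sampled at 16 kHz, 512 samples long.
wave = [math.sin(2 * math.pi * 1000 * t / 16000) for t in range(512)]
features = two_channel_input(wave)
print(len(features), len(features[0]), len(features[0][0]))  # prints: 2 15 8
```

Both channels share the same framing, so they line up frame by frame and can be passed to a 2-D convolutional classifier as a two-channel image. The pure-Python DFT keeps the sketch dependency-free but is only practical for short signals.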

Liu Gang (刘刚), Hong Xiaofeng (洪晓锋)

Application of Electronic Technology

Keywords: audio event detection; neural networks; DCASE; FSD50K

Liu Gang, Hong Xiaofeng. Audio Event Detection Based on Deep Learning [EB/OL]. (2020-12-30) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/202012-121.
