
Formula-Supervised Sound Event Detection: Pre-Training Without Real Data

Source: arXiv
Abstract

In this paper, we propose a novel formula-driven supervised learning (FDSL) framework for pre-training an environmental sound analysis model by leveraging acoustic signals parametrically synthesized through formula-driven methods. Specifically, we outline detailed procedures and evaluate their effectiveness for sound event detection (SED). The SED task, which involves estimating the types and timings of sound events, is particularly challenged by the difficulty of acquiring a sufficient quantity of accurately labeled training data. Moreover, it is well known that manually annotated labels often contain noise and are significantly influenced by the subjective judgment of annotators. To address these challenges, we propose a novel pre-training method that utilizes a synthetic dataset, Formula-SED, in which acoustic data are generated solely from mathematical formulas. The proposed method enables large-scale pre-training by using the synthesis parameters applied at each time step as ground-truth labels, thereby eliminating label noise and bias. We demonstrate that large-scale pre-training with Formula-SED significantly enhances model accuracy and accelerates training, as evidenced by our results on the DESED dataset used in DCASE 2023 Challenge Task 4. The project page is at https://yutoshibata07.github.io/Formula-SED/

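The core idea, synthesizing audio from closed-form formulas and reusing the synthesis parameters as frame-level labels, can be illustrated with a minimal sketch. The generator below is a hypothetical stand-in, not the actual Formula-SED pipeline: the FM-tone formula, parameter ranges, frame hop, and label scheme are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of formula-driven data synthesis for SED pre-training.
# The formula, parameter ranges, and labeling below are illustrative
# assumptions, not the published Formula-SED generation procedure.

SR = 16000          # sample rate (Hz)
CLIP_SEC = 2.0      # clip length in seconds
HOP = 160           # label resolution: one label frame per 10 ms

def synth_event(n_samples, f_carrier, f_mod, mod_index, amp):
    """Frequency-modulated tone defined purely by a closed-form formula."""
    t = np.arange(n_samples) / SR
    return amp * np.sin(2 * np.pi * f_carrier * t
                        + mod_index * np.sin(2 * np.pi * f_mod * t))

def make_clip(rng):
    n = int(SR * CLIP_SEC)
    audio = np.zeros(n)
    n_frames = n // HOP
    # Frame-level targets: here, a coarse carrier-frequency bin per frame.
    labels = np.zeros(n_frames, dtype=np.int64)

    for _ in range(rng.integers(1, 4)):            # 1-3 events per clip
        start = rng.integers(0, n // 2)
        dur = rng.integers(SR // 4, n - start)     # at least 0.25 s long
        f_c = rng.uniform(200.0, 4000.0)
        event = synth_event(dur, f_c, rng.uniform(1.0, 20.0),
                            rng.uniform(0.0, 5.0), rng.uniform(0.1, 0.8))
        audio[start:start + dur] += event

        # The synthesis parameters themselves serve as ground truth: every
        # label frame overlapped by this event is marked with the event's
        # carrier-frequency bin, with no human annotation involved.
        f0, f1 = start // HOP, (start + dur) // HOP
        labels[f0:f1] = int(f_c // 500) + 1        # 0 = silence

    return audio.astype(np.float32), labels

rng = np.random.default_rng(0)
clip, frame_labels = make_clip(rng)
print(clip.shape, frame_labels.shape)              # (32000,) (200,)
```

Because the labels are derived deterministically from the generating formulas, clips of this kind can be produced at arbitrary scale without the annotation noise or annotator bias that affects manually labeled recordings.
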
Yuto Shibata, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Hirokatsu Kataoka, Yoshimitsu Aoki

Subjects: Computing and Computer Technology; Automation Technology and Equipment

Yuto Shibata, Keitaro Tanaka, Yoshiaki Bando, Keisuke Imoto, Hirokatsu Kataoka, Yoshimitsu Aoki. Formula-Supervised Sound Event Detection: Pre-Training Without Real Data [EB/OL]. (2025-04-06) [2025-04-24]. https://arxiv.org/abs/2504.04428
