
Combolutional Neural Networks

Source: arXiv
Abstract

Selecting appropriate inductive biases is an essential step in the design of machine learning models, especially when working with audio, where even short clips may contain millions of samples. To this end, we propose the combolutional layer: a learned-delay IIR comb filter and fused envelope detector, which extracts harmonic features in the time domain. We demonstrate the efficacy of the combolutional layer on three information retrieval tasks, evaluate its computational cost relative to other audio frontends, and provide efficient implementations for training. We find that the combolutional layer is an effective replacement for convolutional layers in audio tasks where precise harmonic analysis is important, e.g., piano transcription, speaker classification, and key detection. Additionally, the combolutional layer has several other key benefits over existing frontends, namely: low parameter count, efficient CPU inference, strictly real-valued computations, and improved interpretability.
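To make the idea of a comb filter followed by an envelope detector concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function name `comb_envelope`, the fixed integer delay, the feedback gain of 0.9, and the mean-absolute-value envelope are all assumptions chosen for illustration, whereas the abstract describes a learned delay and a fused envelope detector. The sketch only shows why a feedback (IIR) comb filter responds strongly when its delay matches the period of a harmonic signal.

```python
import numpy as np


def comb_envelope(x, delay, gain=0.9):
    """Hypothetical helper: feedback comb filter y[n] = x[n] + gain * y[n - delay],
    followed by a crude envelope estimate (mean absolute value of the output)."""
    y = np.zeros(len(x), dtype=float)
    for n in range(len(x)):
        # Feedback term is zero until the delay line has filled.
        feedback = y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + gain * feedback
    return float(np.mean(np.abs(y)))


# Assumed test signal: a 200 Hz sine at 8 kHz, so one period is 40 samples.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
x = np.sin(2 * np.pi * 200 * t)

# Delay equal to the period: the feedback reinforces the signal.
matched = comb_envelope(x, delay=40)
# Delay of 1.5 periods: the feedback arrives out of phase and attenuates it.
mismatched = comb_envelope(x, delay=60)

print(matched > mismatched)
```

In this toy setting the envelope value acts as a score for how well a candidate delay matches the signal's pitch period, which is the intuition behind extracting harmonic features in the time domain.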

Cameron Churchwell, Minje Kim, Paris Smaragdis

Computing technology, computer technology

Cameron Churchwell, Minje Kim, Paris Smaragdis. Combolutional Neural Networks[EB/OL]. (2025-07-28)[2025-08-11]. https://arxiv.org/abs/2507.21202.
