Towards Fusion of Neural Audio Codec-based Representations with Spectral for Heart Murmur Classification via Bandit-based Cross-Attention Mechanism

Source: arXiv
Abstract

In this study, we focus on heart murmur classification (HMC) and hypothesize that combining neural audio codec representations (NACRs), such as EnCodec, with spectral features (SFs), such as MFCC, will yield superior performance. We believe such fusion will exploit their complementary behavior: NACRs excel at capturing fine-grained acoustic patterns such as rhythm changes, whereas SFs focus on frequency-domain properties, such as harmonic structure and spectral energy distribution, that are crucial for analyzing complex heart sounds. To this end, we propose BAOMI, a novel framework built on a bandit-based cross-attention mechanism for effective fusion. Here, an agent assigns more weight to the most important heads in the multi-head cross-attention mechanism, which helps mitigate noise. BAOMI outperforms individual NACRs, individual SFs, and baseline fusion techniques, setting a new state of the art.
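
The abstract does not say which bandit algorithm, reward signal, or weighting rule BAOMI uses, so the following is only a minimal sketch of the general idea of weighting cross-attention heads by an agent's estimates. Everything in it is an assumption made for illustration: the names HeadBandit and bandit_cross_attention, the moving-average value update, the softmax over head values, the use of NACR frames as queries and SF frames as keys and values, and the omission of learned projections.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class HeadBandit:
    """Hypothetical agent keeping one running value estimate per attention head."""

    def __init__(self, num_heads, lr=0.1):
        self.values = np.zeros(num_heads)
        self.lr = lr

    def weights(self):
        # Softmax over value estimates: heads with higher estimates get more weight.
        return softmax(self.values)

    def update(self, head_rewards):
        # Moving-average update toward the observed per-head rewards.
        # How rewards are obtained is not specified in the abstract (assumption).
        self.values += self.lr * (np.asarray(head_rewards, dtype=float) - self.values)


def bandit_cross_attention(nacr, sf, num_heads, bandit):
    """Cross-attention with bandit-weighted heads.

    nacr: (Tq, d) array used as queries (assumed role of the codec features).
    sf:   (Tk, d) array used as keys and values (assumed role of spectral features).
    d must be divisible by num_heads. No learned projections are applied.
    """
    tq, d = nacr.shape
    tk = sf.shape[0]
    dh = d // num_heads
    # Split the feature dimension into heads: (H, T, dh).
    q = nacr.reshape(tq, num_heads, dh).transpose(1, 0, 2)
    kv = sf.reshape(tk, num_heads, dh).transpose(1, 0, 2)
    scores = q @ kv.transpose(0, 2, 1) / np.sqrt(dh)  # (H, Tq, Tk)
    attn = softmax(scores, axis=-1)
    heads = attn @ kv  # (H, Tq, dh)
    # Scale each head by its bandit weight; multiplying by num_heads means
    # uniform weights leave the heads unchanged.
    w = bandit.weights()
    heads = heads * (w * num_heads)[:, None, None]
    # Concatenate heads back to (Tq, d).
    return heads.transpose(1, 0, 2).reshape(tq, d)
```

In a training loop, one would call bandit_cross_attention on each batch, derive some per-head reward (for example, a per-head contribution to a validation metric, which is again an assumption), and pass it to HeadBandit.update so that less useful heads are gradually down-weighted.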

Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Priyabrata Mallick, Santanu Roy, Arun Balaji Buduru, Rajesh Sharma

Subjects: Medical Research Methods; Basic Medicine

Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Priyabrata Mallick, Santanu Roy, Arun Balaji Buduru, Rajesh Sharma. Towards Fusion of Neural Audio Codec-based Representations with Spectral for Heart Murmur Classification via Bandit-based Cross-Attention Mechanism [EB/OL]. (2025-06-01) [2025-06-25]. https://arxiv.org/abs/2506.01148
