National Preprint Platform

Hierarchical Attention-Based Multimodal Fusion Recommendation Algorithm

段博涵 (Duan Bohan) 1, 孟祥武 (Meng Xiangwu) 1

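Author information

  • 1. Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia (Beijing University of Posts and Telecommunications), Beijing 100876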

Abstract

Recommender systems face highly heterogeneous and extremely sparse data. Traditional collaborative filtering models based on item IDs perform poorly in cold-start and sparse-interaction scenarios; multimodal recommendation alleviates these problems by fusing modalities such as images and text. However, existing multimodal methods often suffer from semantic misalignment, noise interference, and ineffective fusion of features from different modalities. This paper proposes a hierarchical attention-based multimodal fusion recommendation model (HMA-Rec) that aims to improve the accuracy and robustness of multimodal recommender systems. The model uses a three-layer hierarchical attention mechanism: intra-modal self-attention, cross-modal interaction attention, and gated aggregation. Experimental results show that HMA-Rec outperforms current mainstream multimodal recommendation models in both accuracy and robustness. Its hierarchical attention mechanism achieves fine-grained feature alignment across modalities and performs especially well when the data are noisy or sparse.
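The abstract names the three layers but gives no equations, so the following is only a minimal sketch of how such a pipeline could be wired together, not the authors' HMA-Rec implementation. It assumes two modalities (image and text), plain scaled dot-product attention without learned projections, mean pooling, and a single scalar sigmoid gate. All function names, dimensions, and the random inputs are hypothetical.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys, values):
    """Scaled dot-product attention: returns one output vector per query."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def mean_pool(vectors):
    # Average a sequence of vectors into a single vector.
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def gated_fuse(img_vec, txt_vec, gate_weights, gate_bias=0.0):
    """Assumed gate form: scalar g = sigmoid(w . [img; txt] + b).

    Returns (g * img + (1 - g) * txt, g).
    """
    z = dot(gate_weights, img_vec + txt_vec) + gate_bias  # list concat
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * a + (1.0 - g) * b for a, b in zip(img_vec, txt_vec)]
    return fused, g

def hierarchical_fusion(img_feats, txt_feats, gate_weights):
    # Layer 1: intra-modal self-attention (each modality attends to itself).
    img_self = attend(img_feats, img_feats, img_feats)
    txt_self = attend(txt_feats, txt_feats, txt_feats)
    # Layer 2: cross-modal interaction (queries from one modality,
    # keys and values from the other).
    img_cross = attend(img_self, txt_self, txt_self)
    txt_cross = attend(txt_self, img_self, img_self)
    # Layer 3: gated aggregation of the pooled modality vectors.
    return gated_fuse(mean_pool(img_cross), mean_pool(txt_cross), gate_weights)

# Hypothetical toy input: 3 image patches and 5 text tokens, dimension 4.
random.seed(0)
dim = 4
img_feats = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
txt_feats = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
gate_weights = [random.gauss(0, 0.1) for _ in range(2 * dim)]
item_vec, gate = hierarchical_fusion(img_feats, txt_feats, gate_weights)
```

Here `item_vec` is a fused item representation of length `dim`, and `gate` indicates how much weight the image branch received. A trained model would add learned query, key, and value projections and would likely use a per-dimension gate; both are omitted to keep the sketch short.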

Keywords

Multimodal recommendation / Cross-modal alignment / Multimodal fusion / Gated aggregation

Cite this article

段博涵 (Duan Bohan), 孟祥武 (Meng Xiangwu). Hierarchical Attention-Based Multimodal Fusion Recommendation Algorithm [EB/OL]. (2026-03-02) [2026-03-04]. http://www.paper.edu.cn/releasepaper/content/202603-18.

Subject classification

Computing technology, computer technology

First published: 2026-03-02