Gated Multimodal Graph Learning for Personalized Recommendation
Multimodal recommendation has emerged as a promising solution to alleviate the cold-start and sparsity problems in collaborative filtering by incorporating rich content information, such as product images and textual descriptions. However, effectively integrating heterogeneous modalities into a unified recommendation framework remains a challenge. Existing approaches often rely on fixed fusion strategies or complex architectures, which may fail to adapt to variance in modality quality or introduce unnecessary computational overhead. In this work, we propose RLMultimodalRec, a lightweight and modular recommendation framework that combines graph-based user modeling with adaptive multimodal item encoding. The model employs a gated fusion module to dynamically balance the contributions of the visual and textual modalities, enabling fine-grained, content-aware item representations. In parallel, a two-layer LightGCN encoder captures high-order collaborative signals by propagating embeddings over the user-item interaction graph without relying on nonlinear transformations. We evaluate our model on a real-world dataset from the Amazon product domain. Experimental results demonstrate that RLMultimodalRec consistently outperforms several competitive baselines, including collaborative filtering, visual-aware, and multimodal GNN-based methods. The proposed approach achieves significant improvements in top-K recommendation metrics while maintaining scalability and interpretability, making it suitable for practical deployment.
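The following is a minimal sketch of the two components named in the abstract, written in PyTorch under stated assumptions: the gate formulation (a sigmoid over the concatenated projected modalities), the tanh projections, the dimension names, and the averaging of LightGCN layers are illustrative choices, not the paper's confirmed implementation details.

```python
# Hedged sketch: gated visual/textual fusion plus two-layer LightGCN-style
# propagation. Dimensions, activations, and module names are assumptions.
import torch
import torch.nn as nn


class GatedMultimodalFusion(nn.Module):
    """Fuse visual and textual item features with a learned sigmoid gate."""

    def __init__(self, visual_dim: int, text_dim: int, out_dim: int):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, out_dim)
        self.text_proj = nn.Linear(text_dim, out_dim)
        # The gate sees both projected modalities and decides, per dimension,
        # how much of each enters the fused item representation.
        self.gate = nn.Linear(2 * out_dim, out_dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        v = torch.tanh(self.visual_proj(visual))
        t = torch.tanh(self.text_proj(text))
        g = torch.sigmoid(self.gate(torch.cat([v, t], dim=-1)))
        return g * v + (1.0 - g) * t


def lightgcn_propagate(embeddings: torch.Tensor,
                       norm_adj: torch.Tensor,
                       num_layers: int = 2) -> torch.Tensor:
    """Propagate stacked user/item embeddings over the normalized
    user-item interaction graph.

    LightGCN applies no feature transforms or nonlinearities: each layer is a
    sparse matrix product, and the final embedding averages all layers.
    """
    layer_embs = [embeddings]
    for _ in range(num_layers):
        layer_embs.append(torch.sparse.mm(norm_adj, layer_embs[-1]))
    return torch.stack(layer_embs, dim=0).mean(dim=0)
```

In this sketch the fused item embeddings would be concatenated with user embeddings to form the initial layer-0 matrix, and `norm_adj` is the symmetrically normalized adjacency of the user-item bipartite graph; both are assumptions about how the pieces connect rather than a description of the authors' exact pipeline.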
Sibei Liu, Yuanzhe Zhang, Xiang Li, Yunbo Liu, Chengwei Feng, Hao Yang
Computing Technology; Computer Technology
Sibei Liu, Yuanzhe Zhang, Xiang Li, Yunbo Liu, Chengwei Feng, Hao Yang. Gated Multimodal Graph Learning for Personalized Recommendation[EB/OL]. (2025-05-30)[2025-06-20]. https://arxiv.org/abs/2506.00107.