Structure-Aware Multimodal Diffusion Model for Multimodal Knowledge Graph Completion
Abstract
Multimodal knowledge graphs (MKGs) are widely used in recommendation and question-answering systems, but they remain substantially incomplete, which motivates multimodal knowledge graph completion (MKGC) methods. Most existing approaches rely on discriminative modelling that maximises a conditional likelihood and neglect the underlying data distribution, which leaves them ill-equipped to capture complex real-world relationships. To address this, this paper proposes DiffusionCom, a structure-aware multimodal diffusion model for MKGC. The model formulates MKGC as generating, from noise, the joint probability distribution over (subject, relation) pairs and candidate tail entities. We also introduce a Structure-Aware Multimodal Pre-training (SAMPT) model, which captures structural information through a Multimodal Graph Attention Network (MGAT) and performs adaptive fusion. DiffusionCom outperforms existing state-of-the-art models on the FB15k-237-IMG and WN18-IMG datasets.
Keywords
Computer Science and Technology/Diffusion Models/Multimodal Knowledge Graphs
Cite this article
Huang Wei (黄伟), Liang Meiyu (梁美玉). Structure-Aware Multimodal Diffusion Model for Multimodal Knowledge Graph Completion [EB/OL]. (2026-03-23) [2026-03-25]. http://www.paper.edu.cn/releasepaper/content/202603-213.
Subject classification
Computing technology, computer technology