National Preprint Platform

Hard Sample-Aware Contrastive Clustering on Heterogeneous Graphs

高占儒 1, 李丹 1, 袁凯 1, 王新月 1, 商潮 1, 王燕 1

Author information

  • 1. School of Mathematics and Information Sciences, Yantai University, Yantai 264005, China

Abstract

In recent years, heterogeneous graph neural networks (HGNNs) have made notable progress in modeling complex data with multiple node and edge types. Contrastive learning-based approaches in particular improve the discriminative power of node representations by constructing positive and negative sample pairs, and their unsupervised nature makes them flexible and adaptable in real-world applications. However, existing methods often overlook the value of hard samples in contrastive learning and therefore fail to fully exploit the structural and semantic information in heterogeneous graphs, which limits model performance. To address this limitation, this paper introduces a hard sample mining mechanism into the heterogeneous graph contrastive learning framework to improve the model's ability to discriminate difficult samples. Specifically, we design an adaptive weighting matrix based on the similarity matrix and clustering-derived pseudo-labels; it assigns higher training weights to hard sample pairs and lower weights to easy pairs, steering the model toward challenging samples during training. Furthermore, unlike previous HGNN methods that mainly exploit graph structure and optimize negative samples, our model combines local neighborhood information with a high-order meta-path strategy to generate pseudo-labels, capturing both local and global structure. Our approach also mines hard positive pairs in addition to hard negative pairs, improving heterogeneous graph representation learning from multiple angles. Extensive experiments on multiple real-world datasets show that our method consistently outperforms existing state-of-the-art methods on heterogeneous graph clustering, validating the effectiveness and broad applicability of hard sample-aware contrastive learning in heterogeneous graph scenarios.
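The abstract does not give the formula for the adaptive weighting matrix, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes cosine similarity clipped to [0, 1], treats a pair as pseudo-positive when its two pseudo-labels agree, and scores hardness as the gap between that agreement and the similarity. The function name `hard_pair_weights` and the exponent `beta` are hypothetical and are not taken from the paper.

```python
import numpy as np

def hard_pair_weights(z, pseudo_labels, beta=2.0):
    """Illustrative pairwise weights: large for hard pairs, small for easy ones.

    Assumed form (not the paper's): w_ij = |same_ij - sim_ij| ** beta.
    """
    # Cosine similarity between embeddings, clipped to [0, 1].
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = np.clip(z @ z.T, 0.0, 1.0)
    # same[i, j] = 1 if the pseudo-labels agree (pseudo-positive pair), else 0.
    same = (pseudo_labels[:, None] == pseudo_labels[None, :]).astype(float)
    # Hard positives: same label, low similarity.
    # Hard negatives: different label, high similarity.
    hardness = np.abs(same - sim)
    # beta > 1 shrinks the weight of easy pairs more than that of hard pairs.
    return hardness ** beta

# Toy embeddings: node 3 carries pseudo-label 1 but lies close to cluster 0.
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
labels = np.array([0, 0, 1, 1])
w = hard_pair_weights(z, labels)
```

In this toy case the pair (0, 3) is a hard negative and the pair (2, 3) is a hard positive, so both receive larger weights than the easy positive pair (0, 1). Such a matrix could then multiply the per-pair terms of a contrastive loss, although how the paper combines the weights with its loss is not stated in the abstract.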

Keywords

Heterogeneous graph / Contrastive learning / Hard sample / Clustering

Cite this article

高占儒, 李丹, 袁凯, 王新月, 商潮, 王燕. Hard Sample-Aware Contrastive Clustering on Heterogeneous Graphs [EB/OL]. (2026-03-31)[2026-04-03]. http://www.paper.edu.cn/releasepaper/content/202603-304.

Subject classification

Computing technology, computer technology

First published: 2026-03-31