Graph Synthetic Out-of-Distribution Exposure with Large Language Models
Graph Synthetic Out-of-Distribution Exposure with Large Language Models
Out-of-distribution (OOD) detection in graphs is critical for ensuring model robustness in open-world and safety-sensitive applications. Existing graph OOD detection approaches typically train an in-distribution (ID) classifier on ID data alone, then apply post-hoc scoring to detect OOD instances. While OOD exposure - adding auxiliary OOD samples during training - can improve detection, current graph-based methods often assume access to real OOD nodes, which is often impractical or costly. In this paper, we present GOE-LLM, a framework that leverages Large Language Models (LLMs) to achieve OOD exposure on text-attributed graphs without using any real OOD nodes. GOE-LLM introduces two pipelines: (1) identifying pseudo-OOD nodes from the initially unlabeled graph using zero-shot LLM annotations, and (2) generating semantically informative synthetic OOD nodes via LLM-prompted text generation. These pseudo-OOD nodes are then used to regularize ID classifier training and enhance OOD detection awareness. Empirical results on multiple benchmarks show that GOE-LLM substantially outperforms state-of-the-art methods without OOD exposure, achieving up to a 23.5% improvement in AUROC for OOD detection, and attains performance on par with those relying on real OOD labels for exposure.
Haoyan Xu、Zhengtao Yao、Ziyi Wang、Zhan Cheng、Xiyang Hu、Mengyuan Li、Yue Zhao
计算技术、计算机技术
Haoyan Xu,Zhengtao Yao,Ziyi Wang,Zhan Cheng,Xiyang Hu,Mengyuan Li,Yue Zhao.Graph Synthetic Out-of-Distribution Exposure with Large Language Models[EB/OL].(2025-04-29)[2025-05-24].https://arxiv.org/abs/2504.21198.点此复制
评论