Informed, but Not Always Improved: Challenging the Benefit of Background Knowledge in GNNs
In complex and low-data domains such as biomedical research, incorporating background knowledge (BK) graphs, such as protein-protein interaction (PPI) networks, into graph-based machine learning pipelines is a promising research direction. However, while BK is often assumed to improve model performance, its actual contribution and the impact of imperfect knowledge remain poorly understood. In this work, we investigate the role of BK in an important real-world task: cancer subtype classification. Surprisingly, we find that (i) state-of-the-art GNNs using BK perform no better than uninformed models like linear regression, and (ii) their performance remains largely unchanged even when the BK graph is heavily perturbed. To understand these unexpected results, we introduce an evaluation framework that employs (i) a synthetic setting where the BK is clearly informative and (ii) a set of perturbations that simulate various imperfections in BK graphs. With this framework, we test the robustness of BK-aware models in both synthetic and real-world biomedical settings. Our findings reveal that GNN architectures must be carefully aligned with the characteristics of the BK, and that such alignment holds the potential for significant performance improvements.
Kutalmış Coşkun, Ivo Kavisanczki, Amin Mirzaei, Tom Siegl, Bjarne C. Hiller, Stefan Lüdtke, Martin Becker
Subjects: Oncology; Biological science research methods; Biological science research techniques
Kutalmış Coşkun, Ivo Kavisanczki, Amin Mirzaei, Tom Siegl, Bjarne C. Hiller, Stefan Lüdtke, Martin Becker. Informed, but Not Always Improved: Challenging the Benefit of Background Knowledge in GNNs [EB/OL]. (2025-05-16) [2025-06-03]. https://arxiv.org/abs/2505.11023.