Vision Graph Prompting via Semantic Low-Rank Decomposition
Vision GNN (ViG) demonstrates superior performance by representing images as graph structures, providing a more natural way to capture irregular semantic patterns beyond traditional grid- or sequence-based representations. To efficiently adapt ViG to downstream tasks, parameter-efficient fine-tuning techniques such as visual prompting have become increasingly essential. However, existing prompting methods are primarily designed for Transformer-based models and neglect the rich topological relationships among nodes and edges in graph-based representations, which limits their capacity to model complex semantics. In this paper, we propose Vision Graph Prompting (VGP), a novel framework tailored for vision graph structures. Our core insight is that semantically connected components in the graph exhibit low-rank properties. Building on this observation, we introduce a semantic low-rank prompting method that decomposes low-rank semantic features and integrates them with prompts on vision graph topologies, capturing both global structural patterns and fine-grained semantic dependencies. Extensive experiments demonstrate that our method significantly improves ViG's transfer performance on diverse downstream tasks, achieving results comparable to full fine-tuning while maintaining parameter efficiency. Our code is available at https://github.com/zhoujiahuan1991/ICML2025-VGP.
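The sketch below is a minimal, illustrative PyTorch rendering of the idea described above: a learnable low-rank bottleneck that prompts the node features of a frozen vision graph backbone. The class name SemanticLowRankPrompt, the rank value, and the additive way the prompt is injected are assumptions for illustration only; the paper's actual design may differ (see the official repository linked above).

import torch
import torch.nn as nn

class SemanticLowRankPrompt(nn.Module):
    """Learnable low-rank prompt applied to ViG node features (hypothetical sketch).

    Node features X (N x D) are passed through a rank-r bottleneck (D -> r -> D)
    and the result is added back as a prompt, so only 2*D*r parameters are
    trained while the ViG backbone stays frozen.
    """

    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)   # D -> r projection
        self.up = nn.Linear(rank, dim, bias=False)     # r -> D projection
        nn.init.zeros_(self.up.weight)                 # start as an identity mapping

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_nodes, dim) node features from a frozen ViG stage
        return x + self.up(self.down(x))               # inject the low-rank semantic prompt

if __name__ == "__main__":
    nodes = torch.randn(2, 196, 192)                   # toy batch of graph node features
    prompt = SemanticLowRankPrompt(dim=192, rank=4)
    print(prompt(nodes).shape)                         # torch.Size([2, 196, 192])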
Zixiang Ai, Zichen Liu, Jiahuan Zhou
Computing Technology; Computer Technology
Zixiang Ai, Zichen Liu, Jiahuan Zhou. Vision Graph Prompting via Semantic Low-Rank Decomposition [EB/OL]. (2025-05-07) [2025-06-06]. https://arxiv.org/abs/2505.04121.