
Visual Instance-aware Prompt Tuning


Abstract

Visual Prompt Tuning (VPT) has emerged as a parameter-efficient fine-tuning paradigm for vision transformers, with conventional approaches utilizing dataset-level prompts that remain the same across all input instances. We observe that this strategy results in sub-optimal performance due to high variance in downstream datasets. To address this challenge, we propose Visual Instance-aware Prompt Tuning (ViaPT), which generates instance-aware prompts based on each individual input and fuses them with dataset-level prompts, leveraging Principal Component Analysis (PCA) to retain important prompting information. Moreover, we reveal that, from a conceptual standpoint, VPT-Deep and VPT-Shallow represent two corner cases that fail to effectively capture instance-specific information, while random dimension reduction on prompts only yields performance between the two extremes. Instead, ViaPT overcomes these limitations by balancing dataset-level and instance-level knowledge, while reducing the number of learnable parameters compared to VPT-Deep. Extensive experiments across 34 diverse datasets demonstrate that our method consistently outperforms state-of-the-art baselines, establishing a new paradigm for analyzing and optimizing visual prompts for vision transformers.
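The abstract names two ingredients, PCA-based retention of prompt information and fusion of instance-level with dataset-level prompts, without giving details. The sketch below is only an illustration of that general idea and is not the authors' implementation: the function names `pca_compress_prompts` and `fuse_prompts`, the prompt shapes, the choice of `k`, and the use of concatenation as the fusion step are all assumptions made here for the example.

```python
import numpy as np

def pca_compress_prompts(instance_prompts: np.ndarray, k: int) -> np.ndarray:
    """Summarize (n, d) prompt tokens by their top-k principal directions.

    Hypothetical helper: returns k tokens of dimension d, each a principal
    direction scaled by its singular value. This is an assumed design, not
    the procedure from the paper.
    """
    # Center the tokens so that SVD yields principal components.
    centered = instance_prompts - instance_prompts.mean(axis=0, keepdims=True)
    # centered = U @ diag(s) @ vt; the rows of vt are principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    return s[:k, None] * vt[:k]

def fuse_prompts(dataset_prompts: np.ndarray,
                 instance_prompts: np.ndarray,
                 k: int) -> np.ndarray:
    """Concatenate dataset-level prompts with PCA-compressed instance prompts.

    Concatenation along the token axis is an assumption for illustration.
    """
    compressed = pca_compress_prompts(instance_prompts, k)
    return np.concatenate([dataset_prompts, compressed], axis=0)

# Toy data standing in for learned prompts (shapes are arbitrary assumptions).
rng = np.random.default_rng(0)
dataset_prompts = rng.normal(size=(4, 16))   # shared across all inputs
instance_prompts = rng.normal(size=(8, 16))  # generated for a single input
fused = fuse_prompts(dataset_prompts, instance_prompts, k=2)
print(fused.shape)
```

In this toy setup the fused sequence holds the four dataset-level tokens followed by two compressed instance-level tokens; in a real model such a sequence would be prepended to the patch embeddings of a vision transformer layer.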

Authors: Xi Xiao, Yunbei Zhang, Xingjian Li, Tianyang Wang, Xiao Wang, Yuxiang Wei, Jihun Hamm, Min Xu


Subject classification: Computing technology, computer technology

Recommended citation: Xi Xiao, Yunbei Zhang, Xingjian Li, Tianyang Wang, Xiao Wang, Yuxiang Wei, Jihun Hamm, Min Xu. Visual Instance-aware Prompt Tuning [EB/OL]. (2025-07-10) [2025-07-25]. https://arxiv.org/abs/2507.07796.
