AdS: Adapter-state Sharing Framework for Multimodal Sarcasm Detection
The growing prevalence of multimodal image-text sarcasm on social media poses challenges for opinion mining, especially under resource constraints. Existing approaches rely on full fine-tuning of large pre-trained models, making them unsuitable for low-resource settings. While recent parameter-efficient fine-tuning (PEFT) methods offer promise, their off-the-shelf use underperforms on complex tasks like sarcasm detection. We propose AdS (Adapter-State Sharing), a lightweight framework built on CLIP that inserts adapters only in the upper layers and introduces a novel adapter-state sharing mechanism, where textual adapters guide visual ones. This design promotes efficient cross-modal learning while preserving low-level unimodal representations. Experiments on two public benchmarks demonstrate that AdS achieves state-of-the-art results using significantly fewer trainable parameters than existing PEFT and full fine-tuning approaches.
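The paper does not spell out the mechanism in this abstract, but the described design — bottleneck adapters in the upper layers only, with the textual adapter's intermediate state guiding the visual adapter — can be sketched as follows. This is a minimal illustration under assumed details (bottleneck adapters with residual connections, mean-pooled text state injected additively into the visual bottleneck); the dimensions and the exact fusion rule are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_adapter(dim, bottleneck):
    # Bottleneck adapter: down-projection and up-projection weights.
    return {"down": rng.standard_normal((dim, bottleneck)) * 0.02,
            "up": rng.standard_normal((bottleneck, dim)) * 0.02}

def adapter_forward(params, x, shared_state=None):
    # Down-project + ReLU gives the adapter's internal (bottleneck) state.
    h = np.maximum(x @ params["down"], 0.0)
    if shared_state is not None:
        # Hypothetical sharing rule: add the text adapter's pooled state
        # so the visual adapter is conditioned on the textual one.
        h = h + shared_state
    # Residual output plus the state, which can be passed to the other modality.
    return x + h @ params["up"], h

# One assumed upper-layer pair: text adapter state guides the visual adapter.
dim, bottleneck = 512, 64
text_adapter = make_adapter(dim, bottleneck)
vision_adapter = make_adapter(dim, bottleneck)

text_feats = rng.standard_normal((2, 77, dim))  # (batch, text tokens, dim)
img_feats = rng.standard_normal((2, 50, dim))   # (batch, image patches, dim)

text_out, text_state = adapter_forward(text_adapter, text_feats)
# Mean-pool the text state so it broadcasts over image patches.
guidance = text_state.mean(axis=1, keepdims=True)
img_out, _ = adapter_forward(vision_adapter, img_feats, shared_state=guidance)
print(img_out.shape)  # (2, 50, 512)
```

Only the adapter weights would be trained in such a setup; the frozen CLIP backbone (not shown) supplies `text_feats` and `img_feats`, which is what keeps the trainable-parameter count low.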
Soumyadeep Jana, Sahil Danayak, Sanasam Ranbir Singh
Computing Technology, Computer Technology
Soumyadeep Jana, Sahil Danayak, Sanasam Ranbir Singh. AdS: Adapter-state Sharing Framework for Multimodal Sarcasm Detection [EB/OL]. (2025-07-06) [2025-07-23]. https://arxiv.org/abs/2507.04508