Representation Decomposition for Learning Similarity and Contrastness Across Modalities for Affective Computing

Source: arXiv
Abstract

Multi-modal affective computing aims to automatically recognize and interpret human attitudes from diverse data sources such as images and text, thereby enhancing human-computer interaction and emotion understanding. Existing approaches typically rely on unimodal analysis or straightforward fusion of cross-modal information, and thus fail to capture the complex and conflicting evidence presented across different modalities. In this paper, we propose a novel LLM-based approach for affective computing that explicitly decomposes visual and textual representations into shared (modality-invariant) and modality-specific components. Specifically, our approach first encodes and aligns the input modalities using pre-trained multi-modal encoders, then employs a representation decomposition framework to separate common emotional content from modality-unique cues, and finally integrates these decomposed signals via an attention mechanism to form a dynamic soft prompt for a multi-modal LLM. Extensive experiments on three representative affective computing tasks, namely multi-modal aspect-based sentiment analysis, multi-modal emotion analysis, and hateful meme detection, demonstrate the effectiveness of our approach, which consistently outperforms strong baselines and state-of-the-art models.
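
To make the decompose-then-fuse pipeline from the abstract concrete, here is a minimal PyTorch sketch: each modality's pooled encoding is split into a shared and a modality-specific component, and learnable prompt queries attend over the decomposed signals to produce a soft prompt for an LLM. All module names, dimensions, and the averaging used for the shared component are illustrative assumptions, not the authors' implementation.

# A minimal sketch of the decomposition idea from the abstract.
# All names and dimensions are hypothetical, not the paper's code.
import torch
import torch.nn as nn

class DecomposedFusion(nn.Module):
    def __init__(self, enc_dim: int = 768, llm_dim: int = 4096, prompt_len: int = 8):
        super().__init__()
        # Projections that split an encoder output into shared vs. specific parts.
        self.shared_proj = nn.Linear(enc_dim, enc_dim)    # modality-invariant
        self.img_spec_proj = nn.Linear(enc_dim, enc_dim)  # image-specific
        self.txt_spec_proj = nn.Linear(enc_dim, enc_dim)  # text-specific
        # Learnable queries attend over the decomposed signals to build
        # a dynamic soft prompt for the multi-modal LLM.
        self.queries = nn.Parameter(torch.randn(prompt_len, enc_dim))
        self.attn = nn.MultiheadAttention(enc_dim, num_heads=8, batch_first=True)
        self.to_llm = nn.Linear(enc_dim, llm_dim)

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        # img_feat, txt_feat: (batch, enc_dim) pooled outputs of a
        # pre-trained multi-modal encoder (e.g., a CLIP-style model).
        shared = 0.5 * (self.shared_proj(img_feat) + self.shared_proj(txt_feat))
        img_spec = self.img_spec_proj(img_feat)
        txt_spec = self.txt_spec_proj(txt_feat)
        # Stack decomposed signals as attention keys/values: (batch, 3, enc_dim).
        signals = torch.stack([shared, img_spec, txt_spec], dim=1)
        queries = self.queries.unsqueeze(0).expand(img_feat.size(0), -1, -1)
        prompt, _ = self.attn(queries, signals, signals)
        return self.to_llm(prompt)  # (batch, prompt_len, llm_dim) soft prompt

# Example: build a soft prompt from dummy pooled features.
fusion = DecomposedFusion()
soft_prompt = fusion(torch.randn(2, 768), torch.randn(2, 768))
print(soft_prompt.shape)  # torch.Size([2, 8, 4096])

In a full system, the returned soft prompt would be prepended to the LLM's input embeddings; the paper additionally learns the decomposition with similarity and contrast objectives, which this sketch omits.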

Yuanhe Tian, Pengsen Cheng, Guoqing Jin, Lei Zhang, Yan Song

Subject: Computing Technology; Computer Technology

Yuanhe Tian, Pengsen Cheng, Guoqing Jin, Lei Zhang, Yan Song. Representation Decomposition for Learning Similarity and Contrastness Across Modalities for Affective Computing [EB/OL]. (2025-06-08) [2025-06-15]. https://arxiv.org/abs/2506.07086.
