Feasibility with Language Models for Open-World Compositional Zero-Shot Learning

Abstract

Humans can easily tell if an attribute (also called state) is realistic, i.e., feasible, for an object, e.g., fire can be hot, but it cannot be wet. In Open-World Compositional Zero-Shot Learning, when all possible state-object combinations are considered as unseen classes, zero-shot predictors tend to perform poorly. Our work focuses on using external auxiliary knowledge to determine the feasibility of state-object combinations. Our Feasibility with Language Model (FLM) is a simple and effective approach that leverages Large Language Models (LLMs) to better comprehend the semantic relationships between states and objects. FLM involves querying an LLM about the feasibility of a given pair and retrieving the output logit for the positive answer. To mitigate potential misguidance of the LLM given that many of the state-object compositions are rare or completely infeasible, we observe that the in-context learning ability of LLMs is essential. We present an extensive study identifying Vicuna and ChatGPT as the best-performing models, and we demonstrate that our FLM consistently improves OW-CZSL performance across all three benchmarks.
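
The querying step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the prompt wording, the few-shot examples, the threshold, and the `toy_yes_logit` stand-in, which replaces a real LLM call with made-up numbers so the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-context examples; the paper's actual prompt is not shown here.
FEW_SHOT: List[Tuple[str, str, str]] = [
    ("hot", "fire", "Yes"),
    ("wet", "fire", "No"),
]


def build_prompt(state: str, obj: str) -> str:
    """Assemble a few-shot prompt asking whether the state is feasible for the object."""
    lines = [f"Q: Can a {o} be {s}?\nA: {a}" for s, o, a in FEW_SHOT]
    lines.append(f"Q: Can a {obj} be {state}?\nA:")
    return "\n".join(lines)


def feasibility_scores(
    states: List[str],
    objects: List[str],
    yes_logit: Callable[[str], float],
) -> Dict[Tuple[str, str], float]:
    """Score every state-object pair with the logit of the positive answer."""
    return {(s, o): yes_logit(build_prompt(s, o)) for s in states for o in objects}


def feasible_pairs(
    scores: Dict[Tuple[str, str], float], threshold: float
) -> List[Tuple[str, str]]:
    """Keep only the pairs whose score reaches the (assumed) threshold."""
    return sorted(pair for pair, score in scores.items() if score >= threshold)


def toy_yes_logit(prompt: str) -> float:
    """Stand-in for an LLM call: returns a made-up logit for the "Yes" token."""
    query = prompt.rsplit("Q: ", 1)[-1]  # look only at the final question
    return -2.0 if "fire be wet" in query else 3.0


scores = feasibility_scores(["hot", "wet"], ["fire"], toy_yes_logit)
print(feasible_pairs(scores, threshold=0.0))
```

In a real setting, `toy_yes_logit` would be replaced by a function that runs the prompt through a language model and reads the logit of the positive-answer token. How the resulting scores are combined with a zero-shot predictor is not specified in the abstract, so the thresholding shown here is only one plausible use.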

Authors: Jae Myung Kim, Stephan Alaniz, Cordelia Schmid, Zeynep Akata

Subject classification: Computing technology, computer technology

Recommended citation: Jae Myung Kim, Stephan Alaniz, Cordelia Schmid, Zeynep Akata. Feasibility with Language Models for Open-World Compositional Zero-Shot Learning [EB/OL]. (2025-05-16) [2025-06-21]. https://arxiv.org/abs/2505.11181.
