
Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models

Source: arXiv
Abstract

There is a growing interest in assessing the personality traits of large language models (LLMs). However, traditional personality assessments based on self-report questionnaires may fail to capture their true behavioral nuances due to inherent biases and meta-knowledge contamination. This paper introduces a novel multi-observer framework for LLM personality assessment that draws inspiration from informant-report methods in psychology. Instead of relying solely on self-assessments, our approach employs multiple observer agents configured with a specific relationship context (e.g., family, friend, or workplace) to simulate interactive scenarios with a subject LLM. These observers engage in dialogues and subsequently provide ratings across the Big Five personality dimensions. Our experiments reveal that LLMs possess systematic biases in self-report personality ratings. Moreover, aggregating observer ratings effectively reduces non-systematic biases and achieves optimal reliability with 5-7 observers. The findings highlight the significant impact of relationship context on personality perception and demonstrate that a multi-observer paradigm yields a more robust and context-sensitive evaluation of LLM personality traits.
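The framework described in the abstract can be pictured as a small pipeline: spawn several observer agents, each configured with a relationship context, let each hold a short dialogue with the subject LLM, collect per-observer Big Five ratings, and average them across observers. The sketch below is only an illustration of that idea, not the authors' implementation; the ChatFn interface, the prompts, the 1-5 rating scale, and the parsing logic are all assumptions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical chat interface: takes a system prompt and a dialogue history,
# returns the next message. Any LLM backend could be plugged in here.
ChatFn = Callable[[str, List[Dict[str, str]]], str]

BIG_FIVE = ["openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism"]


@dataclass
class ObserverAgent:
    """One observer configured with a relationship context (assumed design)."""
    relationship: str   # e.g. "family", "friend", "workplace"
    chat: ChatFn

    def converse(self, subject_chat: ChatFn, n_turns: int = 6) -> List[Dict[str, str]]:
        """Run a short dialogue between this observer and the subject LLM."""
        observer_sys = (f"You are chatting with someone you know from a "
                        f"{self.relationship} context. Try to get a sense of "
                        f"their personality.")
        subject_sys = "You are having a casual conversation. Respond naturally."
        history: List[Dict[str, str]] = []
        message = "Hi! How have you been lately?"
        for _ in range(n_turns):
            history.append({"role": "observer", "content": message})
            reply = subject_chat(subject_sys, history)
            history.append({"role": "subject", "content": reply})
            message = self.chat(observer_sys, history)
        return history

    def rate(self, history: List[Dict[str, str]]) -> Dict[str, float]:
        """Ask the observer to rate the subject on each Big Five trait (1-5)."""
        prompt = ("Based on the conversation, rate the other speaker on each "
                  "Big Five trait from 1 (very low) to 5 (very high), "
                  "one 'trait: score' line per trait.")
        answer = self.chat("You are a careful personality rater.",
                           history + [{"role": "observer", "content": prompt}])
        return parse_ratings(answer)


def parse_ratings(text: str) -> Dict[str, float]:
    """Tiny parser for 'trait: score' lines; unparsed traits fall back to 3.0."""
    ratings = {trait: 3.0 for trait in BIG_FIVE}
    for line in text.lower().splitlines():
        for trait in BIG_FIVE:
            if line.strip().startswith(trait) and ":" in line:
                try:
                    ratings[trait] = float(line.split(":", 1)[1].strip())
                except ValueError:
                    pass
    return ratings


def multi_observer_assessment(subject_chat: ChatFn,
                              observers: List[ObserverAgent]) -> Dict[str, float]:
    """Average per-observer ratings, which damps rater-specific (non-systematic) noise."""
    per_observer = [obs.rate(obs.converse(subject_chat)) for obs in observers]
    return {trait: mean(r[trait] for r in per_observer) for trait in BIG_FIVE}
```

Under this reading, a panel of 5-7 observers (the range the abstract reports as giving optimal reliability) would produce an averaged trait profile that can then be contrasted with the subject LLM's own self-report scores.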

Yin Jou Huang, Rafik Hadfi

Subject areas: Computing Technology, Computer Technology

Yin Jou Huang, Rafik Hadfi. Beyond Self-Reports: Multi-Observer Agents for Personality Assessment in Large Language Models [EB/OL]. (2025-04-11) [2025-05-01]. https://arxiv.org/abs/2504.08399.
