Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation
Large language models (LLMs) increasingly serve as human-like decision-making agents in social science and applied settings. These LLM agents are typically assigned human-like characters and placed in real-life contexts. However, how these characters and contexts shape an LLM's behavior remains underexplored. This study proposes and tests methods for probing, quantifying, and modifying an LLM's internal representations in a Dictator Game, a classic behavioral experiment on fairness and prosocial behavior. We extract "vectors of variable variations" (e.g., "male" to "female") from the LLM's internal state. Manipulating these vectors during the model's inference can substantially alter how those variables relate to the model's decision-making. This approach offers a principled way to study and regulate how social concepts can be encoded and engineered within transformer-based models, with implications for alignment, debiasing, and designing AI agents for social simulations in both academic and commercial applications.
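The abstract describes extracting a direction in activation space from paired prompts and adding it back during inference, a technique generally known as activation steering. The sketch below illustrates that general recipe, not the paper's exact implementation: the model (a GPT-2 stand-in), the layer index, the steering strength, and the persona prompts are all hypothetical placeholders, and the paper's actual choices may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical choices; the abstract does not pin any of these down.
MODEL_NAME = "gpt2"   # stand-in model; the study's actual LLM may differ
LAYER = 6             # block whose residual stream we read and steer
ALPHA = 4.0           # steering strength, tuned empirically in practice

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def mean_hidden(prompt: str) -> torch.Tensor:
    """Token-averaged hidden state of `prompt` at the chosen block's output."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so block LAYER's output
    # lives at index LAYER + 1; average over the sequence dimension.
    return out.hidden_states[LAYER + 1].mean(dim=1).squeeze(0)

# A "vector of variable variation": the difference between hidden states
# for two otherwise-identical persona prompts (here, female vs. male).
v = mean_hidden("You are a female dictator deciding how much money to give.") \
  - mean_hidden("You are a male dictator deciding how much money to give.")

def steer(module, inputs, output):
    # Add the scaled variation vector to this block's residual stream.
    if isinstance(output, tuple):
        return (output[0] + ALPHA * v,) + output[1:]
    return output + ALPHA * v

# Install the hook, generate a steered continuation, then clean up.
handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("As the dictator, I have decided to give the recipient",
          return_tensors="pt")
print(tok.decode(
    model.generate(**ids, max_new_tokens=12, pad_token_id=tok.eos_token_id)[0],
    skip_special_tokens=True))
handle.remove()
```

A forward hook of this kind modifies the residual stream at inference time without touching the model's weights, which is what allows the same vector to be added, removed, or rescaled across experimental conditions.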
Ji Ma
Computing technology, computer technology
Ji Ma. Steering Prosocial AI Agents: Computational Basis of LLM's Decision Making in Social Simulation [EB/OL]. (2025-04-15) [2025-06-19]. https://arxiv.org/abs/2504.11671.