Identifying Algorithmic and Domain-Specific Bias in Parliamentary Debate Summarisation
The automated summarisation of parliamentary debates using large language models (LLMs) offers a promising way to make complex legislative discourse more accessible to the public. However, such summaries must not only be accurate and concise but also equitably represent the views and contributions of all speakers. This paper explores the use of LLMs to summarise plenary debates from the European Parliament and investigates the algorithmic and representational biases that emerge in this context. We propose a structured, multi-stage summarisation framework that improves textual coherence and content fidelity, while enabling the systematic analysis of how speaker attributes -- such as speaking order or political affiliation -- influence the visibility and accuracy of their contributions in the final summaries. Through our experiments using both proprietary and open-weight LLMs, we find evidence of consistent positional and partisan biases, with certain speakers systematically under-represented or misattributed. Our analysis shows that these biases vary by model and summarisation strategy, with hierarchical approaches offering the greatest potential to reduce disparity. These findings underscore the need for domain-sensitive evaluation metrics and ethical oversight in the deployment of LLMs for democratic applications.
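The abstract does not specify how speaker visibility is quantified, so the following is a purely illustrative sketch of one way such an analysis could be set up. It assumes hypothetical record fields (speaker, order, group, text), uses a crude name-mention heuristic as a stand-in for the paper's actual representation metric, and aggregates visibility rates by a speaker attribute such as political group.

```python
"""Illustrative sketch (not the paper's method): estimate how often each
speaker's contribution surfaces in a generated debate summary, then
aggregate visibility by a speaker attribute (e.g. political group)."""
from collections import defaultdict

def speaker_visibility(speeches, summary):
    """speeches: list of dicts with hypothetical keys
    'speaker', 'order', 'group', 'text'. Returns per-speaker flags
    marking whether the speaker is mentioned by name in the summary
    (a crude proxy; a real evaluation would align summary content
    with each speech rather than match names)."""
    summary_lower = summary.lower()
    return {s["speaker"]: s["speaker"].lower() in summary_lower
            for s in speeches}

def visibility_by_attribute(speeches, visible, attribute):
    """Fraction of visible speakers per attribute value, e.g.
    attribute='group' for partisan bias, or a bucketed 'order'
    field for positional bias."""
    counts = defaultdict(lambda: [0, 0])  # value -> [visible, total]
    for s in speeches:
        counts[s[attribute]][1] += 1
        counts[s[attribute]][0] += int(visible[s["speaker"]])
    return {k: shown / total for k, (shown, total) in counts.items()}

# Toy example (entirely hypothetical data):
speeches = [
    {"speaker": "A. Example", "order": 1, "group": "EPP", "text": "..."},
    {"speaker": "B. Sample", "order": 2, "group": "S&D", "text": "..."},
]
summary = "A. Example argued for the proposal; others raised objections."
visible = speaker_visibility(speeches, summary)
print(visibility_by_attribute(speeches, visible, "group"))
```

Disparities in these per-attribute visibility rates across models and summarisation strategies would correspond to the kind of positional and partisan biases the paper reports.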
Eoghan Cunningham, James Cross, Derek Greene
Computing technology; computer science
Eoghan Cunningham, James Cross, Derek Greene. Identifying Algorithmic and Domain-Specific Bias in Parliamentary Debate Summarisation [EB/OL]. (2025-07-16) [2025-08-18]. https://arxiv.org/abs/2507.14221.