Exploring Robustness of LLMs to Paraphrasing Based on Sociodemographic Factors

Source: arXiv
Abstract

Despite their linguistic prowess, LLMs have been shown to be vulnerable to small input perturbations. While robustness to local adversarial changes has been studied, robustness to global modifications such as different linguistic styles remains underexplored. Therefore, we take a broader approach to explore a wider range of variations across sociodemographic dimensions. We extend the SocialIQA dataset to create diverse paraphrased sets conditioned on sociodemographic factors (age and gender). The assessment aims to provide a deeper understanding of LLMs in (a) their capability of generating demographic paraphrases with engineered prompts and (b) their capabilities in interpreting real-world, complex language scenarios. We also perform a reliability analysis of the generated paraphrases looking into linguistic diversity and perplexity as well as manual evaluation. We find that demographic-based paraphrasing significantly impacts the performance of language models, indicating that the subtleties of linguistic variation remain a significant challenge. We will make the code and dataset available for future research.

Pulkit Arora, Akbar Karimi, Lucie Flek

Subjects: Linguistics; Commonly Used Foreign Languages

Pulkit Arora, Akbar Karimi, Lucie Flek. Exploring Robustness of LLMs to Paraphrasing Based on Sociodemographic Factors [EB/OL]. (2025-07-04) [2025-07-18]. https://arxiv.org/abs/2501.08276.
