Detecting value-expressive text posts in Russian social media

Abstract

Basic values are concepts or beliefs that pertain to desirable end-states and transcend specific situations. Studying personal values in social media can illuminate how and why societal values evolve, especially when stimuli-based methods such as surveys are inefficient, for instance in hard-to-reach populations. On the other hand, user-generated content is driven by the massive use of stereotyped, culturally defined speech constructions rather than authentic expressions of personal values. We aimed to find a model that can accurately detect value-expressive posts on the Russian social network VKontakte. A training dataset of 5,035 posts was annotated by three experts, 304 crowd-workers, and ChatGPT. Crowd-workers and experts showed only moderate agreement in categorizing posts. ChatGPT was more consistent but struggled with spam detection. We applied an ensemble of human- and AI-assisted annotation involving an active learning approach, and subsequently trained several classification models using embeddings from various pre-trained transformer-based language models. The best performance was achieved with embeddings from a fine-tuned rubert-tiny2 model, yielding high value detection quality (F1 = 0.75, F1-macro = 0.80). This model provides a crucial step toward the study of values within and between Russian social media users.
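
The abstract reports two scores: F1 for the value-expressive class and macro-averaged F1. As a minimal sketch of how the two differ, the Python below computes both by hand. It assumes a binary labelling (1 = value-expressive, 0 = other), which the abstract does not state explicitly; the function names and the toy labels are hypothetical illustrations and are not taken from the paper.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy labels (hypothetical): 1 = value-expressive post, 0 = other post.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

print(f"F1 (value-expressive class): {f1_for_class(y_true, y_pred, 1):.3f}")
print(f"F1-macro: {macro_f1(y_true, y_pred):.3f}")
```

Because macro averaging weights every class equally, a macro score above the positive-class score, as in the figures reported above, suggests the remaining class or classes were easier to detect than value-expressive posts.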

Authors: Maria Milkova, Maksim Rudnev, Lidia Okolskaya

Subject classification: Indo-European languages

Recommended citation: Maria Milkova, Maksim Rudnev, Lidia Okolskaya. Detecting value-expressive text posts in Russian social media [EB/OL]. (2025-07-08) [2025-07-20]. https://arxiv.org/abs/2312.08968.
