National Preprint Platform

Optimization of a prediction model of life satisfaction based on text data augmentation

Abstract

Objective: With the growth of online big data and machine learning methods, a growing number of studies combine text analysis with machine learning to predict satisfaction. When building life satisfaction prediction models, large amounts of valid labeled data are difficult to obtain. To address this problem, this study proposes text data augmentation as a way to optimize the life satisfaction prediction model.

Method: After the DLUT-Emotionontology lexicon was adapted, 357 descriptions of current life status served as the original texts, with self-rated scores on a life satisfaction scale as labels. The texts were augmented using EDA and back-translation, and prediction models were built with traditional machine learning algorithms.

Results: (1) After the DLUT-Emotionontology lexicon was adapted, the predictive ability of every model improved substantially. (2) After data augmentation, an improvement from back-translation and EDA was observed only for the linear regression model. (3) The ridge regression model trained on the original data achieved the highest Pearson correlation between predicted and actual values (r = 0.4131).

Conclusion: Improving the accuracy of feature extraction can optimize current life satisfaction prediction models, but text data augmentation based on back-translation and EDA may not be well suited to life satisfaction prediction models built on word-frequency features.
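To make the augmentation step concrete, the following is a minimal sketch of two of the four EDA (easy data augmentation) operations, random swap and random deletion, applied to tokenized texts so that every augmented copy keeps the score of its source text. It is not the authors' implementation: the function names, the parameter values, and the sample text and score are all hypothetical, and synonym replacement, random insertion, and back-translation are omitted because they need an external lexicon or translation service.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens after n_swaps swaps of two random positions."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(samples, copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Expand (tokens, score) pairs; each augmented copy inherits the score."""
    rng = random.Random(seed)
    out = []
    for tokens, score in samples:
        out.append((list(tokens), score))  # keep the original text
        for k in range(copies):
            # Alternate between the two operations (an arbitrary choice).
            if k % 2 == 0:
                new_tokens = random_swap(tokens, n_swaps, rng)
            else:
                new_tokens = random_deletion(tokens, p_delete, rng)
            out.append((new_tokens, score))
    return out


# Hypothetical tokenized description and score, for illustration only.
samples = [(["work", "is", "stable", "and", "family", "is", "well"], 28)]
augmented = augment(samples)
for tokens, score in augmented:
    print(score, " ".join(tokens))
```

Note that a swap changes word order but leaves every word count unchanged, so it adds no new information to a feature vector built purely from word frequencies.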

Subject: Computing technology, computer technology

Keywords: life satisfaction; DLUT-Emotionontology; text data augmentation; back translation; EDA; machine learning

Optimization of a prediction model of life satisfaction based on text data augmentation [EB/OL]. (2024-02-29) [2025-08-11]. https://chinaxiv.org/abs/202201.00007.
