Keyword Extraction Method for Social Q&A Communities Based on Multi-Attribute Weighting

[Purpose/significance] Existing keyword extraction methods are poorly suited to social Q&A communities, whose texts are short, colloquial, and sparse, and they rarely consider how user attention affects the importance of a word. As a result, they cannot extract keywords from such texts effectively. This paper therefore proposes a multi-attribute weighted keyword extraction method for social Q&A communities. [Method/process] The method improves traditional TF-IDF by introducing a tuning function and part-of-speech information, and it measures word weight through a linear weighted fusion of four user-attention attributes: the numbers of answers, follows, views, and comments. [Result/conclusion] Experiments show that the method extracts keywords from social Q&A community texts more effectively.
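The weighting scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the log-damped `tuning` function, the values in `POS_WEIGHT` and `ATTENTION_WEIGHT`, the max-scaling of the four counts, and the way the attention score multiplies the adjusted TF-IDF term. The function and field names (`score_words`, `answers`, `follows`, `views`, `comments`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Assumed part-of-speech multipliers (hypothetical values, not from the paper).
POS_WEIGHT = {"noun": 1.0, "verb": 0.6, "adj": 0.4}
DEFAULT_POS_WEIGHT = 0.2  # assumed weight for any other part of speech

# Assumed linear weights for the four user-attention attributes (sum to 1).
ATTENTION_WEIGHT = {"answers": 0.4, "follows": 0.3, "views": 0.1, "comments": 0.2}


def tuning(tf):
    """Assumed tuning function: log damping of the raw term frequency."""
    return 1.0 + math.log(tf)


def attention_score(post, maxima):
    """Linear weighted fusion of the four attributes, each scaled to [0, 1]."""
    return sum(
        weight * (post[name] / maxima[name] if maxima[name] else 0.0)
        for name, weight in ATTENTION_WEIGHT.items()
    )


def score_words(posts):
    """Return {word: weight}, summed over all posts.

    Each post is a dict with 'tokens' (a list of (word, pos) pairs) and the
    four integer counts named in ATTENTION_WEIGHT.
    """
    n = len(posts)
    doc_freq = Counter()
    for post in posts:
        doc_freq.update({word for word, _ in post["tokens"]})
    maxima = {name: max(p[name] for p in posts) for name in ATTENTION_WEIGHT}

    scores = Counter()
    for post in posts:
        tf = Counter(word for word, _ in post["tokens"])
        pos_of = dict(post["tokens"])
        boost = 1.0 + attention_score(post, maxima)  # assumed combination rule
        for word, count in tf.items():
            idf = math.log((n + 1) / (doc_freq[word] + 1)) + 1.0  # smoothed IDF
            pos_w = POS_WEIGHT.get(pos_of[word], DEFAULT_POS_WEIGHT)
            scores[word] += tuning(count) * idf * pos_w * boost
    return dict(scores)
```

Under these assumptions, a word that appears repeatedly as a noun in a heavily answered and followed question ranks above a word of similar frequency in a post that drew little attention. In practice the tokens would come from a Chinese word segmenter with part-of-speech tagging, and the weights would need to be tuned on data.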

Yu Bengong (余本功), Li Ting (李婷), Yang Ying (杨颖)

Computing technology, computer technology

social Q&A community; keyword extraction; TF-IDF; multi-attribute weighting

余本功,李婷,杨颖.基于多属性加权的社会化问答社区关键词提取方法[EB/OL].(2023-08-26)[2025-08-23].https://chinaxiv.org/abs/202308.00371.