|国家预印本平台
首页|基于TextRank的自动摘要优化算法

基于TextRank的自动摘要优化算法

中文摘要英文摘要

在对中文文本进行摘要提取时,传统的TextRank算法只考虑节点间的相似性,忽略了文本的其他重要信息。首先,针对中文单文档,在现有研究的基础上,使用TextRank算法,一方面考虑句子间的相似性,另一方面,使TextRank算法与文本的整体结构信息、句子的上下文信息等相结合,如文档句子或者段落的物理位置、特征句子、核心句子等有可能提升权重的句子,来生成文本的摘要候选句群;然后对得到的摘要候选句群做冗余处理,以除去候选句群中相似度较高的句子,得到最终的文本摘要。最后通过实验验证,该算法能够提高生成摘要的准确性,表明了该算法的有效性。

When Abstract: ng Chinese texts, the traditional TextRank algorithm only considers the similarity between nodes and neglects other important information of the text. Firstly, aiming at Chinese single document, on the basis of existing research, this paper uses TextRank algorithm, on the one hand, it considers the similarities between sentences, on the other hand, TextRank is combined with the overall structural information of texts and the contextual information of sentences, such as the physical position of the document sentences or paragraph, feature sentences, core sentences and other sentences that may increase the weight of the sentence, all are used to generate the digest candidate sentence group of the text. And then, removing high-similarity sentences by redundancy processing technology on the digest candidate sentence group. Finally, the experimental verification shows that the algorithm can improve the accuracy of the generated digest, indicating the effectiveness of the algorithm.

刘伟童、刘培玉、李娜娜、刘文锋

10.12074/201804.02053V1

计算技术、计算机技术自动化技术经济自动化基础理论

摘要提取extRank结构信息候选摘要句群冗余处理

刘伟童,刘培玉,李娜娜,刘文锋.基于TextRank的自动摘要优化算法[EB/OL].(2018-04-19)[2025-08-23].https://chinaxiv.org/abs/201804.02053.点此复制

评论