National Preprint Platform

A Keyword Extraction Method Based on Neural Networks with Joint Training


Keyword extraction has become an active research topic in natural language processing (NLP) and information retrieval. Many language tasks depend on it, including long-text classification, automatic summarization, machine translation, and dialogue systems. This paper proposes a keyword extraction algorithm that combines strong memorization with generalization. The model consists of a linear model and a deep neural network. The linear model exploits the memorization capacity of a shallow model to learn the relationship between statistical features and keywords. The deep network takes as input the projection of each word vector onto the text vector, learning the semantic connection between keyword and text and thereby improving generalization. Joint training of the linear model and the deep neural network improves the model's accuracy and robustness. The method is compared with frequency-based extraction, Term Frequency-Inverse Document Frequency (TF-IDF), and TextRank. On the same test set, the proposed model outperforms the baselines in precision, recall, and F-score.
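The architecture described in the abstract can be illustrated with a minimal sketch. The abstract does not give layer sizes, the feature set, the loss, or the optimizer, so everything below is an assumption made for illustration: the class name `WideAndDeep`, the two statistical features (imagined as a normalized term frequency and a position score), a single tanh hidden layer, log loss, and plain SGD. The only parts taken from the abstract are the vector projection of a word vector onto the text vector and the idea that the linear and deep parts share one output and are updated together.

```python
import math
import random

def project(word, text):
    """Vector projection of `word` onto `text`: (w.t / t.t) * t."""
    scale = sum(w * t for w, t in zip(word, text)) / sum(t * t for t in text)
    return [scale * t for t in text]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class WideAndDeep:
    """Hypothetical model: a linear part on statistical features plus a
    one-hidden-layer network on projection vectors, summed into one logit."""

    def __init__(self, n_stat, n_embed, n_hidden=4, seed=0):
        rnd = random.Random(seed)
        self.w_wide = [0.0] * n_stat
        self.W1 = [[rnd.gauss(0, 0.1) for _ in range(n_embed)]
                   for _ in range(n_hidden)]
        self.w2 = [rnd.gauss(0, 0.1) for _ in range(n_hidden)]
        self.b = 0.0

    def forward(self, stats, proj):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, proj)))
                  for row in self.W1]
        logit = (sum(w * x for w, x in zip(self.w_wide, stats))
                 + sum(w * h for w, h in zip(self.w2, hidden)) + self.b)
        return sigmoid(logit), hidden

    def loss(self, data):
        total = 0.0
        for stats, proj, y in data:
            p, _ = self.forward(stats, proj)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
        return total / len(data)

    def sgd_step(self, stats, proj, y, lr=0.1):
        """Joint update: one error signal drives both components."""
        p, hidden = self.forward(stats, proj)
        g = p - y  # gradient of the log loss w.r.t. the shared logit
        for j, h in enumerate(hidden):
            g_pre = g * self.w2[j] * (1.0 - h * h)  # back through tanh
            self.w2[j] -= lr * g * h
            self.W1[j] = [w - lr * g_pre * x for w, x in zip(self.W1[j], proj)]
        self.w_wide = [w - lr * g * x for w, x in zip(self.w_wide, stats)]
        self.b -= lr * g

# Toy data (made up): one text vector and four candidate words, each with
# assumed statistical features, an embedding, and a keyword label.
text_vec = [1.0, 0.0, 1.0]
candidates = [
    ([0.9, 0.2], [1.0, 0.1, 0.8], 1),
    ([0.8, 0.1], [0.7, -0.2, 0.9], 1),
    ([0.2, 0.7], [-0.5, 0.9, 0.1], 0),
    ([0.1, 0.9], [0.1, 1.0, -0.6], 0),
]
data = [(stats, project(vec, text_vec), y) for stats, vec, y in candidates]

model = WideAndDeep(n_stat=2, n_embed=3)
before = model.loss(data)
for _ in range(200):
    for stats, proj, y in data:
        model.sgd_step(stats, proj, y)
after = model.loss(data)
print(f"log loss before: {before:.3f}, after: {after:.3f}")
```

Because both components feed a single logit, each update adjusts the linear weights and the network weights from the same error term, which is one common reading of "joint training" as opposed to training the two models separately and averaging them.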

She Chundong (佘春东), Liu Shaohua (刘绍华), You Huanying (尤焕英)

Subject areas: computing and computer technology; economics of automation technology; fundamental theory of automation

Keywords: keyword extraction; neural networks; deep learning; joint training

She Chundong, Liu Shaohua, You Huanying. A Keyword Extraction Method Based on Neural Networks with Joint Training [EB/OL]. (2020-03-13) [2025-08-16]. http://www.paper.edu.cn/releasepaper/content/202003-155.
