National Preprint Platform

Optimization of an Ontology-based Document Information Retrieval Model

Abstract

Traditional document retrieval models based on keyword matching have two problems: the query keywords must exactly match the document index keywords, and documents whose terms stand in a synonym or inclusion relationship to the query keywords cannot be retrieved. To address both, this paper proposes an improved document information retrieval model that introduces an ontology from the same domain as the query input in both the query stage and the document similarity matching stage. To remove the exact-match requirement, the query keywords are expanded during the query process using an ontology from the same domain as the query input. To retrieve documents related to the query keywords by synonymy or inclusion, document similarity is computed with an improved vector space model that uses a domain-specific ontology. Experiments show that the improved ontology-based document information retrieval model markedly increases both recall and precision. The gain in recall is especially pronounced for polysemous keywords.
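The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ontology, the weights `SYNONYM_WEIGHT` and `NARROWER_WEIGHT`, and the function names `expand_query`, `cosine`, and `rank` are all assumptions made for illustration, and plain term-frequency cosine similarity stands in for the paper's improved vector space model, whose details are not given here.

```python
import math
from collections import Counter

# Hypothetical toy ontology: each concept lists its synonyms and the
# narrower concepts it includes. A real system would load a domain ontology.
ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "narrower": ["sedan", "suv"]},
}

# Assumed weights for expanded terms; these values are not from the paper.
SYNONYM_WEIGHT = 0.9
NARROWER_WEIGHT = 0.7


def expand_query(terms, ontology=ONTOLOGY):
    """Return a weighted query vector: original terms plus ontology relatives."""
    weights = {}
    for term in terms:
        weights[term] = 1.0  # original query terms keep full weight
        entry = ontology.get(term, {})
        for syn in entry.get("synonyms", []):
            weights[syn] = max(weights.get(syn, 0.0), SYNONYM_WEIGHT)
        for sub in entry.get("narrower", []):
            weights[sub] = max(weights.get(sub, 0.0), NARROWER_WEIGHT)
    return weights


def cosine(query_weights, doc_tokens):
    """Cosine similarity between a weighted query and a term-frequency vector."""
    tf = Counter(doc_tokens)
    dot = sum(w * tf[t] for t, w in query_weights.items())
    q_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    d_norm = math.sqrt(sum(c * c for c in tf.values()))
    if q_norm == 0 or d_norm == 0:
        return 0.0
    return dot / (q_norm * d_norm)


def rank(query, docs, ontology=ONTOLOGY):
    """Score every document against the expanded query, best match first."""
    weights = expand_query(query.lower().split(), ontology)
    scored = [(cosine(weights, d.lower().split()), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "the automobile was parked",
        "a sedan on the road",
        "weather report today",
    ]
    for score, doc in rank("car", docs):
        print(f"{score:.3f}  {doc}")
```

In this sketch the query "car" matches none of the documents literally, so exact keyword matching would score all three at zero. After expansion, the documents mentioning "automobile" and "sedan" receive non-zero scores, which is the recall effect the abstract describes.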

Authors: Chen Yixiong (陈乙雄), Liu Jishuang (刘吉双)

Subject: Computing technology; computer technology

Keywords: information retrieval; ontology; query expansion; semantic similarity

Chen Yixiong, Liu Jishuang. Optimization of an Ontology-based Document Information Retrieval Model [EB/OL]. (2018-03-26) [2025-08-18]. http://www.paper.edu.cn/releasepaper/content/201803-225.
