National Preprint Platform

Automatic Word Sense Disambiguation of Ancient Chinese Based on the Vector Space Model

Word Sense Disambiguation of the Ancient Chinese based on Vector

Abstract

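Explaining word senses is an important part of collating ancient texts, and manual interpretation is time-consuming and laborious. Drawing on research in modern Chinese word sense disambiguation, this study proposes an improved vector space model method for word sense disambiguation: supported by a knowledge base of sense-related words for ancient Chinese, the context of the polysemous word to be disambiguated and the senses of that word are mapped into a vector space model to complete the disambiguation task. Using the full-text database of ancient Chinese agricultural books as the statistical corpus, sense-tagging experiments were carried out on 10 typical ancient Chinese polysemous words, covering 29 senses and 1,836 contexts to be disambiguated. The average disambiguation accuracy was 79.5%.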

Explaining word senses is an important part of collating ancient Chinese books, and manual interpretation is very time-consuming. Drawing on research in modern word sense disambiguation, an improved unsupervised word sense disambiguation method for ancient Chinese is proposed, based on a vector space model of senses. A knowledge base of the senses of polysemous words in ancient Chinese was built, and the contexts and senses of the polysemous words were mapped into the vector space model to complete the word sense disambiguation task. A full-text database of ancient Chinese agricultural books was used as the statistical corpus, and word sense tagging experiments were carried out on ten typical polysemous words of ancient Chinese, with a total of 29 senses and 1,836 contexts. The results showed that the average accuracy of word sense disambiguation was 79.5%.
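The abstract describes mapping both the context of a polysemous word and each of its senses into a vector space and then choosing a sense. A minimal sketch of that general idea follows. It assumes a plain bag-of-words representation and cosine similarity, neither of which is confirmed by the abstract as the authors' exact weighting or similarity measure. The names `cosine`, `disambiguate`, and `SENSE_WORDS`, and the toy English tokens standing in for ancient Chinese sense words, are all hypothetical illustrations, not data from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context_tokens, sense_words):
    """Score every sense against the context and return the best one.

    context_tokens: tokens surrounding the polysemous word (assumed pre-segmented).
    sense_words: dict mapping a sense label to the words that the
                 knowledge base associates with that sense.
    """
    context_vec = Counter(context_tokens)
    scores = {
        sense: cosine(context_vec, Counter(words))
        for sense, words in sense_words.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy knowledge base for one polysemous word with two senses.
SENSE_WORDS = {
    "soldier": ["army", "general", "battle", "march"],
    "weapon": ["blade", "sharp", "forge", "iron"],
}

best, scores = disambiguate(["general", "lead", "army", "battle"], SENSE_WORDS)
print(best, scores)
```

In this toy case the context shares three of its four tokens with the "soldier" sense words and none with the "weapon" sense words, so "soldier" is selected. A real system would likely weight terms (for example by corpus statistics) rather than use raw counts.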

Hui Fuping (惠富平), Hou Hanqing (侯汉青), Zhang Changxiu (张长秀), Chang E (常娥)

Chinese linguistics

Vector Space Model; Word Sense Disambiguation; Ancient Chinese

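Hui Fuping, Hou Hanqing, Zhang Changxiu, Chang E. Automatic Word Sense Disambiguation of Ancient Chinese Based on the Vector Space Model [EB/OL]. (2012-06-01) [2025-08-02]. http://www.paper.edu.cn/releasepaper/content/201206-11.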
