A Morph Resolution Method Based on Effective Context Information
On social networks, users often coin morphs to stand in for certain entity names, and resolving these morphs back to their original target words is an important task in natural language processing. To address the limited accuracy of existing morph resolution methods, this paper proposes a morph resolution method based on effective context information. The method uses pointwise mutual information (PMI) to extract the effective context information of morphs and candidate target words, and integrates it into an autoencoder model to obtain more accurate embeddings of the morphs and their candidates. It then computes the similarity between each morph and its candidate target words from these embeddings and ranks the candidates accordingly. Experiments show that the method significantly outperforms several current mainstream methods and improves the accuracy of morph resolution.
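The context-selection and ranking steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: PMI is estimated from sentence-level co-occurrence, "effective" context is taken to mean context words with PMI above a threshold, and a plain cosine similarity over PMI-weighted context vectors stands in for the autoencoder embeddings. The function names (`pmi_table`, `effective_context`, `rank_candidates`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Sentence-level PMI for every pair of words that co-occur at least once."""
    n = len(sentences)
    word_df = Counter()   # number of sentences containing each word
    pair_df = Counter()   # number of sentences containing each word pair
    for sent in sentences:
        vocab = sorted(set(sent))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    pmi = {}
    for (a, b), c in pair_df.items():
        # PMI(a, b) = log( P(a, b) / (P(a) * P(b)) )
        score = math.log((c * n) / (word_df[a] * word_df[b]))
        pmi[(a, b)] = pmi[(b, a)] = score
    return pmi


def effective_context(word, pmi, threshold=0.0):
    """Context words whose PMI with `word` exceeds the (assumed) threshold."""
    return {b: s for (a, b), s in pmi.items() if a == word and s > threshold}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(morph, candidates, sentences):
    """Rank candidate targets by similarity of their effective contexts.

    Stand-in for the paper's autoencoder embeddings: each word is
    represented directly by its PMI-weighted effective context.
    """
    pmi = pmi_table(sentences)
    morph_vec = effective_context(morph, pmi)
    scored = [(c, cosine(morph_vec, effective_context(c, pmi))) for c in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy corpus: "little_leo" plays the role of a morph.
sentences = [
    ["little_leo", "wins", "oscar"],
    ["dicaprio", "wins", "oscar"],
    ["little_leo", "stars", "film"],
    ["dicaprio", "stars", "film"],
    ["obama", "visits", "china"],
    ["obama", "gives", "speech"],
    ["weather", "is", "nice"],
    ["traffic", "is", "bad"],
]

ranked = rank_candidates("little_leo", ["obama", "dicaprio"], sentences)
print(ranked[0][0])  # the candidate sharing the morph's contexts ranks first
```

In this toy setting the morph and its true target share all of their high-PMI context words, so the target is ranked above the unrelated candidate. A real system would replace the sparse context vectors with learned embeddings, as the abstract describes.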
Sha Ying (沙灜), Wang Bin (王斌), You Jirong (游绩榕), Liang Qi (梁棋)
Computing technology, computer technology
morph; morph resolution; autoencoder; effective context information; word embedding; neural network
Sha Ying, Wang Bin, You Jirong, Liang Qi. A Morph Resolution Method Based on Effective Context Information [EB/OL]. (2018-04-17) [2025-08-04]. https://chinaxiv.org/abs/201804.02159.