BiLSTM-BCRF Model for Low-Resource Named Entity Recognition
[Objective] When labeled data are scarce, existing models are constrained by the small amount of training data: their parameters are not fitted as well as expected, which leads to poor recognition performance on low-resource named entity recognition tasks. [Methods] This paper proposes a new loss function that incorporates the Bernoulli distribution so that the model fits the data better. In addition, multi-layer character feature information is fused into the BiLSTM-CRF model and combined with the Bernoulli-based loss function to build the BiLSTM-BCRF model. [Results] On 20% of the CoNLL2003 dataset and 20% of the BC5CDR dataset, the proposed BiLSTM-BCRF model improves the F1 score over the BiLSTM-CRF model by 6.16% and 3.35%, respectively. [Conclusion] The model adapts well to low-resource named entity recognition tasks. [Limitations] The model's performance in recognizing proper nouns still needs improvement.
Computing technology, computer technology
Low-resource named entity recognition; Neural network; Bernoulli distribution
.BiLSTM-BCRF Model for Low-Resource Named Entity Recognition[EB/OL].(2022-01-02)[2025-08-02].https://chinaxiv.org/abs/202201.00009.