National Preprint Platform (国家预印本平台)

Novel Spam Filtering Method Based on Weighted Naive Bayes

Building on a study of the mutual information (MI) feature extraction algorithm used in content-based email filtering, together with the naive Bayes classification algorithm, this paper introduces the concept of Feature Term Discrimination (FTD) and analyzes how feature terms differ in their ability to separate classes. It proposes a feature extraction algorithm that takes both FTD and MI into account. By further incorporating FTD into the design of the classifier, it presents a weighted naive Bayes algorithm that solves the content-based email filtering problem efficiently. Experimental results show that the improved algorithm achieves markedly higher recall, precision, and accuracy, and that its classification performance is more stable.
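The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the common text-categorization form of MI, log(P(t,c) / (P(t)P(c))), and the usual weighted naive Bayes rule in which each term's log-likelihood is multiplied by a per-term weight. The function names and the `placeholder_weights` formula (1 + |P(t|a) - P(t|b)|) are hypothetical stand-ins for FTD, which the abstract does not define.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term, target):
    """MI between term presence and a class: log(P(t,c) / (P(t) * P(c)))."""
    n = len(docs)
    n_t = sum(1 for d in docs if term in d)
    n_c = sum(1 for y in labels if y == target)
    n_tc = sum(1 for d, y in zip(docs, labels) if term in d and y == target)
    if n_t == 0 or n_c == 0 or n_tc == 0:
        return float("-inf")  # term and class never co-occur
    return math.log((n_tc * n) / (n_t * n_c))


def train(docs, labels, alpha=1.0):
    """Estimate class priors and Laplace-smoothed P(term present | class).

    docs is a list of sets of terms; labels is the parallel list of classes.
    """
    classes = sorted(set(labels))
    vocab = set().union(*docs)
    n_docs = Counter(labels)
    prior = {c: n_docs[c] / len(docs) for c in classes}
    cond = {}
    for c in classes:
        df = Counter(t for d, y in zip(docs, labels) if y == c for t in d)
        cond[c] = {t: (df[t] + alpha) / (n_docs[c] + 2 * alpha) for t in vocab}
    return prior, cond


def placeholder_weights(cond, a, b):
    """Hypothetical stand-in for FTD (an assumption, not the paper's formula)."""
    return {t: 1.0 + abs(cond[a][t] - cond[b][t]) for t in cond[a]}


def classify(doc, prior, cond, weights):
    """Weighted naive Bayes: argmax_c log P(c) + sum_t w_t * log P(t|c).

    Simplification: only terms present in the message contribute.
    """
    scores = {}
    for c in prior:
        s = math.log(prior[c])
        for t in doc:
            if t in cond[c]:
                s += weights.get(t, 1.0) * math.log(cond[c][t])
        scores[c] = s
    return max(scores, key=scores.get)


# Toy corpus, invented purely for illustration.
docs = [{"free", "win", "money"}, {"free", "offer"},
        {"meeting", "report"}, {"project", "meeting", "lunch"}]
labels = ["spam", "spam", "ham", "ham"]
prior, cond = train(docs, labels)
weights = placeholder_weights(cond, "spam", "ham")
```

With all weights set to 1.0 the rule reduces to ordinary naive Bayes over the present terms, which makes it easy to compare the weighted and unweighted decisions on the same data.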

Huang Ziwei (黄自威), Wang Hui (王辉)

Computing technology, computer technology

Spam; feature extraction; feature term discrimination; weighted naive Bayes

Huang Ziwei, Wang Hui. Novel Spam Filtering Method Based on Weighted Naive Bayes [EB/OL]. (2015-02-04) [2025-08-06]. http://www.paper.edu.cn/releasepaper/content/201502-47.
