National Preprint Platform (国家预印本平台)

Research on Text Categorization Techniques Based on Hierarchical Adaptation (基于层次自适应的文本分类技术的研究)

Abstract

As an effective technology for organizing information and making it easier to locate, text categorization has developed rapidly over the past decade or so. This paper builds a hierarchical adaptive classifier for data with multi-level categories. When training at the higher level, documents are randomly sampled according to the number of texts in each category, so that the training data is balanced across categories. At the lower level, training documents are drawn from each category proportionally. At classification time, a decision tree method is used to classify each document and produce the final result. Experimental results show that hierarchical adaptive classification performs better than plain hierarchical classification.
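The two sampling steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the sampling targets, so downsampling every top-level class to the size of the smallest one, the `fraction` parameter, and all function names are assumptions. The "decision tree method" is read here as a top-down walk through the category tree, which is also an assumption.

```python
import random


def balance_top_level(docs_by_class, rng):
    """Randomly downsample each top-level class to the smallest class size.

    Assumption: "data homogenization" means equal class sizes.
    """
    target = min(len(docs) for docs in docs_by_class.values())
    return {label: rng.sample(docs, target)
            for label, docs in docs_by_class.items()}


def proportional_sample(docs_by_class, fraction, rng):
    """Keep the same fraction of every lower-level class.

    Assumption: non-empty classes keep at least one document.
    """
    sampled = {}
    for label, docs in docs_by_class.items():
        k = min(len(docs), max(1, round(len(docs) * fraction)))
        sampled[label] = rng.sample(docs, k)
    return sampled


def classify_top_down(doc, top_classifier, sub_classifiers):
    """Pick a top-level class first, then a subclass within it.

    top_classifier and sub_classifiers are hypothetical callables
    standing in for whatever trained models are used at each level.
    """
    top_label = top_classifier(doc)
    return top_label, sub_classifiers[top_label](doc)
```

Passing a seeded `random.Random` instance as `rng` keeps the sampling reproducible between runs, which makes it easier to compare the balanced and unbalanced training sets.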

Authors: Cui Guanning (崔冠宁), Bai Zhongying (白中英)

Subject category: Computing technology and computer technology

Keywords: text categorization; hierarchical classification; hierarchical adaptation; vector space model

Cui Guanning, Bai Zhongying. Research on Text Categorization Techniques Based on Hierarchical Adaptation [EB/OL]. (2013-07-12) [2025-08-16]. http://www.paper.edu.cn/releasepaper/content/201307-191.
