Design and implementation of a high-performance text index system
Traditional text indexing techniques consume large amounts of space and rely on word segmentation that is often inaccurate. To address these problems, a high-performance text index system was designed and implemented. The system uses a compressed full-text self-index algorithm, which reduces index space and avoids the influence of natural-language word segmentation methods, and it extends the search functionality with a wildcard search algorithm. On high-performance many-core CPU processors, the system supports multi-threaded parallel processing, which improves processing speed. Experimental results show that the system reduces the space consumption of the text index to about 50% of the original text, giving it high practical value.
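The abstract does not name the specific self-index algorithm, so the following is only a minimal sketch of one well-known member of that family, an FM-index built on the Burrows-Wheeler transform. It is meant to illustrate why a self-index needs no word segmentation: matching works character by character, so Chinese and English text are handled the same way. The names `build_bwt` and `ToyFMIndex` are hypothetical, and the sketch assumes the text contains no NUL character. It also stores the full suffix array and occurrence tables uncompressed, which a real compressed index would sample or encode; wildcard search and multi-threading are not covered.

```python
from collections import Counter


def build_bwt(text):
    """Return (bwt, suffix_array) for text plus a sentinel (hypothetical helper)."""
    s = text + "\0"  # sentinel; assumed absent from text, sorts before all other characters
    # Naive suffix sorting: acceptable for a demonstration, too slow for large texts.
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    bwt = "".join(s[i - 1] for i in sa)  # character preceding each sorted suffix
    return bwt, sa


class ToyFMIndex:
    """Uncompressed toy FM-index; illustrative only, not the paper's system."""

    def __init__(self, text):
        self.bwt, self.sa = build_bwt(text)
        counts = Counter(self.bwt)
        # first[c]: number of characters in the text strictly smaller than c.
        self.first = {}
        total = 0
        for c in sorted(counts):
            self.first[c] = total
            total += counts[c]
        # occ[c][i]: number of occurrences of c in bwt[:i].
        self.occ = {c: [0] * (len(self.bwt) + 1) for c in counts}
        for i, ch in enumerate(self.bwt):
            for c in self.occ:
                self.occ[c][i + 1] = self.occ[c][i] + (1 if c == ch else 0)

    def _range(self, pattern):
        """Backward search: half-open suffix-array range of suffixes starting with pattern."""
        lo, hi = 0, len(self.bwt)
        for c in reversed(pattern):
            if c not in self.first:
                return 0, 0
            lo = self.first[c] + self.occ[c][lo]
            hi = self.first[c] + self.occ[c][hi]
            if lo >= hi:
                return 0, 0
        return lo, hi

    def count(self, pattern):
        """Number of occurrences of pattern in the indexed text."""
        lo, hi = self._range(pattern)
        return hi - lo

    def locate(self, pattern):
        """Sorted start positions of pattern in the indexed text."""
        lo, hi = self._range(pattern)
        return sorted(self.sa[lo:hi])
```

For example, `ToyFMIndex("文本索引与全文索引").locate("索引")` finds both occurrences of the substring without any dictionary or segmenter, because the search runs over raw characters rather than over tokens.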
Wang Chunlu, Zhang Yu, Liu Yanbing, Lu Wei, Zhou Meizi
Computing technology, computer technology
computer application; text index; full-text index; self-index; wildcard search
Wang Chunlu, Zhang Yu, Liu Yanbing, Lu Wei, Zhou Meizi. Design and implementation of a high-performance text index system [EB/OL]. (2013-11-20) [2025-08-21]. http://www.paper.edu.cn/releasepaper/content/201311-351.