National Preprint Platform

zszpClean: A Rule-Based Solution to Data Clean

Abstract

Data cleaning is an important means of improving data quality in data integration. This paper proposes AzszpClean, a data cleaning approach based on dynamic rules. The method compiles the cleaning rules dynamically and combines data transformation with data filtering, which strengthens the expressive power of the cleaning process description. It also uses a rule queue so that multiple rules can be matched against the data in batch. Results from real applications show that AzszpClean provides the same functionality as hard-coded cleaning while being more efficient to implement.
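The abstract gives no implementation details, so the following is only a minimal sketch of the three ideas it names: rules compiled at run time, transformation and filtering handled in one pass, and a queue of rules applied in batch. The rule format, the field names, and the functions `compile_rule`, `build_queue`, and `clean` are assumptions made for illustration and do not come from the paper.

```python
from collections import deque

# Hypothetical rule format (not from the paper): (kind, field, expression).
# "transform" rules rewrite a field; "filter" rules accept or reject a record.
RULES = [
    ("transform", "name", "value.strip().title()"),
    ("transform", "age", "int(value)"),
    ("filter", "age", "0 <= value <= 150"),
]

# Names that rule expressions are allowed to call.
# Note: eval on rule text is only acceptable when the rules are trusted.
SAFE_GLOBALS = {"__builtins__": {"int": int, "float": float, "str": str, "len": len}}


def compile_rule(kind, field, expression):
    """Compile the rule's expression text into a code object at run time."""
    code = compile(expression, f"<rule:{kind}:{field}>", "eval")
    return kind, field, code


def build_queue(rules):
    """Compile every rule once and keep the results in queue order."""
    return deque(compile_rule(*rule) for rule in rules)


def clean(records, queue):
    """Apply all queued rules to each record; return (cleaned, rejected)."""
    cleaned, rejected = [], []
    for record in records:
        row = dict(record)  # work on a copy so the input stays untouched
        accepted = True
        for kind, field, code in queue:
            env = {"value": row.get(field), "row": row}
            try:
                result = eval(code, SAFE_GLOBALS, env)
            except Exception:
                accepted = False  # a rule that cannot be evaluated rejects the record
                break
            if kind == "transform":
                row[field] = result
            elif not result:
                accepted = False  # a filter rule evaluated to false
                break
        if accepted:
            cleaned.append(row)
        else:
            rejected.append(record)  # keep the original, untransformed record
    return cleaned, rejected
```

Under these assumptions, adding or changing a rule means editing the entries in `RULES` rather than the cleaning code itself, which is the kind of contrast with hard coding that the abstract describes.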

Li Junkui (李俊奎), Wang Yuanzhen (王元珍), Li Zhuan (李专)

Computing technology, computer technology

Data cleaning, dynamic rule compiling, rule queue

Li Junkui, Wang Yuanzhen, Li Zhuan. zszpClean: A Rule-Based Solution to Data Clean [EB/OL]. (2007-05-11) [2025-08-03]. http://www.paper.edu.cn/releasepaper/content/200705-155.
