An Improved Fuzzy Time-Interval Sequential Pattern Mining Algorithm

Abstract

Sequential pattern mining is an important tool for discovering patterns that occur frequently in transaction data. As the accuracy demanded of data mining has risen, sequential pattern mining has branched into many extended research directions. Among them, sequential pattern mining with time intervals has many real-world applications, such as the analysis of bank transaction records and Web logs. Existing work on time-interval sequential pattern mining, however, suffers from poor mining accuracy when the time intervals differ widely and from a reliance on empirically chosen threshold values. This paper therefore proposes TiCMiner, a fuzzy time-interval sequential pattern mining algorithm. Its main idea is to apply fuzzy clustering to the time intervals in order to partition them, and then to mine sequential patterns from the resulting time-interval sequences. Because this step adds processing complexity for the time intervals, the paper also improves the join and pruning methods for time-interval sequential pattern mining so that running efficiency is not affected. Finally, the algorithm is tested on a real bank transaction dataset and on classic sequential pattern mining datasets; the experiments show that it improves the accuracy of sequential pattern mining without increasing running time.
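The partitioning idea described in the abstract can be illustrated with a minimal sketch. It is not the TiCMiner implementation, whose details the abstract does not give. It assumes plain fuzzy c-means (FCM) on one-dimensional time gaps, a fuzzifier m = 2, and a hard assignment of each gap to its highest-membership cluster. The function names `membership_row`, `fuzzy_c_means_1d`, and `label_sequence`, the `I0`/`I1` labels, and the toy data are all hypothetical.

```python
def membership_row(x, centers, m=2.0):
    """Fuzzy membership of scalar x in each cluster (standard FCM formula)."""
    dists = [abs(x - v) for v in centers]
    if 0.0 in dists:
        # x coincides with a center: give that cluster full membership.
        hit = dists.index(0.0)
        return [1.0 if k == hit else 0.0 for k in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exponent for dj in dists) for dk in dists]


def fuzzy_c_means_1d(values, c=2, m=2.0, max_iter=100, tol=1e-6):
    """Cluster scalar time gaps with FCM; returns the sorted cluster centers."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centers spread evenly over the range.
    centers = [lo + (hi - lo) * (k + 0.5) / c for k in range(c)]
    for _ in range(max_iter):
        rows = [membership_row(x, centers, m) for x in values]
        updated = []
        for k in range(c):
            weights = [row[k] ** m for row in rows]
            total = sum(weights)
            # Each center is the membership-weighted mean of the gaps.
            updated.append(sum(w * x for w, x in zip(weights, values)) / total
                           if total > 0 else centers[k])
        shift = max(abs(a - b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol:
            break
    return sorted(centers)


def label_sequence(events, centers, m=2.0):
    """Rewrite [(item, time), ...] as [item, gap_label, item, ...]."""
    out = [events[0][0]]
    for (_, t_prev), (item, t_cur) in zip(events, events[1:]):
        row = membership_row(t_cur - t_prev, centers, m)
        # Hard-assign the gap to its highest-membership cluster.
        out.append("I%d" % row.index(max(row)))
        out.append(item)
    return out


# Toy gaps (arbitrary time units) with two clearly separated groups.
gaps = [1, 2, 3, 100, 101, 102]
centers = fuzzy_c_means_1d(gaps, c=2)
labeled = label_sequence([("a", 0), ("b", 2), ("c", 103)], centers)
print(centers, labeled)
```

Once every gap carries a cluster label, the labelled sequences can be passed to a sequential pattern miner that treats the labels as part of the pattern. The cluster boundaries come from the data rather than from hand-set thresholds, which is the motivation the abstract states.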

Guo Yanhui (郭燕慧), Ge Huihan (葛慧晗)

Computing technology; computer technology

Computer Technology; Sequential Pattern Mining; Fuzzy C-Means Clustering (FCM); Time Interval

Guo Yanhui, Ge Huihan. An Improved Fuzzy Time-Interval Sequential Pattern Mining Algorithm [EB/OL]. (2019-01-22) [2025-08-21]. http://www.paper.edu.cn/releasepaper/content/201901-134.
