National Preprint Platform

Web Log Mining Based on Clustering Algorithm

With the rapid growth of the Internet, the volume of log data generated by websites is increasing explosively. To extract the potentially useful information contained in Web logs, this paper proposes a log mining method that combines Web log mining techniques with a clustering algorithm, and applies it to a movie and television website. The method first preprocesses the collected log data through data cleaning, user identification, time-based user session identification, and extraction of user interest features. Users are then clustered with an improved K-means algorithm, and the resulting patterns are analyzed. Finally, personalized recommendation techniques are used to recommend popular films to users. Experimental results show that the improved K-means algorithm converges noticeably faster and that the resulting clusters exhibit high cohesion.
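As a rough illustration of two steps named in the abstract, time-based session identification and user clustering, the following Python sketch may help. It is not the authors' implementation: the 30-minute inactivity timeout, the `(user_id, timestamp, url)` record layout, and the function names `sessionize` and `kmeans` are all assumptions made for this example, and the clustering shown is the standard Lloyd iteration rather than the improved K-means variant, whose details are not given on this page.

```python
from collections import defaultdict
from datetime import datetime, timedelta
import random

# Assumption: a 30-minute inactivity gap ends a session.
# The abstract does not state the threshold the authors used.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(records, timeout=SESSION_TIMEOUT):
    """Split each user's requests into sessions by inactivity gap.

    records: iterable of (user_id, timestamp, url) tuples (hypothetical schema).
    Returns {user_id: [[(timestamp, url), ...], ...]}.
    """
    by_user = defaultdict(list)
    for user_id, ts, url in records:
        by_user[user_id].append((ts, url))

    sessions = {}
    for user_id, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order requests by time
        user_sessions = [[hits[0]]]
        for prev, cur in zip(hits, hits[1:]):
            if cur[0] - prev[0] > timeout:
                user_sessions.append([cur])    # gap too long: start a new session
            else:
                user_sessions[-1].append(cur)  # same session continues
        sessions[user_id] = user_sessions
    return sessions


def kmeans(points, k, iters=100, seed=0):
    """Standard Lloyd's K-means on equal-length tuples of floats.

    This is the textbook algorithm, not the paper's improved variant.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random distinct points as initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids
```

In a pipeline like the one described, each user's sessions would presumably be reduced to a numeric interest vector (for example, viewing counts per film category, which is again an assumption) before being passed to `kmeans`.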

Xie Dongliang (谢东亮), Xu Xiang (徐翔)

Computing technology, computer technology

web log mining; K-means clustering algorithm; movie recommendation

Xie Dongliang, Xu Xiang. Web Log Mining Based on Clustering Algorithm [EB/OL]. (2016-12-12) [2025-08-10]. http://www.paper.edu.cn/releasepaper/content/201612-245.
