Compression on Hadoop: A Case Study of Improving I/O Performance on Hadoop
Compression is an important I/O tuning technique: it reduces the I/O load and thereby improves I/O performance. Disk I/O speeds have not kept pace with CPU speeds, which have followed Moore's law, so I/O often becomes the bottleneck in data processing. How to use compression for I/O tuning in Hadoop has not yet been fully studied. Based on experiments, this paper derives a compression usage policy that helps Hadoop users decide when and where to use compression and which kind of compression to use. Under this policy, some Hadoop applications that use compression appropriately improve their efficiency by up to 65%.
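The trade-off behind the question of when to compress can be illustrated with a rough cost model: compression helps when the CPU time spent compressing, plus the time to write the smaller output, is less than the time to write the raw data. The sketch below is not the policy derived in the paper, which the abstract does not spell out. It uses Python's standard-library codecs as stand-ins for Hadoop's codecs, and the function names and the `bandwidth` parameter (assumed disk throughput in bytes per second) are hypothetical.

```python
import bz2
import lzma
import time
import zlib

# Stand-in codecs from the Python standard library (an assumption made for
# illustration; these are not the Hadoop codec implementations).
CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def io_time(num_bytes, bandwidth):
    """Seconds needed to move num_bytes at `bandwidth` bytes per second."""
    return num_bytes / bandwidth


def compression_pays_off(raw_size, compressed_size, compress_seconds, bandwidth):
    """True when compressing and then writing beats writing the raw data."""
    with_compression = compress_seconds + io_time(compressed_size, bandwidth)
    return with_compression < io_time(raw_size, bandwidth)


def compare_codecs(data, bandwidth):
    """Measure each codec on `data` and apply the cost model above."""
    results = {}
    for name, compress in CODECS.items():
        start = time.perf_counter()
        compressed = compress(data)
        seconds = time.perf_counter() - start
        results[name] = {
            "ratio": len(compressed) / len(data),  # smaller is better
            "seconds": seconds,                    # measured CPU-side cost
            "pays_off": compression_pays_off(
                len(data), len(compressed), seconds, bandwidth
            ),
        }
    return results


if __name__ == "__main__":
    sample = b"hadoop mapreduce intermediate record\n" * 50_000
    # 50 MB/s is an assumed disk throughput, not a measured value.
    for codec, stats in compare_codecs(sample, bandwidth=50e6).items():
        print(codec, stats)
```

The model ignores decompression cost on the read side and network transfer during the shuffle, both of which a full study of Hadoop would need to account for.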
向丽辉 (Xiang Lihui), 缪力 (Miao Li)
Computing technology, computer technology
Hadoop, MapReduce, I/O, Compression
向丽辉, 缪力. Compression on Hadoop: A Case Study of Improving I/O Performance on Hadoop [EB/OL]. (2013-05-09) [2025-06-23]. http://www.paper.edu.cn/releasepaper/content/201305-125.