Greedy Scheduling Algorithm Based on Best Parallelism in Storm
Storm, an open-source distributed real-time computing framework, is widely used in the Internet, finance, e-commerce, and other fields. By default, Storm uses a round-robin scheduling policy and relies on the parallelism that the user configures for each Topology; when that configuration is unreasonable, Topology processing latency increases and throughput drops. To address this problem, this paper proposes a greedy scheduling algorithm for Storm based on best parallelism. During scheduling, the algorithm first solves for the best parallelism of each component in the Topology and then applies a greedy strategy to minimize the network communication overhead between nodes. Experimental comparison with the default scheduling algorithm, the online scheduling algorithm, and the hot-edge scheduling algorithm shows that the proposed algorithm effectively reduces Storm's processing latency and improves system throughput and resource utilization.
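The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of greedy placement that reduces inter-node traffic, not the authors' method. It assumes the per-component parallelism has already been chosen and that pairwise traffic rates between executors are known; the names `greedy_schedule`, `inter_node_traffic`, and the sample executor names are hypothetical.

```python
def greedy_schedule(traffic, num_nodes, slots_per_node):
    """Assign executors to nodes, handling the heaviest-communicating pairs first.

    traffic: dict mapping (executor_a, executor_b) -> tuples per second (assumed known).
    Returns a dict mapping executor -> node index.
    """
    placement = {}
    free = [slots_per_node] * num_nodes

    def place(executor, preferred=None):
        if executor in placement:
            return
        # Co-locate with the partner if possible, else use the emptiest node.
        if preferred is not None and free[preferred] > 0:
            node = preferred
        else:
            node = max(range(num_nodes), key=lambda n: free[n])
        if free[node] == 0:
            raise ValueError("not enough slots for all executors")
        placement[executor] = node
        free[node] -= 1

    # Greedy step: visit executor pairs in descending order of traffic.
    for (a, b), _rate in sorted(traffic.items(), key=lambda kv: kv[1], reverse=True):
        if a in placement and b not in placement:
            place(b, placement[a])
        elif b in placement and a not in placement:
            place(a, placement[b])
        else:
            place(a)
            place(b, placement[a])
    return placement


def inter_node_traffic(traffic, placement):
    """Sum the traffic on edges whose endpoints sit on different nodes."""
    return sum(rate for (a, b), rate in traffic.items()
               if placement[a] != placement[b])


# Hypothetical four-executor pipeline on two nodes with two slots each.
traffic = {("spout", "boltA"): 100, ("boltA", "boltB"): 80, ("boltB", "sink"): 10}
placement = greedy_schedule(traffic, num_nodes=2, slots_per_node=2)
print(placement, inter_node_traffic(traffic, placement))
```

In this toy input, an alternating round-robin assignment would put every edge across nodes, whereas the greedy pass keeps the heaviest edge local. How the paper computes the best parallelism before this step is not stated in the abstract and is left out here.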
Xiong Anping (熊安萍), Jiang Yaxiong (蒋亚雄), Duan Hangbiao (段杭彪)
Computing technology; computer technology
Real-time computing; Storm; best parallelism; greedy strategy; scheduling algorithm
Xiong Anping, Jiang Yaxiong, Duan Hangbiao. Greedy Scheduling Algorithm Based on Best Parallelism in Storm [EB/OL]. (2018-04-19) [2025-08-11]. https://chinaxiv.org/abs/201804.02051.