SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure
Recent advancements in data stream processing frameworks have improved real-time data handling; however, scalability remains a significant challenge affecting throughput and latency. While studies have explored this issue on local machines and cloud clusters, research on modern high performance computing (HPC) infrastructures remains limited due to the lack of scalable measurement tools. This work presents SProBench, a novel benchmark suite designed to evaluate the performance of data stream processing frameworks in large-scale computing systems. Building on best practices, SProBench incorporates a modular architecture, offers native support for SLURM-based clusters, and integrates with popular stream processing frameworks such as Apache Flink, Apache Spark Streaming, and Apache Kafka Streams. Experiments conducted on HPC clusters demonstrate its scalability, delivering throughput that surpasses existing benchmarks by more than tenfold. The features of SProBench, including complete customization options, built-in automated experiment management tools, interoperability, and an open-source license, distinguish it as a benchmark suite tailored to the needs of modern data stream processing frameworks.
Apurv Deepak Kulkarni, Siavash Ghiasvand
Computing technology, computer technology
Apurv Deepak Kulkarni, Siavash Ghiasvand. SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure [EB/OL]. (2025-04-03) [2025-07-16]. https://arxiv.org/abs/2504.02364.