ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace

Source: arXiv
Abstract

Vector architectures are essential for boosting computing throughput. ARM provides the Scalable Vector Extension (SVE) as its next-generation, vector-length-agnostic extension beyond traditional fixed-length SIMD. This work provides a first study of the maturity and readiness of exploiting ARM and SVE in HPC. Using selected hardware performance events on the ARM-based Nvidia Grace processor and analytical models, we derive new metrics that quantify how effectively SVE vectorization reduces executed instructions and improves performance. We further propose an adapted roofline model that combines vector length and data elements to identify potential performance bottlenecks. Finally, we propose a decision tree for classifying the SVE-boosted performance in applications.
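To make the quantities named in the abstract concrete, here is a minimal Python sketch. It is not the authors' model: the abstract does not give their metric definitions or the adapted roofline formula, so the code uses the classic roofline bound and a plain instruction-count ratio as stand-ins. The function names (`lanes`, `instruction_reduction`, `roofline`) and all numbers are hypothetical placeholders, not measurements from Grace.

```python
def lanes(vector_bits: int, element_bits: int) -> int:
    """Number of data elements that fit in one vector register."""
    if vector_bits % element_bits != 0:
        raise ValueError("vector length must be a multiple of the element size")
    return vector_bits // element_bits


def instruction_reduction(scalar_instructions: int, vector_instructions: int) -> float:
    """Ratio of executed instructions without vs. with vectorization.

    Assumed stand-in metric; the paper's own definition may differ.
    """
    return scalar_instructions / vector_instructions


def roofline(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Classic roofline bound (not the adapted model from the paper).

    intensity is arithmetic intensity in FLOP per byte.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


if __name__ == "__main__":
    # All inputs below are made-up placeholders for illustration only.
    print(lanes(vector_bits=128, element_bits=64))
    print(instruction_reduction(1_000_000, 600_000))
    print(roofline(peak_gflops=100.0, bandwidth_gbs=50.0, intensity=0.5))
```

The lane count gives an upper bound on how much vectorization could shrink the instruction count for a given element width, so comparing it with the measured ratio indicates how much of the available vector width an application actually uses.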

Ruimin Shi, Gabin Schieffer, Maya Gokhale, Pei-Hung Lin, Hiren Patel, Ivy Peng

Computing technology, computer technology

Ruimin Shi, Gabin Schieffer, Maya Gokhale, Pei-Hung Lin, Hiren Patel, Ivy Peng. ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace [EB/OL]. (2025-05-14) [2025-06-14]. https://arxiv.org/abs/2505.09462