
Accelerator Codesign as Non-Linear Optimization


Source: arXiv
Abstract

We propose an optimization approach for determining both hardware and software parameters for the efficient implementation of a family of applications, dense stencil computations, on programmable GPGPUs. We first introduce a simple analytical model for the silicon area usage of accelerator architectures and a workload characterization of stencil computations. We combine this characterization with a parametric execution time model and formulate a mathematical optimization problem that seeks to maximize a common objective function of all the hardware and software parameters. The solution to this problem therefore "solves" the codesign problem: it simultaneously chooses software and hardware parameters to optimize total performance. We validate this approach by proposing architectural variants of the NVIDIA Maxwell GTX-980 (respectively, Titan X) specifically tuned to a predetermined workload of four common 2D stencils (Heat, Jacobi, Laplacian, and Gradient) and two 3D ones (Heat and Laplacian). Our model predicts that performance could improve by 28% (respectively, 33%) with simple changes to the hardware parameters, such as adapting coarse- and fine-grained parallelism by changing the number of streaming multiprocessors and the number of compute cores each contains. We propose a set of Pareto-optimal design points that exploit the trade-off between performance and silicon area, and show that additionally eliminating GPU caches yields a further 2-fold improvement.
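The notion of Pareto-optimal design points over performance and silicon area can be illustrated with a minimal sketch. This is not the authors' model or code: the `DesignPoint` type, the `pareto_front` helper, and every number below are hypothetical placeholders chosen only to show how dominated designs are filtered out. A design is kept if no other design is at most as large and at least as fast.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """Hypothetical accelerator configuration (all values are illustrative)."""
    name: str
    area_mm2: float   # silicon area; smaller is better
    gflops: float     # modeled performance; larger is better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Return the designs not dominated in (area, performance).

    Sort by increasing area (ties: faster first), then keep a design only
    if it is strictly faster than every design already kept. Exact
    duplicates in both metrics are collapsed to one representative.
    """
    ordered = sorted(points, key=lambda p: (p.area_mm2, -p.gflops))
    front: List[DesignPoint] = []
    best_perf = float("-inf")
    for p in ordered:
        if p.gflops > best_perf:
            front.append(p)
            best_perf = p.gflops
    return front


# Made-up candidates, NOT results from the paper.
candidates = [
    DesignPoint("A", 300.0, 1000.0),
    DesignPoint("B", 350.0, 1200.0),
    DesignPoint("C", 400.0, 1100.0),  # dominated by B: larger and slower
    DesignPoint("D", 450.0, 1500.0),
    DesignPoint("E", 300.0, 900.0),   # dominated by A: same area, slower
]

front = pareto_front(candidates)
front_names = [p.name for p in front]
```

In this toy setting `front_names` contains A, B, and D. In a codesign flow of the kind the abstract describes, each candidate would instead come from solving the optimization problem under a given area budget.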

Tobias Grosser, Sanjay Rajopadhye, Hristo Djidjev, Nirmal Prajapati, Rumen Andonov, Nandkishore Santhi

Computing technology, computer technology

Tobias Grosser, Sanjay Rajopadhye, Hristo Djidjev, Nirmal Prajapati, Rumen Andonov, Nandkishore Santhi. Accelerator Codesign as Non-Linear Optimization [EB/OL]. (2017-12-13) [2025-08-23]. https://arxiv.org/abs/1712.04892.
