Steuwer, Michel; Haidl, Michael; Breuer, Stefan; Gorlatch, Sergei
Research article (journal) | Peer reviewed
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually tuned code written with low-level approaches such as OpenCL and CUDA. This makes the development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach, which combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard with three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons that specifically target stencil computations, MapOverlap and Stencil; we describe their use in particular application examples, discuss their efficient parallel implementation, and report experimental results on systems with multiple GPUs. Our evaluation of three real-world applications shows that stencil code written with SkelCL is considerably shorter than hand-tuned OpenCL code and offers competitive performance.
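To make the term concrete: in a stencil computation, each output element is computed from the input element at the same position together with a fixed neighbourhood around it. The following is a minimal sequential sketch in plain C++ of that idea for a one-dimensional vector. It is not SkelCL code and not the authors' implementation; the names `map_overlap_1d` and `sum3`, the overlap range `d`, and the choice of a neutral value for out-of-bounds reads are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Accessor handed to the user function: get(offset) reads the input element
// at (current position + offset).
using Neighbourhood = std::function<float(int)>;

// Hypothetical helper (NOT part of SkelCL): sequential reference semantics of
// a 1D stencil. For every position i, the user function f is called with an
// accessor covering the offsets -d..d around i. Reads that fall outside the
// input yield the given neutral value (an assumed boundary policy).
std::vector<float> map_overlap_1d(const std::vector<float>& in,
                                  int d,
                                  float neutral,
                                  const std::function<float(const Neighbourhood&)>& f)
{
    assert(d >= 0);
    const long n = static_cast<long>(in.size());
    std::vector<float> out(in.size());

    for (long i = 0; i < n; ++i) {
        Neighbourhood get = [&](int offset) -> float {
            assert(offset >= -d && offset <= d);  // stay inside the declared overlap
            const long j = i + offset;
            return (j < 0 || j >= n) ? neutral
                                     : in[static_cast<std::size_t>(j)];
        };
        out[static_cast<std::size_t>(i)] = f(get);
    }
    return out;
}

// Example user function: sum of the three-point neighbourhood.
float sum3(const Neighbourhood& get)
{
    return get(-1) + get(0) + get(1);
}
```

Under these assumptions, calling `map_overlap_1d({1, 2, 3, 4}, 1, 0.0f, sum3)` yields `{3, 6, 9, 7}`: the two border elements each see one neutral value in place of the missing neighbour. Every iteration of the outer loop is independent of the others, which is what makes this class of computation a natural fit for GPUs.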
Gorlatch, Sergei | Professorship of Practical Computer Science (Prof. Gorlatch)
Haidl, Michael | Professorship of Practical Computer Science (Prof. Gorlatch)
Steuwer, Michel | Institute of Computer Science