Experts of the POP CoE contributed heavily to the tutorial program of the ISC High Performance 2021 conference, which was held virtually.
Judit Gimenez (BSC) and Brian Wylie (JSC) presented the tutorial “Determining Parallel Application Execution Efficiency and Scaling using the POP Methodology”. The methodology was developed and applied over several years within the POP CoE. Its focus is the hierarchy of execution efficiency and scaling metrics that identify the most critical issues and quantify the potential benefits of remedies. The metrics can be readily compared and determined by a variety of tools for applications in any language employing standard MPI, OpenMP, and other multi-threading and offload paradigms. Widely deployed open-source tools were used to demonstrate this process with provided performance measurements of actual HPC application executions (ranging from CFD to neuroscience), which allowed tutorial participants to repeat the analysis on their own computers and prepared them to locate and diagnose efficiency and scalability issues in their own parallel application codes.
Bernd Mohr (JSC) gave an “Introduction to HPC: Applications, Systems, and Programming Models”. In this introductory tutorial, attendees learned what “high performance computing” means and what differentiates it from more mainstream areas of computing. The tutorial also introduced major applications that use high performance computing for research and commercial purposes, and explained how AI and HPC interact with each other. It then presented the major HPC system architectures needed to run these applications. Finally, it provided an overview of the languages and paradigms used to program HPC applications and systems.
The topic “Hands-on Practical Hybrid Parallel Application Performance Engineering” was introduced by Markus Geimer and Christian Feld (JSC) together with colleagues from TU Dresden and the University of Oregon. They presented state-of-the-art performance tools for leading-edge HPC systems founded on the community-developed Score-P instrumentation and measurement infrastructure, demonstrating how they can be used for effective performance engineering of scientific applications based on standard MPI, OpenMP, hybrid combinations of both, and the increasingly common usage of accelerators. Parallel performance tools from the Virtual Institute - High Productivity Supercomputing (VI-HPS) were introduced and featured in hands-on exercises with Score-P, Scalasca, Vampir, and TAU. The tutorial presented the entire workflow of performance engineering, including instrumentation, measurement (profiling and tracing, timing and PAPI hardware counters), data storage, analysis, tuning, and visualization. Emphasis was placed on how the tools are used in combination to identify performance problems and investigate optimization alternatives. Participants could conduct exercises in an AWS-provided, containerized E4S [https://e4s.io] environment containing all the necessary tools. This helped to prepare participants to locate and diagnose performance bottlenecks in their own parallel programs.
Finally, Xavier Teruel (BSC) and Christian Terboven (RWTH Aachen) together with colleagues from LLNL and AMD explained “Mastering Tasking with OpenMP”. Since version 3.0, released in 2008, OpenMP has offered tasking to support the creation of composable parallel software blocks and the parallelization of irregular algorithms. Developers usually find OpenMP easy to learn. However, mastering the tasking concept of OpenMP requires a change in the way developers reason about the structure of their code and how to expose its parallelism. This tutorial addressed this critical aspect by examining the tasking concept in detail and presenting patterns as solutions to many common problems. It presented the OpenMP tasking language features in detail and focused on performance aspects, such as introducing cut-off mechanisms, exploiting task dependencies, and preserving locality. All aspects were accompanied by extensive case studies.