Most contemporary shared-memory systems expose a non-uniform memory access (NUMA) architecture, which has implications for application performance. However, the OpenMP programming model does not provide explicit support for NUMA. This 30-minute live webinar discussed approaches to getting the best performance from OpenMP applications on such machines. The webinar covered the characteristics of cc-NUMA architectures, the OpenMP thread affinity model, and the operating system mechanisms for memory placement. It then explained how to use this understanding to optimize performance. The talk included practical examples showing best practices.
The presentation slides are also available here.
About the Presenter
Dr Christian Terboven is a Senior Scientist and HPC Group Manager at RWTH Aachen University. His research interests center on parallel programming and related software engineering aspects. Dr Terboven has been involved in the analysis, tuning and parallelization of several large-scale simulation codes for various architectures. He is a member of the OpenMP Language Committee and leads the Affinity subcommittee. He is responsible for several research projects on programming models and on approaches to improving the productivity and efficiency of modern HPC systems. He also currently works on the EU Performance Optimisation and Productivity (POP) project.