Training on POP performance analysis, methodology and tools by and for women in HPC | Performance Optimisation and Productivity

Date

Tuesday, May 17th to Thursday, May 19th, 2022

Location

The workshop will be held online, using the Zoom video conference platform.

Target group

This workshop is aimed at students, researchers and professionals who want to acquire the skills to analyze the performance of their own codes.

It is primarily aimed at women and underrepresented groups in the HPC community and it will be taught and supported by an all-female team.

Some basic knowledge of HPC environments, MPI and/or OpenMP is required to follow the course.

The workshop will be held in English.

Registration

Register here!

The number of participants is limited. Early registration opens on April 7th and runs until Monday, May 9th, for women and underrepresented groups in the HPC community only.

If places remain, registration will be open to everyone from Monday, May 9th to Thursday, May 12th.

Once registered, attendees will receive a Zoom link to connect to the training and instructions for obtaining an account on the cluster.

Organising Institutions

Goals

This workshop, organised by the POP CoE and VI-HPS, will:

Give an overview of the POP CoE methodology
Explain the functionality of POP performance tools, and how to use them effectively
Offer hands-on experience and expert assistance using the tools on your own application or on provided examples and benchmarks

On completion, participants will be familiar with the fundamentals of HPC performance analysis and will be able to use the POP performance analysis methodology and tools to better understand the performance of their code. Those who prepared their own application test cases will have been coached in tuning their measurement and analysis, and provided with optimization suggestions.

Requirements

Zoom
SSH client (to connect to HPC systems)
X Server (enabling remote visual tools)
Participants are encouraged to prepare their own MPI, OpenMP and hybrid MPI+OpenMP parallel application codes for analysis.

Programme Overview

The workshop is organized over three days, with lectures and demos in the morning and hands-on sessions in the afternoon. It will run from 09:00 to 16:00 CEST each day, with breaks.

Day 1: Introduction to HPC Performance Analysis and BSC tools (17th May)

Introduction to HPC performance analysis, POP, and VI-HPS
Training on BSC Tools (Extrae, Paraver, Dimemas, BasicAnalysis)
(lunch break)
Hands-on session: BSC tools

Day 2: Intro to JSC tools and The POP methodology (18th May)

The POP methodology
Training on JSC Tools (Score-P, Scalasca) and Vampir
(lunch break)
Hands-on session: JSC tools

Day 3: PAPI & Darshan (19th May)

Hardware Performance counters with PAPI
I/O analysis with Darshan
(lunch break)
Hands-on session: PAPI and Darshan

Training team

The workshop will be taught and supported by an all-female team of trainers, all HPC experts from the POP CoE:

Marta Garcia-Gasulla (Barcelona Supercomputing Center)
Judit Gimenez (Barcelona Supercomputing Center)
Sandra Mendez (Barcelona Supercomputing Center)
Anara Kozhokanova (RWTH Aachen University)
Radita Liem (RWTH Aachen University)
Anke Visser (Jülich Supercomputing Centre)
Christina Muehlbach (TU Dresden)

Hardware and Software Platforms

JUSUF: x86 Linux modular cluster system:

Cluster: 144 compute nodes each with dual AMD EPYC 7742 'Rome' processors (2.25GHz, 64 cores per processor) and 256 GB RAM, Mellanox HDR100 InfiniBand network
parallel filesystem: GPFS (SCRATCH & WORK)
software: Rocky Linux 8; ParaStation & Intel MPI; Intel & GCC compilers; SLURM batch system
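
For participants planning hybrid MPI+OpenMP runs, the node figures above can be turned into a quick sizing estimate. The following is a minimal sketch, not part of the official course material: it counts physical cores only (hardware threads are ignored), and the function name `hybrid_layouts` is purely illustrative.

```python
# Figures taken from the JUSUF cluster description above.
NODES = 144
SOCKETS_PER_NODE = 2       # dual-processor nodes
CORES_PER_SOCKET = 64
RAM_PER_NODE_GB = 256

cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_cores = NODES * cores_per_node
ram_per_core_gb = RAM_PER_NODE_GB / cores_per_node


def hybrid_layouts(cores):
    """Return (MPI ranks per node, OpenMP threads per rank) pairs
    that occupy every physical core of a node exactly once."""
    return [(ranks, cores // ranks)
            for ranks in range(1, cores + 1)
            if cores % ranks == 0]


if __name__ == "__main__":
    print(f"cores per node:  {cores_per_node}")
    print(f"total cores:     {total_cores}")
    print(f"RAM per core:    {ram_per_core_gb} GB")
    for ranks, threads in hybrid_layouts(cores_per_node):
        print(f"{ranks:4d} ranks x {threads:4d} threads")
```

Which layout performs best depends on the application; comparing a few of them is a natural experiment for the hands-on sessions.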

The JSC HPC system JUSUF is the primary platform for the workshop and will be used for the hands-on exercises. Course accounts will be provided during the workshop to participants without existing accounts. Other systems where up-to-date versions of the tools are installed can also be used if preferred, though support may be limited and participants are expected to already have user accounts on those systems. Whichever system they intend to use, participants should be familiar with the relevant procedures for compiling and running their parallel applications (via batch queues where appropriate).

[Note] GPU nodes are not provided for this training. All aspects of code performance other than GPU performance can still be analyzed.