POP Newsletter 15 - Issue June 2020

Welcome to the 15th newsletter from the EU POP Centre of Excellence. In this edition, we introduce our POPCasts, a series of interviews which provide an insight into the POP service. We also present four technical blogs, one of which discusses the latest release of the Darshan I/O profiling tool, which is now able to profile parallel HDF5.

If you would like to contribute technical content for this newsletter on the topic of parallel performance profiling, please email us at pop-helpdesk@bsc.es.

This issue includes:

  • POP Webinar - Inclusive Leadership and Inspiring Action and Innovation on Wednesday 8 July 2020, 2:00 PM BST | 3:00 PM CEST;
  • POPCasts – interviews providing insights into various aspects of the POP service:
    • Craig Lucas - POP Business Development Manager and Performance Engineering Manager at NAG;
    • Ania Brown – Research Software Engineer, Oxford University;
    • Phil Tooley – HPC Application Analyst, NAG.
  • Technical Blogs:
    • Using the Python Extrae API to profile a region of a code;
    • Cost Efficient Cloud HPC – Using NUMA Efficiently;
    • Darshan HDF5 Profiling;
    • Instantaneous Parallelism in Paraver.
  • POP Out and About – meet POP members face to face at the following events:
    • Teratec 2020 Forum. Ecole Polytechnique, Paris, France. October 13 - 14, 2020;
    • NAFEMS 2020 UK Conference, Milton Keynes, UK. November 9 - 10, 2020.
  • The POP Helpdesk.

For past editions of the newsletter, see the POP newsletter web page.

POP Webinar - Inclusive Leadership and Inspiring Action and Innovation

Wednesday 8 July 2020, 2:00 PM – 2:40 PM BST | 3:00 PM – 3:40 PM CEST

Is 2020 your year for improving equity, diversity and inclusion? Or the year that you hope your research makes its big breakthrough? Are you hoping to get more out of your peers, colleagues and team? Diversity is now a buzzword that gets attention wherever you go. But actually embarking on a programme to hire and retain diverse talent is not as easy as it first seems! In this talk I will discuss why equity, diversity and inclusion are fully compatible with achieving success for your team, innovation and research.

I will share my experiences of setting up an internationally recognised movement addressing inclusion, Women in High Performance Computing (WHPC), including what I wish I had done differently. I will finish with a session on inclusive leadership, and how being an exceptional, inclusive leader can inspire research, innovation and the careers of those around you.

About the presenter

Dr Toni Collis is the CEO of Collis-Holmes Innovations and Chair of Women in High Performance Computing (WHPC). Toni is a Strategic Leader, Trainer, Consultant and Leadership Coach. With a background in Physics, Toni’s professional career has focused on facilitating the use of parallel computing and supercomputers for the advancement of research and innovation in both academia and industry. Early on in her career, Toni realised that knowledge was not the only barrier to the uptake of parallel computing in research, but that culture limited the participation of women and minorities. As Chair and Co-Founder of WHPC, Toni developed and led innovations aiming to diversify the HPC workforce, providing HPC tutorials for women academics and students around the world, training for inclusive workforces and research into how to improve the representation of women. In early 2019, Toni focused on her passion for broadening diversity & inclusion in the technology industry and now offers Strategy, Coaching, Training and Consultancy for Women Leaders and their allies, with a personal goal of assisting 2000 women into leadership in tech in the next 5 years.

Click on this link for registration.

POPCasts – Conversations about POP

The POP Team have been busy working on a series of interviews, or POPCasts. The interviews shine a light on what we do, clarify requirements and solutions from a user’s perspective, and offer insights into team roles. View the POPCasts on the POP YouTube Channel here.

Craig Lucas - POP Business Development Manager, NAG

In the first of the series, Craig Lucas, POP Business Development Manager and Performance Engineering Manager at NAG, chats to Fouzhan Hosseini, HPC Application Analyst. Craig provides an overview of the project and the objectives set by the POP Centre of Excellence in promoting best practice in parallel programming. Craig details the methodology and tools used to help users improve their software.

Click here to watch the POPCast.

Ania Brown – Research Software Engineer, Oxford University

The second POPCast provides an interesting insight into a POP user’s perspective. Fouzhan chats to Ania Brown, Research Software Engineer at Oxford e-Research Centre. Ania has been working on the GS2 code for simulating turbulence in plasma. There was an urgent need to improve the code’s scaling so that larger experimental systems could be simulated than was previously possible. Ania discusses the process by which POP assisted, from the initial reporting and the training provided for the tools through to the final successful delivery.

Click here to watch the POPCast.

Phil Tooley – HPC Application Analyst, NAG

In the third POPCast, Phil Tooley, HPC Application Analyst, chats to Fouzhan about his varied role within the POP project and how he assists academic and industrial users on a huge variety of codes from all domains. He provides an overview of the new POP methodology and the metrics implemented to provide efficient performance assessments for users. A very varied and exciting role indeed!

Click here to watch the POPCast.

Technical Blogs

Using the Python Extrae API to Profile a Region of a Code

The Extrae profiling tool, developed by BSC, can quickly produce very large trace files, which can take several minutes to load into Paraver, the tool used to view the traces. These trace files can be kept to a more manageable size by using Extrae’s API to turn the tracing on and off as needed. For example, the user might only want to record data for two or three time steps. This API was previously only available for codes developed in C, C++ and Fortran, but it now also supports Python codes using MPI.

This article describes how to use the Extrae Python API.
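
The idea of tracing only a window of time steps can be sketched as follows. This is a minimal illustration, not the Extrae API: the `Tracer` class, its `on`, `off` and `record` methods, and the `run_simulation` function are all hypothetical stand-ins, and in a real code they would be replaced by the calls described in the article.

```python
class Tracer:
    """Hypothetical stand-in for a tracing library. NOT the Extrae API."""

    def __init__(self):
        self.enabled = False
        self.recorded_steps = []

    def on(self):
        # Start recording trace data.
        self.enabled = True

    def off(self):
        # Stop recording trace data.
        self.enabled = False

    def record(self, step):
        # Only steps executed while tracing is enabled reach the trace.
        if self.enabled:
            self.recorded_steps.append(step)


def run_simulation(tracer, n_steps, trace_first, trace_last):
    """Run n_steps time steps, tracing only trace_first..trace_last."""
    for step in range(n_steps):
        if step == trace_first:
            tracer.on()
        # ... the real time-step computation would go here ...
        tracer.record(step)
        if step == trace_last:
            tracer.off()


tracer = Tracer()
run_simulation(tracer, n_steps=100, trace_first=10, trace_last=12)
print(tracer.recorded_steps)
```

Here only three of the 100 time steps are recorded, which is what keeps the resulting trace small.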

Cost Efficient Cloud HPC – Using NUMA Efficiently

The emergence of cloud computing has revolutionized hi-tech business, but as technical power and complexity grow, so do the risks: a single misconfiguration can make efficiency plummet and costs soar. Correct NUMA (non-uniform memory access) layout is key to efficient application performance on large cloud servers – if you are doing HPC in the cloud you need to know about NUMA.

Read the article here on the importance of NUMA.

Darshan HDF5 Profiling

Understanding the I/O behaviour of HPC applications is critical to ensuring their efficient use of storage system resources. However, this is a challenging task given the growing depth and complexity of the I/O stack on these systems, where multiple software layers often coordinate to produce optimized I/O workloads for the underlying storage hardware. Darshan is a lightweight I/O characterization tool that helps users navigate this complex landscape by producing condensed summaries of an application’s I/O behaviour. The Darshan tool can now profile the HDF5 layer, giving application developers more insight into how efficiently their application is using the HDF5 I/O library.

Click here to read about Darshan and its new HDF5 profiling feature.

Instantaneous Parallelism in Paraver

When viewing a trace file in Paraver’s timeline view, it can be useful to determine the instantaneous level of parallelism, i.e. how many MPI processes are doing computation at a given instant. If an MPI process is not doing computation, then it could be doing communication or I/O, or be in an idle or waiting state. Seeing how this number changes over time helps identify where in the timeline the application is experiencing poor load balance.

Click here for details of how to view the instantaneous parallelism feature in Paraver.
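
The quantity itself can be sketched outside Paraver in a few lines of Python. This is an illustration only, not how Paraver computes it: the `trace` layout (one list of `(start, end)` computation intervals per MPI rank) and the function names are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed data layout: for each MPI rank, a list of (start, end) times
# during which that rank is doing computation.
Interval = Tuple[float, float]


def instantaneous_parallelism(compute_intervals: List[List[Interval]], t: float) -> int:
    """Count the ranks that are inside a computation interval at time t."""
    return sum(
        any(start <= t < end for start, end in rank_intervals)
        for rank_intervals in compute_intervals
    )


def parallelism_profile(compute_intervals, t_start, t_end, samples):
    """Sample the instantaneous parallelism at evenly spaced times."""
    step = (t_end - t_start) / samples
    times = [t_start + i * step for i in range(samples)]
    return [(t, instantaneous_parallelism(compute_intervals, t)) for t in times]


# Toy trace with three ranks: all start computing together, but ranks 1
# and 2 finish early and then wait for rank 0.
trace = [
    [(0.0, 4.0), (5.0, 10.0)],  # rank 0
    [(0.0, 2.0), (5.0, 7.0)],   # rank 1
    [(0.0, 1.0), (5.0, 6.0)],   # rank 2
]

profile = parallelism_profile(trace, 0.0, 10.0, 10)
for t, n in profile:
    print(f"t={t:4.1f}  ranks computing: {n}")
```

In this toy trace the count drops from 3 to 1 well before each phase ends, which is the signature of poor load balance: most ranks are waiting while one is still computing.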

POP Out and About – Meet POP Members Face to Face at the Following Events

POP will be attending the following events. If you would like to meet a member of the POP team, please email pop-helpdesk@bsc.es and we will happily arrange a meeting with you.

TERATEC 2020 Forum | 13 – 14 October 2020

The TERATEC Forum is a major event in France that brings together international experts in HPC, simulation and big data. It welcomes more than 1,300 attendees, highlighting the technological and industrial dynamism of HPC and the essential role that France plays in this field.

For more information on the conference, please click here.

NAFEMS 2020 | 9 – 10 November 2020

The 2020 NAFEMS UK Conference will be covering topics ranging from traditional FEA and CFD, to new and emerging areas including artificial intelligence and machine learning. NAFEMS will be bringing all those involved in analysis and simulation together, from every corner of industry and academia, giving attendees an opportunity to advance their knowledge.

POP will be an exhibitor at the conference and will be giving the talk “Parallel Engineering Codes: Performance Optimisation with POP Methodology”.

For more information on the conference, please click here.

If you feel that POP should be attending an event, please contact us at pop-helpdesk@bsc.es - suggestions are most welcome!

Apply for Free Help with Code Optimisation

We offer a range of free services, from profiling and code optimisation to training, designed to help EU organisations improve the performance of parallel software. If you are not getting the performance you need from parallel software or would like to review the performance of a parallel code, please apply for help via the short Service Request Form or email us to discuss the service further and how it can benefit you.

These services are funded by the European Union Horizon 2020 research and innovation programme so there is no direct cost to our users.

The POP Helpdesk

Past and present POP users are eligible to use our email helpdesk (pop-helpdesk@bsc.es). Please contact our team of experts for help analysing code changes, to discuss your next steps, and to ask questions about your parallel performance optimisation.