On September 19th and 20th, I had the chance to attend the 15th International Parallel Tools Workshop (IPTW) in Dresden. While the workshop is not actually a POP activity, POP partners have a long history of presenting their latest research on HPC tools at this venue. After a two-year break due to a certain pandemic, the organizers and the community put together an exceptionally rich programme.
If I had to pick only three of the presentations, my subjective and heavily biased reading list would include work on determining the critical path through a task-based application, efforts on leveraging Jupyter notebooks for including application profiling in teaching, and a user case-study on chasing down communication issues in a large application.
But this is a POP blog, so how are these relevant for POP?
Since day one, POP has been aiming to define a set of performance metrics. Over the years we have come up with different sets for MPI and hybrid MPI+OpenMP, which are based on application tracing. This method is being challenged by current developments in HPC, such as tasking or GPU offloading. A series of works has suggested propagating Lamport-clock-like counters for useful computation through the application at runtime, with the aim of determining the critical path of a given run. Now, the paper "OTF-CPT: Application Insights Gained from Real-time Critical Path Analysis" extends this idea to OpenMP tasks and propagates the clock through chains of interdependent tasks. This immediately sheds light on the question of how dependencies between tasks actually limit performance ... and nicely opens up another level in our hierarchy of POP metrics.
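To make the clock idea a bit more concrete, here is a minimal sketch of how such a counter could be propagated along task dependencies. It is only an illustration under my own assumptions, not the OTF-CPT implementation: it works offline on a small, made-up task graph instead of at runtime, and the names `critical_path_length`, `durations` and `deps` are hypothetical. Each task takes the largest clock value among its predecessors and adds its own useful computation time, so the largest clock at the end is the length of the critical path.

```python
from graphlib import TopologicalSorter


def critical_path_length(durations, deps):
    """Propagate a Lamport-clock-like counter of useful computation.

    durations: task name -> useful computation time of that task
    deps:      task name -> list of tasks it depends on (predecessors)
    Returns (length of the critical path, clock value per task).
    """
    # Make sure every task is in the graph, even those without dependencies.
    graph = {task: deps.get(task, ()) for task in durations}
    clock = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        # Take over the largest clock among the predecessors (0 if none) ...
        incoming = max((clock[p] for p in graph[task]), default=0.0)
        # ... and advance it by this task's own useful computation.
        clock[task] = incoming + durations[task]
    return max(clock.values(), default=0.0), clock


# Made-up example: "c" waits for "a" and "b", "d" waits only for "a".
durations = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}
deps = {"c": ["a", "b"], "d": ["a"]}

length, clock = critical_path_length(durations, deps)
total_work = sum(durations.values())
print("critical path:", length, "total work:", total_work)
```

In this toy graph, comparing the total work with the critical path length hints at how much parallelism the dependencies leave on the table, which is the kind of question a task-level metric would try to answer.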
Honestly, I did not expect to hear about JupyterLab at this venue. So the talk on "Integrating performance analysis in JupyterLab for OpenMP and MPI" really piqued my interest. In my daily work with HPC users, one of the biggest challenges is to find a common language to talk about performance issues at all. Naturally, most domain scientists do not have any training in performance metrics and methodology. I am therefore very grateful that educators (not us!) are incorporating performance analysis into their teaching programmes at undergraduate level and, here, even in adult education.
And finally, it was a pleasure to watch a domain scientist and a tools guy present their joint endeavour, "Exploring Multi-threaded Communication Behavior of a Large-Scale CFD Solver with Vampir", on the same stage. After all, this fruitful collaboration between communities is what we are striving for.
For me, this was the most enjoyable meeting this year!
-- José Gracia (HLRS)