POP expert Michael Knobloch presented an overview of tool support for programming GPUs in the Modular Supercomputing Architecture (MSA) Seminar at Jülich Supercomputing Centre (JSC) on Dec 3, 2019. The presentation targeted JSC staff and users and therefore concentrated on the Nvidia GPUs used in JSC's Intel and AMD clusters, but because this architecture is widely used, the information presented should be useful to many developers.
Following the saying "Make it work, make it right, make it fast!" by Kent Beck, the presentation first covered Integrated Development Environments (IDEs), then debuggers, and finally performance tools. The following two slides summarize the GPU support of debuggers and performance tools, depending on the programming system used for the GPU parallelisation:
"Indirect support via CUDA (Nvidia only)" means that the tool does not support OpenACC directly, but on Nvidia GPUs it uses CUDA to collect and show information about OpenACC kernels. "Prototype with non-public OMP runtime" means that the OMPD/OMPT support of these tools currently exists only as a prototype and is not publicly available, as there are no official OpenMP runtimes yet that support these interfaces.
The slides for the full presentation are available here.