BAND is part of SCM’s renowned ADF Modeling Suite, a set of powerful tools used by academic and industrial research chemists. BAND is written in Fortran with MPI parallelisation. SCM wanted to know whether POP could help improve its parallel performance.
After a POP Audit and two Performance Plans had analysed various components of BAND, a POP Proof of Concept focussed on improving the performance of complex matrix multiplications. The earlier work had determined that, for multiplication of small matrices, parallel scaling was limited by two factors: underperforming BLAS/PBLAS routines and a large percentage of run time spent in MPI data transfer.
The Proof of Concept identified and implemented a range of improvements: overlapping computation with communication, making better use of BLAS (which doubled the speed of computation), and reorganising the algorithm to reduce the amount of data communicated via MPI. On eight 36-core compute nodes, the optimised subroutine ran four times faster than the original code.
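As an illustration of the first of these improvements, the sketch below shows the general idea of overlapping computation with communication. It is not the BAND code, which is Fortran with MPI. It is a minimal Python analogue in which a background thread stands in for a data transfer, and all names (`fetch_block`, `multiply_overlapped`, `remote_blocks`) are hypothetical. In an MPI program the same pattern would typically rely on non-blocking calls such as `MPI_Irecv` followed by `MPI_Wait`, though the source does not say which calls were used.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def fetch_block(blocks, i):
    """Hypothetical stand-in for receiving block i from another process."""
    return blocks[i]


def multiply_overlapped(a, remote_blocks):
    """Multiply a by each block, fetching block i+1 while block i is used."""
    results = []
    if not remote_blocks:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the first transfer before any computation begins.
        pending = pool.submit(fetch_block, remote_blocks, 0)
        for i in range(len(remote_blocks)):
            block = pending.result()  # wait for the transfer in flight
            if i + 1 < len(remote_blocks):
                # Start the next transfer, then compute while it proceeds.
                pending = pool.submit(fetch_block, remote_blocks, i + 1)
            results.append(matmul(a, block))
    return results


# Small complex-valued example (illustrative data only).
a = [[1 + 1j, 0], [0, 2]]
remote_blocks = [[[1, 0], [0, 1]], [[0, 1j], [1, 0]]]
products = multiply_overlapped(a, remote_blocks)
```

The result is identical to multiplying the blocks one after another. Only the ordering of transfer and computation changes, so that time spent waiting for data is hidden behind useful work.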