Rheologic is an Austrian company providing CFD (Computational Fluid Dynamics) services and solutions for complex flows.
Rheologic develops new solvers for the OpenFOAM framework, in this case uhiSolver (Urban Heat Island Solver). This program forecasts local conditions (e.g. thermal comfort) during the hottest days of summer in densely built urban areas, including the evaporative cooling effects of plants and water surfaces.
uhiSolver calculates and models air flow with day/night cycles, sun movement across the sky (including direct and diffuse radiation as well as reflections), the albedos of different surfaces, buoyancy effects in the air flow, and evaporative cooling. It is written in C++ and parallelised using MPI.
We found that performance was already very good, with super-linear speedup. However, using the POP methodology, we identified room to improve the load balance of the application and boost performance further.
The load imbalance was found to be due to varying Instructions per Cycle (IPC) across the ranks during useful computation; that is, some of the processes were completing their work at a faster rate than others. This results in idle time for the quicker processes while they wait for the others to catch up.
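The link between uneven IPC and idle time can be illustrated with a minimal Python sketch. It assumes every rank executes the same number of instructions at the same clock frequency, so only IPC differs, and it computes load balance efficiency as the average useful time divided by the maximum useful time. All numbers below are hypothetical and are not measurements from uhiSolver.

```python
def useful_times(instructions, ipc, frequency_hz):
    """Useful computation time per rank: instructions / (IPC * clock frequency)."""
    return [n / (i * frequency_hz) for n, i in zip(instructions, ipc)]


def load_balance_efficiency(times):
    """Average useful time divided by maximum useful time (1.0 means perfect balance)."""
    return (sum(times) / len(times)) / max(times)


def idle_times(times):
    """Time each rank spends waiting for the slowest rank to finish."""
    slowest = max(times)
    return [slowest - t for t in times]


# Hypothetical values: four ranks with identical work but different IPC.
instructions = [1e9, 1e9, 1e9, 1e9]
ipc = [1.0, 0.8, 1.0, 0.5]
frequency_hz = 2e9

times = useful_times(instructions, ipc, frequency_hz)
efficiency = load_balance_efficiency(times)
waits = idle_times(times)

print("useful time per rank (s):", times)
print("load balance efficiency:", efficiency)
print("idle time per rank (s):", waits)
```

In this toy setup the rank with the lowest IPC determines the step time, and every other rank idles until it finishes, even though the work is evenly distributed in terms of instructions.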
The slowdown on some processes was due to higher cache miss rates, which meant more time spent fetching data. To improve cache usage, we suggested improving the temporal and spatial locality of the data. In this case, the decomposition of the model into cells for parallelisation during initialisation left the data within a cell located far apart in memory. The solution was to renumber the mesh after the decomposition so that data that is close in memory is also used close together in time, which makes better use of the cache and avoids long delays fetching data.
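The idea behind renumbering can be sketched on a toy mesh. The source does not say which renumbering algorithm was applied, so the example below uses a plain Cuthill-McKee-style breadth-first ordering as an assumption. The cell adjacency is a hypothetical chain of six cells with scrambled labels, and "bandwidth" (the largest index gap between neighbouring cells) serves as a rough proxy for how far apart neighbouring data sit in memory.

```python
from collections import deque


def bandwidth(adjacency, index_of):
    """Largest index gap between any two neighbouring cells."""
    return max(
        (abs(index_of[c] - index_of[n]) for c, nbrs in adjacency.items() for n in nbrs),
        default=0,
    )


def cuthill_mckee(adjacency):
    """Return an old-label -> new-index map from a breadth-first traversal
    that starts at low-degree cells and visits low-degree neighbours first."""
    def by_degree(cell):
        return (len(adjacency[cell]), cell)

    new_index = {}
    for start in sorted(adjacency, key=by_degree):
        if start in new_index:
            continue  # already numbered as part of an earlier component
        new_index[start] = len(new_index)
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for nbr in sorted(adjacency[cell], key=by_degree):
                if nbr not in new_index:
                    new_index[nbr] = len(new_index)
                    queue.append(nbr)
    return new_index


# Hypothetical toy mesh: a chain of cells whose physical order is 0-3-1-4-2-5,
# so physically adjacent cells have labels that are far apart.
adjacency = {0: [3], 3: [0, 1], 1: [3, 4], 4: [1, 2], 2: [4, 5], 5: [2]}

identity = {cell: cell for cell in adjacency}
new_index = cuthill_mckee(adjacency)

print("bandwidth before renumbering:", bandwidth(adjacency, identity))
print("bandwidth after renumbering:", bandwidth(adjacency, new_index))
```

After renumbering, neighbouring cells receive consecutive indices, so a sweep over the cells touches memory that is both nearby and recently used. A production mesh would need its field data permuted with the same map, which this sketch leaves out.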
Due to this improvement, the application showed a 25% reduction in time-to-solution on 128 cores.