GS2 is an initial-value gyrokinetic simulation code developed to study low-frequency turbulence in magnetized plasma. It is parallelised using MPI, with the simulation domain decomposed across tasks. Choosing the optimal domain decomposition is non-trivial, and is complicated by the different requirements of the linear and non-linear parts of the calculation. GS2 users currently choose a data layout and are guided towards processor counts that are efficient for linear calculations. These choices can, however, lead to data decompositions that are relatively inefficient for the non-linear calculations. We have analysed the performance impact of the data decompositions on the non-linear calculation and its associated communications. This analysis has helped us to optimise the decomposition algorithm by using unbalanced data layouts for the non-linear calculations whilst maintaining the existing decompositions for the linear calculations. The change has completely eliminated communications for parts of the non-linear simulation and improved performance by up to 15% for a representative simulation.
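The idea of trading load balance for locality can be illustrated with a minimal sketch. It is not GS2's decomposition algorithm: the function names, the one-dimensional index space and the notion of a fixed-size "unit" (a group of consecutive elements that one step of the calculation needs on a single process) are all assumptions made for illustration. A balanced split gives every process nearly the same number of elements but may cut units in two, which would force communication; an unbalanced split hands out whole units, so block sizes differ but no unit straddles a process boundary.

```python
def balanced_blocks(n_items, n_procs):
    """Split n_items as evenly as possible; block sizes differ by at most one."""
    base, extra = divmod(n_items, n_procs)
    return [base + 1 if rank < extra else base for rank in range(n_procs)]


def unbalanced_blocks(n_items, n_procs, unit):
    """Split n_items so that each process owns a whole number of units.

    `unit` is a hypothetical group size: the number of consecutive
    elements that must live on one process to avoid communication.
    """
    n_units = -(-n_items // unit)  # ceiling division
    units_per_proc = balanced_blocks(n_units, n_procs)
    sizes = []
    remaining = n_items
    for n_owned in units_per_proc:
        size = min(n_owned * unit, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


def misaligned_boundaries(sizes, unit):
    """Count interior process boundaries that fall inside a unit."""
    count = 0
    offset = 0
    for size in sizes[:-1]:
        offset += size
        if offset % unit != 0:
            count += 1
    return count


if __name__ == "__main__":
    n_items, n_procs, unit = 96, 5, 8  # illustrative values only
    even = balanced_blocks(n_items, n_procs)
    uneven = unbalanced_blocks(n_items, n_procs, unit)
    print("balanced  ", even, "split units:", misaligned_boundaries(even, unit))
    print("unbalanced", uneven, "split units:", misaligned_boundaries(uneven, unit))
```

With these illustrative values the balanced split cuts a unit at every interior boundary, while the unbalanced split cuts none, at the cost of the largest block growing from 20 to 24 elements. Whether that trade pays off depends on the relative cost of the extra computation and the avoided communication.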