Scalable communication for high-order stencil computations using
CUDA-aware MPI
Parallel Computing (Parallel Comput.), 2021
TEMPI: An Interposed MPI Library with a Canonical Representation of
CUDA-aware Datatypes
IEEE International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2020