Minjung Kim, Subhasish Mandal, et al.
Computer Physics Communications
The increase in memory capacity lags substantially behind the increase in computing power in today's supercomputers. To alleviate the effect of this gap, diverse options such as non-volatile memory (NVM; less expensive but slow) and high-bandwidth memory (HBM; fast but expensive) are being explored. In this paper, we present a common approach that uses parallel runtime techniques to utilize NVM and HBM as extensions of the existing memory hierarchy. We evaluate our approach using a matrix-matrix multiplication kernel implemented in CHARM++ and show that applications with memory requirements four times the HBM/DRAM capacity can be executed efficiently using significantly fewer total resources.
Xiang Ni, Scott Schneider, et al.
PPoPP 2019
Nikhil Jain, Abhinav Bhatele, et al.
IPDPS 2017
Kavitha Chandrasekar, Xiang Ni, et al.
IPDPSW 2017