Author: Lei Zhou
Title: Implementation of the Partitioned Global Address Space Model for Multi-GPU Systems
Abstract: The Partitioned Global Address Space (PGAS) approach is a promising programming model in high-performance parallel computing that combines the advantages of distributed-memory and shared-memory systems. The PGAS model has been used on a variety of hardware platforms through PGAS programming languages such as Unified Parallel C, Chapel and Fortress. However, despite its increasing adoption on distributed-memory and shared-memory systems, the PGAS model is still not well supported on accelerator platforms. To exploit the immense computational power of multi-GPU systems, this work proposes the design of a Partitioned Global Address Space model for multi-GPU systems. It addresses several issues that arise when combining the logically separate GPU memories of multiple graphics cards. Furthermore, it studies the execution model of modern GPU architectures and, based on that study, proposes a task creation mechanism that supports heterogeneous platforms and load balancing. The proposed model is implemented on multi-GPU systems as an extension of DASH, a C++ template library that realizes the PGAS concept via operator overloading. Experimental results suggest promising programmability and performance of the model and its implementation.
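
To make the phrase "realizes the PGAS concept via operator overloading" concrete, the following is a minimal host-only sketch of the general technique, not DASH's actual API and not the implementation described in this work. The names `PartitionedArray`, `GlobalRef`, `owner` and `local` are hypothetical, the partitions are plain `std::vector`s standing in for separate GPU memories, and a simple blocked distribution is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: these names are NOT part of DASH.
// Each "unit" owns one contiguous block of a logically global array.
class PartitionedArray {
public:
    // Proxy object returned by operator[]; it stands in for a global reference.
    class GlobalRef {
    public:
        GlobalRef(PartitionedArray& arr, std::size_t unit, std::size_t offset)
            : arr_(arr), unit_(unit), offset_(offset) {}

        // Read: implicit conversion fetches the value from the owning partition.
        operator int() const { return arr_.parts_[unit_][offset_]; }

        // Write: assignment stores the value into the owning partition.
        GlobalRef& operator=(int value) {
            arr_.parts_[unit_][offset_] = value;
            return *this;
        }

        // Index of the unit whose partition holds this element.
        std::size_t owner() const { return unit_; }

    private:
        PartitionedArray& arr_;
        std::size_t unit_;
        std::size_t offset_;
    };

    PartitionedArray(std::size_t units, std::size_t block_size)
        : block_size_(block_size),
          parts_(units, std::vector<int>(block_size, 0)) {}

    std::size_t size() const { return parts_.size() * block_size_; }

    // Blocked distribution (an assumption of this sketch):
    // global index -> (owning unit, local offset).
    GlobalRef operator[](std::size_t global_index) {
        assert(global_index < size());
        return GlobalRef(*this, global_index / block_size_,
                         global_index % block_size_);
    }

    // Read-only local view of one unit's partition.
    const std::vector<int>& local(std::size_t unit) const {
        return parts_.at(unit);
    }

private:
    std::size_t block_size_;
    std::vector<std::vector<int>> parts_;
};
```

With two units of four elements each, `a[5] = 42` is routed to unit 1 at local offset 1, while the calling code only ever sees a single global index. In a real multi-GPU setting the read and write operators of the proxy would have to issue device memory transfers instead of vector accesses; that part is deliberately left out of this sketch.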