Throughput Optimization and Resource Allocation on GPUs under Multi‐Application Execution

Srinivasa Reddy Punyala, Theodoros Marinakis, Arash Komaee and Iraklis Anagnostopoulos
Southern Illinois University, Carbondale, U.S.A.

ABSTRACT


Platform heterogeneity prevails as a solution to the throughput and computational challenges imposed by parallel applications and technology scaling. Specifically, Graphics Processing Units (GPUs) are based on the Single Instruction Multiple Thread (SIMT) paradigm and can offer tremendous speedup for parallel applications. However, GPUs were designed to execute a single application at a time. Under simultaneous multi-application execution, the GPU's massive multithreading causes applications to compete destructively for shared resources (caches and memory controllers), resulting in significant throughput degradation. In this paper, a methodology for minimizing interference in shared resources and providing efficient concurrent execution of multiple applications on GPUs is presented. In particular, the proposed methodology (i) performs application classification; (ii) analyzes the per-class interference; (iii) finds the best matching between classes; and (iv) employs an efficient resource allocation. Experimental results show that, for two concurrent applications, the proposed approach increases system throughput by an average of 36% compared to the default execution and by 10% compared to an exhaustive profile-based optimization technique.
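As a rough illustration of step (iii), the sketch below pairs applications so that the total estimated co-run slowdown is minimal. It is not the authors' algorithm: the class labels ("memory", "compute"), the slowdown values, and the function name best_pairing are hypothetical placeholders chosen only for this example, and the search is a plain exhaustive enumeration that assumes an even number of applications.

```python
# Minimal sketch, NOT the paper's method: exhaustive pairing of applications
# by class so that the summed (hypothetical) co-run slowdown is smallest.

def best_pairing(apps, slowdown):
    """apps: list of (name, class_label); slowdown: dict keyed by a sorted
    pair of class labels. Returns (total_cost, list_of_name_pairs).
    Assumes len(apps) is even."""
    if not apps:
        return 0.0, []
    first, rest = apps[0], apps[1:]
    best_cost, best_pairs = float("inf"), []
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        cost, pairs = best_pairing(remaining, slowdown)
        # Look up the assumed interference between the two classes.
        cost += slowdown[tuple(sorted((first[1], partner[1])))]
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(first[0], partner[0])] + pairs
    return best_cost, best_pairs


# Hypothetical inputs: class labels and slowdown factors are made up.
apps = [("a", "memory"), ("b", "memory"), ("c", "compute"), ("d", "compute")]
slowdown = {
    ("compute", "compute"): 1.1,
    ("compute", "memory"): 1.2,
    ("memory", "memory"): 1.9,
}
cost, pairs = best_pairing(apps, slowdown)
print(cost, pairs)
```

With these made-up numbers the search avoids co-running the two memory-class applications, which is the kind of decision a per-class interference analysis is meant to inform. Exhaustive enumeration is only practical for a handful of applications.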


