Variation-Aware Task Allocation and Scheduling for Improving Reliability of Real-Time MPSoCs

Junlong Zhou¹, Tongquan Wei², Mingsong Chen², X. Sharon Hu³, Yue Ma³, Gongxuan Zhang¹ and Jianming Yan²
¹Nanjing University of Science and Technology, Nanjing, China
²East China Normal University, Shanghai, China
³University of Notre Dame, Notre Dame, IN, USA

ABSTRACT


Soft-error reliability (SER), which is degraded by transient faults, and lifetime reliability (LTR), which is degraded by permanent faults, are both key concerns in real-time MPSoCs. Existing works have investigated related problems, but most of them focus on only one of the two reliability concerns. A few efforts do consider both types of reliability together, but they ignore the impact of hardware- and application-level variations on reliability and are therefore not applicable to state-of-the-art MPSoCs under variations. In this paper, we focus on increasing SER without sacrificing LTR, since transient faults occur much more frequently than permanent faults. Specifically, we propose a novel task allocation and scheduling scheme that maximizes SER while satisfying an LTR constraint for soft real-time MPSoCs. Because SER is the objective while LTR is a constraint in our problem, and LTR is closely tied to core temperature profiles, we concentrate on the effects that variations in core soft-error rate, task vulnerability to soft errors, and task execution time have on SER. To the best of our knowledge, our work is the first attempt to jointly handle the two reliability issues while also taking into account the effects of variations on reliability. Experimental results show that our scheme improves SER by up to 66% compared with a number of representative existing approaches while meeting the same LTR constraint.

Keywords: Soft‐error reliability, Lifetime reliability, Variations, Real‐time MPSoC systems.


