SparseNN: An Energy‐Efficient Neural Network Accelerator Exploiting Input and Output Sparsity

Jingyang Zhu, Jingbo Jiang, Xizi Chen and Chi-Ying Tsui
Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong
jzhuak@connect.ust.hk
jingbo.jiang@connect.ust.hk
xchenbn@connect.ust.hk
eetsui@ust.hk

ABSTRACT


Contemporary Deep Neural Networks (DNNs) contain millions of synaptic connections across tens to hundreds of layers. This large computational complexity poses a challenge to hardware design. In this work, we leverage the intrinsic activation sparsity of DNNs to substantially reduce execution cycles and energy consumption. An end-to-end training algorithm is proposed to develop a lightweight run-time predictor (less than 5% overhead) that estimates the output activation sparsity on the fly. Furthermore, an energy-efficient hardware architecture, SparseNN, is proposed to exploit both input and output sparsity. SparseNN is a scalable architecture with distributed memories and processing elements connected through a dedicated on-chip network. Compared with state-of-the-art accelerators that exploit only input sparsity, SparseNN achieves a 10%-70% improvement in throughput and a power reduction of around 50%.
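The abstract does not detail the form of the trained predictor, so the following Python sketch only illustrates the general idea of output-sparsity prediction for a ReLU layer: a cheap side computation guesses which outputs will be clamped to zero, and the exact dot products are evaluated only for the remaining outputs while also skipping zero-valued inputs. The low-rank factorization used as the predictor, the rank r, and the names (predict_and_compute, U, V, W) are illustrative assumptions, not the paper's end-to-end trained predictor.

```python
import numpy as np

def predict_and_compute(x, W, U, V):
    """Sketch: compute y = ReLU(W @ x), skipping outputs predicted to be zero.

    x : (n,)   input activations (often sparse after a previous ReLU)
    W : (m, n) full weight matrix of the layer
    U : (m, r), V : (r, n)  lightweight predictor with U @ V roughly equal to W
    """
    # Exploit input sparsity: only non-zero inputs contribute to the sums.
    nz = np.flatnonzero(x)

    # Cheap prediction pass, O(r*(|nz| + m)) instead of O(m*|nz|).
    score = U @ (V[:, nz] @ x[nz])

    # Outputs predicted non-positive are assumed to be zeroed by ReLU and skipped.
    keep = score > 0.0
    y = np.zeros(W.shape[0])
    y[keep] = np.maximum(W[np.ix_(keep, nz)] @ x[nz], 0.0)
    return y

# Toy usage: a 256-to-128 layer; an SVD truncation stands in for the
# learned predictor described in the paper (assumption, not the paper's method).
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 256))
U_full, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 8                                   # illustrative predictor rank
U, V = U_full[:, :r] * s[:r], Vt[:r]
x = np.maximum(rng.standard_normal(256), 0.0)   # ReLU-style sparse input
y = predict_and_compute(x, W, U, V)
```

In this sketch the prediction pass is the source of the accelerator-side overhead, which the paper reports as under 5% for its trained predictor; mispredictions trade a small accuracy loss for skipped multiply-accumulate work.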
