2.7 Analysis and optimization techniques for neural networks


Date: Tuesday, March 26, 2019
Time: 11:30 - 13:00
Location / Room: Room 7

Chair:
Sai Pudukot, George Mason University, US

Co-Chair:
Mohamed M. Sabry, NTU, SG

This session presents three papers with new approaches for characterizing neural network behavior on edge devices in order to optimize performance and energy consumption for the target application.

Time | Label | Presentation Title / Authors
11:30 | 2.7.1 | LOW-COMPLEXITY DYNAMIC CHANNEL SCALING OF NOISE-RESILIENT CNN FOR INTELLIGENT EDGE DEVICES
Speaker:
Younghoon Byun, Pohang University of Science and Technology (POSTECH), KR
Authors:
Younghoon Byun, Minho Ha, Jeonghun Kim, Sunggu Lee and Youngjoo Lee, Pohang University of Science and Technology (POSTECH), KR
Abstract
In this paper, we present a novel channel scaling scheme for convolutional neural networks (CNNs), which improves recognition accuracy on practically distorted images without increasing network complexity. During the training phase, the proposed work first prepares multiple filter sets under the same CNN architecture, taking into account different noise models and strengths. We then introduce an FFT-based noise classifier, which determines the noise property of the received input image by calculating partial sums of the frequency-domain values. Based on the detected noise class, we dynamically switch the filters of each CNN layer to provide dedicated recognition. Hence, the proposed noise-resilient CNN system always loads the same number of filter parameters as the original network while achieving attractive accuracy even for noisy inputs. Furthermore, to apply the proposed CNN to resource-constrained embedded edge devices that accept distorted raw images, we propose a channel scaling technique that reduces the number of active filter parameters when the input data is relatively clean. Experimental results show that the proposed dynamic channel scaling reduces both computational complexity and energy consumption while still providing acceptable accuracy for intelligent edge devices.
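The idea of an FFT-based noise classifier that decides on partial sums of frequency-domain energy can be sketched as follows. This is a minimal illustration, not the authors' implementation; the band boundaries, thresholds, and class names are assumptions.

```python
import numpy as np

def classify_noise(image, low_cut=0.1, high_cut=0.6):
    """Toy noise classifier in the spirit of an FFT-based scheme:
    compare partial sums of frequency-domain magnitude in a low band
    (near DC) and a high band (far from DC). Band boundaries and the
    decision rule are illustrative assumptions."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = spectrum.shape
    # Normalized distance of each frequency bin from the spectrum center.
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot((yy - h / 2) / (h / 2), (xx - w / 2) / (w / 2))
    total = spectrum.sum()
    high_ratio = spectrum[r > high_cut].sum() / total  # partial sum: high band
    low_ratio = spectrum[r < low_cut].sum() / total    # partial sum: low band
    if high_ratio > 0.3:
        return "impulsive"  # e.g. salt-and-pepper: strong high frequencies
    if low_ratio > 0.9:
        return "blur"       # energy concentrated near DC
    return "clean"
```

A detector like this is cheap relative to the CNN itself, which is what makes switching filter sets per input plausible on an edge device.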
12:00 | 2.7.2 | DATA LOCALITY OPTIMIZATION OF DEPTHWISE SEPARABLE CONVOLUTIONS FOR CNN INFERENCE ACCELERATORS
Speaker:
Hao-Ning Wu, National Tsing Hua University, TW
Authors:
Hao-Ning Wu and Chih-Tsun Huang, National Tsing Hua University, TW
Abstract
This paper presents a novel framework to maximize data reusability in depthwise separable convolutional layers with the Scan execution order of the tiled matrix multiplications. In addition, a fusion scheme across layers is proposed to minimize the data transfer of the intermediate activations, improving both the latency and the energy consumption of external memory accesses. The experimental results are validated against DRAMSim2 for accurate timing and energy estimation. With a 64K-entry on-chip buffer, our approach achieves a DRAM energy reduction of 67% on MobileNet V2.
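The data-reuse idea behind tiled execution can be illustrated with a toy tiled matrix multiply that counts tile fetches as a stand-in for DRAM traffic. This is a generic sketch, not the paper's Scan order or fusion scheme; the tile size is an arbitrary assumption.

```python
import numpy as np

def tiled_matmul(A, B, tile=4):
    """Tiled matrix multiply: each fetched tile of A and B is reused
    across a whole tile of outputs instead of being refetched per
    element, which is the basic locality argument for tiling."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    loads = 0  # number of tile fetches (proxy for external-memory traffic)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            acc = np.zeros((min(tile, n - i), min(tile, m - j)))
            for p in range(0, k, tile):
                a = A[i:i + tile, p:p + tile]  # one tile load of A
                b = B[p:p + tile, j:j + tile]  # one tile load of B
                loads += 2
                acc += a @ b
            C[i:i + tile, j:j + tile] = acc
    return C, loads
```

Changing the loop order over `i`, `j`, and `p` changes which tiles stay resident between iterations, which is the knob the paper's execution-order optimization turns.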
12:30 | 2.7.3 | A BINARY LEARNING FRAMEWORK FOR HYPERDIMENSIONAL COMPUTING
Speaker:
Mohsen Imani, University of California, San Diego, US
Authors:
Mohsen Imani1, John Messerly1, Fan Wu2, Wang Pi3 and Tajana Rosing1
1University of California San Diego, US; 2University of California Riverside, US; 3Peking University, CN
Abstract
Brain-inspired Hyperdimensional (HD) computing is a computing paradigm emulating neural activity in high-dimensional space. In practice, HD first encodes all data points to high-dimensional vectors, called hypervectors, and then performs the classification task efficiently using a well-defined set of operations. To provide acceptable classification accuracy, current HD computing algorithms need to map data points to hypervectors with non-binary elements. However, working with non-binary vectors significantly increases the HD computation cost and the memory requirements of both training and inference. This makes HD computing less desirable for embedded devices, which often have limited resources and battery capacity. In this paper, we propose BinHD, a novel binarization framework which enables HD computing to be trained and tested using binarized hypervectors. BinHD encodes data points to binarized hypervectors and provides a framework which enables HD to perform the training task with significantly lower resource usage and memory footprint. In inference, BinHD binarizes the model and simplifies the costly cosine similarity used in existing HD computing algorithms to a hardware-friendly Hamming distance metric. In addition, for the first time, BinHD introduces the concept of a learning rate in HD computing, which gives HD an extra knob to control training efficiency and accuracy. We accordingly design digital hardware to accelerate BinHD computation. Our evaluations on four practical classification applications show that BinHD in training (inference) can achieve 12.4× and 6.3× (13.8× and 9.9×) energy efficiency and speedup compared to the state-of-the-art HD computing algorithm while providing similar classification accuracy.
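The core inference simplification the abstract describes, replacing cosine similarity on non-binary hypervectors with Hamming distance on binarized ones, can be sketched as below. This is a minimal illustration under an assumed sign-based binarization, not the BinHD implementation.

```python
import numpy as np

def binarize(hv):
    """Binarize a hypervector to {0, 1} by thresholding at zero
    (an assumed binarization rule for illustration)."""
    return (hv >= 0).astype(np.uint8)

def hamming_classify(query_hv, class_models):
    """Pick the class whose binarized model hypervector has the
    smallest Hamming distance to the binarized query, replacing the
    costlier cosine similarity of non-binary HD classification."""
    q = binarize(query_hv)
    dists = {label: int(np.count_nonzero(q != binarize(m)))
             for label, m in class_models.items()}
    return min(dists, key=dists.get)
```

On binary vectors the Hamming distance reduces to a popcount of an XOR, which is why the metric maps so well to simple digital hardware.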
13:00 | IP1-11, 247 | TYPECNN: CNN DEVELOPMENT FRAMEWORK WITH FLEXIBLE DATA TYPES
Speaker:
Lukas Sekanina, Brno University of Technology, CZ
Authors:
Petr Rek and Lukas Sekanina, Brno University of Technology, CZ
Abstract
The rapid progress in artificial intelligence technologies based on deep and convolutional neural networks (CNNs) has led to enormous interest in efficient implementations of neural networks in embedded devices and hardware. We present a new software framework for the development of (approximate) convolutional neural networks in which the user can define and use various data types for the forward (inference) procedure, the backward (training) procedure, and the weights. Moreover, non-standard arithmetic operations such as approximate multipliers can easily be integrated into the CNN under design. This flexibility makes it possible to analyze the impact of the chosen data types and non-standard arithmetic operations on CNN training and inference efficiency. The framework was implemented in C++ and evaluated using several case studies.
13:01 | IP1-12, 963 | GUARANTEED COMPRESSION RATE FOR ACTIVATIONS IN CNNS USING A FREQUENCY PRUNING APPROACH
Speaker:
Sebastian Vogel, Robert Bosch GmbH, DE
Authors:
Sebastian Vogel1, Christoph Schorn1, Andre Guntoro1 and Gerd Ascheid2
1Robert Bosch GmbH, DE; 2RWTH Aachen University, DE
Abstract
Convolutional Neural Networks have become the state of the art for many computer vision tasks. However, the size of neural networks prevents their application in resource-constrained systems. In this work, we present a lossy compression technique for intermediate results of Convolutional Neural Networks. The proposed method offers guaranteed compression rates and additionally adapts to performance requirements. Our experiments with networks for classification and semantic segmentation show that our method outperforms state-of-the-art compression techniques used in CNN accelerators.
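A frequency-pruning compressor with a guaranteed rate can be sketched as follows: transform an activation map to the frequency domain and keep only a fixed fraction of the largest coefficients, so the compressed size is known in advance. This is a generic sketch of the idea, not the paper's method; the transform choice (FFT) and selection policy are assumptions.

```python
import numpy as np

def compress_activation(act, keep_ratio=0.25):
    """Lossy compression with a guaranteed rate: keep only the
    `keep_ratio` fraction of largest-magnitude frequency coefficients."""
    f = np.fft.fft2(act)
    k = max(1, int(keep_ratio * f.size))
    idx = np.argsort(np.abs(f).ravel())[-k:]  # indices of kept coefficients
    vals = f.ravel()[idx]
    return idx, vals, act.shape

def decompress_activation(idx, vals, shape):
    """Rebuild the activation map from the kept coefficients,
    zero-filling everything that was pruned."""
    f = np.zeros(int(np.prod(shape)), dtype=complex)
    f[idx] = vals
    return np.real(np.fft.ifft2(f.reshape(shape)))
```

Because `k` is fixed by `keep_ratio` regardless of the input, the compression rate is guaranteed; only the reconstruction error varies with the content.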
13:02 | IP1-13, 290 | RUNTIME MONITORING NEURON ACTIVATION PATTERNS
Speaker:
Chih-Hong Cheng, fortiss, DE
Authors:
Chih-Hong Cheng1, Georg Nührenberg1 and Hirotoshi Yasuoka2
1fortiss - Landesforschungsinstitut des Freistaats Bayern, DE; 2DENSO Corporation, JP
Abstract
For using neural networks in safety-critical domains such as automated driving, it is important to know whether a decision made by a neural network is supported by prior similarities in training. We propose runtime neuron activation pattern monitoring: after the standard training process, one creates a monitor by feeding the training data to the network again in order to store the neuron activation patterns in abstract form. In operation, a classification decision over an input is further supplemented by examining whether a pattern similar (measured by Hamming distance) to the generated pattern is contained in the monitor. If the monitor does not contain any pattern similar to the generated pattern, it raises a warning that the decision is not based on the training data. Our experiments show that, by adjusting the similarity threshold for activation patterns, the monitors can report a significant portion of misclassifications as not supported by training, with a small false-positive rate, when evaluated on a test set.
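The monitoring scheme described above can be sketched as follows: record binary on/off activation patterns during a pass over the training data, then at runtime warn when a new pattern is farther in Hamming distance than a threshold from every stored pattern. This is a minimal illustration; the layer choice, pattern abstraction, and threshold are assumptions, not the paper's configuration.

```python
import numpy as np

class ActivationMonitor:
    """Store binary activation patterns seen on training data; at
    runtime, flag inputs whose pattern is farther (Hamming distance)
    than `threshold` from every stored pattern."""

    def __init__(self, threshold=0):
        self.threshold = threshold
        self.patterns = set()

    @staticmethod
    def _pattern(activations):
        # Abstract a layer's activations to a binary on/off pattern.
        return tuple(int(a > 0) for a in activations)

    def record(self, activations):
        """Call once per training input with the monitored layer's values."""
        self.patterns.add(self._pattern(activations))

    def is_supported(self, activations):
        """True if some stored pattern is within the Hamming threshold."""
        p = self._pattern(activations)
        return any(sum(a != b for a, b in zip(p, q)) <= self.threshold
                   for q in self.patterns)
```

Raising the threshold trades fewer false alarms against missing more unsupported decisions, which is the tuning the abstract's experiments explore.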
13:00 | End of session
Lunch Break in Lunch Area



Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the "Lunch Area" to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

  • Coffee Break 10:30 - 11:30
  • Lunch Break 13:00 - 14:30
  • Awards Presentation and Keynote Lecture in "TBD" 13:50 - 14:20
  • Coffee Break 16:00 - 17:00

Wednesday, March 27, 2019

  • Coffee Break 10:00 - 11:00
  • Lunch Break 12:30 - 14:30
  • Awards Presentation and Keynote Lecture in "TBD" 13:30 - 14:20
  • Coffee Break 16:00 - 17:00

Thursday, March 28, 2019

  • Coffee Break 10:00 - 11:00
  • Lunch Break 12:30 - 14:00
  • Keynote Lecture in "TBD" 13:20 - 13:50
  • Coffee Break 15:30 - 16:00