11.4 Learning Gets Smarter


Date: Thursday, March 28, 2019
Time: 14:00 - 15:30
Location / Room: Room 4

Chair:
Yuanqing Cheng, Beihang University, CN, Contact Yuanqing Cheng

Co-Chair:
Mariagrazia Graziano, Politecnico di Torino, IT, Contact Mariagrazia Graziano

Come and learn how emerging technologies enable deep learning and beyond for a wide range of applications: from speech recognition in the industrial cloud to drones at the edge.

Time  Label  Presentation Title / Authors

14:00  11.4.1  NEUADC: NEURAL NETWORK-INSPIRED RRAM-BASED SYNTHESIZABLE ANALOG-TO-DIGITAL CONVERSION WITH RECONFIGURABLE QUANTIZATION SUPPORT
Speaker:
Xuan Zhang, WASHINGTON UNIVERSITY ST LOUIS, US
Authors:
Weidong Cao, Xin He, Ayan Chakrabarti and Xuan Zhang, Washington University, US
Abstract
Traditional analog-to-digital converters (ADCs) employ dedicated analog and mixed-signal (AMS) circuits and require a time-consuming manual design process. They also exhibit limited reconfigurability and are unable to support diverse quantization schemes using the same circuitry. In this paper, we propose NeuADC --- an automated design approach to synthesizing an analog-to-digital (A/D) interface that can approximate the desired quantization function using a neural network (NN) with a single hidden layer. Our design leverages the mixed-signal resistive random-access memory (RRAM) crossbar architecture in a novel dual-path configuration to realize basic NN operations at the circuit level and exploits a smooth bit-encoding scheme to improve the training accuracy. Results obtained from SPICE simulations based on 130nm technology suggest that not only can NeuADC deliver promising performance compared to state-of-the-art ADC designs across comprehensive design metrics, but it can also intrinsically support multiple reconfigurable quantization schemes using the same hardware substrate, paving the way for future adaptable application-driven signal conversion. The robustness of NeuADC's quantization quality under moderate RRAM resistance precision is also evaluated using SPICE simulations.
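The core idea above — approximating a quantization transfer function with a single-hidden-layer network — can be sketched in plain software. The toy below makes several assumptions not taken from the paper: standard binary bit targets rather than the authors' smooth bit-encoding scheme, plain gradient descent, arbitrary layer sizes, and no RRAM circuit model at all.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_bits(x, n_bits=3):
    """Ideal uniform quantizer: map x in [0, 1) to its n-bit binary code."""
    codes = np.minimum((x * 2 ** n_bits).astype(int), 2 ** n_bits - 1)
    return ((codes[:, None] >> np.arange(n_bits)[::-1]) & 1).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Single hidden layer, as in the paper's high-level formulation
# (sizes here are arbitrary choices for the sketch).
n_hidden, n_bits = 64, 3
W1 = rng.normal(0.0, 4.0, (n_hidden, 1))
b1 = rng.normal(0.0, 2.0, n_hidden)
W2 = rng.normal(0.0, 0.5, (n_bits, n_hidden))
b2 = np.zeros(n_bits)

def forward(x):
    h = sigmoid(x[:, None] * W1.T + b1)   # hidden activations, (N, n_hidden)
    return sigmoid(h @ W2.T + b2), h      # predicted output bits, (N, n_bits)

def bce(p, y, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

x = rng.uniform(0.0, 1.0, 4096)
y = quantize_bits(x)

loss_init = bce(forward(x)[0], y)
lr = 0.5
for _ in range(2000):
    p, h = forward(x)
    g2 = (p - y) / len(x)                 # BCE gradient w.r.t. output logits
    g1 = (g2 @ W2) * h * (1 - h)          # backprop through hidden sigmoids
    W2 -= lr * (g2.T @ h)
    b2 -= lr * g2.sum(0)
    W1 -= lr * (g1.T @ x[:, None])
    b1 -= lr * g1.sum(0)
loss_final = bce(forward(x)[0], y)
```

Training drives the network's output bits toward the staircase quantization function; in NeuADC the analogous weights would be realized as RRAM conductances rather than floating-point arrays.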
14:30  11.4.2  HOLYLIGHT: A NANOPHOTONIC ACCELERATOR FOR DEEP LEARNING IN DATA CENTERS
Speaker:
Weichen Liu, School of Computer Science and Engineering, Nanyang Technological University, SG
Authors:
Weichen Liu1, Wenyang Liu2, Yichen Ye3, Qian Lou4, Yiyuan Xie3 and Lei Jiang5
1Nanyang Technological University, SG; 2College of Computer Science, Chongqing University, CN; 3College of Electronics and Information Engineering, Southwest University, CN; 4Department of Intelligent Systems Engineering, Indiana University, US; 5Indiana University Bloomington, US
Abstract
Convolutional Neural Networks (CNNs) are widely adopted in object recognition, speech processing and machine translation, due to their extremely high inference accuracy. However, it is challenging to compute the massive, computationally expensive convolutions of deep CNNs on traditional CPUs and GPUs. Emerging nanophotonic technology has been employed for on-chip data communication, because of its CMOS compatibility, high bandwidth and low power consumption. In this paper, we propose a nanophotonic accelerator, HolyLight, to boost the CNN inference throughput in datacenters. Instead of an all-photonic design, HolyLight performs convolutions by photonic integrated circuits, and processes the other operations in CNNs by CMOS circuits for high inference accuracy. We first build HolyLight-M by microdisk-based matrix-vector multipliers. We find analog-to-digital converters (ADCs) seriously limit its inference throughput per Watt. We further use microdisk-based adders and shifters to architect HolyLight-A without ADCs. Compared to the state-of-the-art ReRAM-based accelerator, HolyLight-A improves the CNN inference throughput per Watt by 13x with trivial accuracy degradation.
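The ADC-free datapath of HolyLight-A rests on a general fact: a matrix-vector product over bit-serial inputs needs only adders and shifters, never a multiplier or an analog intermediate. A minimal software analogue of this decomposition is sketched below; the integer weight matrix, the 8-bit unsigned activations, and the function name are all assumptions for illustration, not the paper's actual microdisk microarchitecture.

```python
import numpy as np

def shift_add_matvec(W, x, n_bits=8):
    """Bit-serial matrix-vector product: process one bit-plane of the input
    per step (an add of selected weight columns, since the inputs are 0/1),
    then recombine the partial sums with shifts. No multipliers are needed,
    and no analog partial sum ever has to pass through an ADC."""
    acc = np.zeros(W.shape[0], dtype=np.int64)
    for b in range(n_bits):
        bit_plane = (x >> b) & 1          # b-th bit of every input element
        acc += (W @ bit_plane) << b       # adder tree, then a shifter
    return acc

rng = np.random.default_rng(1)
W = rng.integers(-8, 8, size=(4, 6))      # hypothetical weight matrix
x = rng.integers(0, 256, size=6)          # unsigned 8-bit activations
result = shift_add_matvec(W, x)           # equals the ordinary product W @ x
```

The loop produces exactly the ordinary product `W @ x`, which is why an add/shift-only datapath can replace an analog multiply-accumulate followed by conversion.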
15:00  11.4.3  TRANSFER AND ONLINE REINFORCEMENT LEARNING IN STT-MRAM BASED EMBEDDED SYSTEMS FOR AUTONOMOUS DRONES
Speaker:
Insik Yoon, Georgia Institute of Technology, US
Authors:
Insik Yoon1, Aqeel Anwar1, Titash Rakshit2 and Arijit Raychowdhury1
1Georgia Institute of Technology, US; 2Samsung, US
Abstract
In this paper we present an algorithm-hardware co-design for camera-based autonomous flight in small drones. We show that the large write-latency and write-energy of non-volatile memory (NVM) based embedded systems make them unsuitable for real-time reinforcement learning (RL). We address this by performing transfer learning (TL) on meta-environments and RL on the last few layers of a deep convolutional network. While the NVM stores the meta-model (from TL), an on-die SRAM stores the weights of the last few layers. Thus all real-time updates via RL are carried out on the SRAM arrays. This provides us with a practical platform whose performance is comparable to end-to-end RL at 83.4% lower energy per image frame.
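The partitioning described above — a frozen transfer-learned meta-model in NVM, a small trainable head in SRAM — amounts to restricting real-time gradient updates to the last layers. The toy below illustrates the idea with hypothetical sizes, a generic TD-style update, and a linear head; none of these specifics come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Meta-model from transfer learning: resides in NVM, never written at runtime.
W_meta = rng.normal(0.0, 0.1, (16, 8))    # hypothetical frozen feature extractor
# Last-layer weights: reside in on-die SRAM, updated on every RL step.
W_head = np.zeros((8, 4))                 # 4 hypothetical drone actions

def q_values(obs):
    feat = np.maximum(W_meta.T @ obs, 0.0)  # frozen ReLU features (read-only)
    return W_head.T @ feat, feat

def rl_update(obs, action, td_error, lr=0.01):
    """One TD-style update: only the SRAM-resident head is written, so the
    slow, energy-hungry NVM writes never sit on the real-time control path."""
    _, feat = q_values(obs)
    W_head[:, action] += lr * td_error * feat

obs = rng.normal(0.0, 1.0, 16)            # stand-in for camera features
meta_before = W_meta.copy()
rl_update(obs, action=2, td_error=0.5)    # W_meta is read but never modified
```

The design choice mirrors the hardware split: reads (inference through the meta-model) hit NVM, while the frequent small writes that online RL generates stay in SRAM.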
15:15  11.4.4  AIX: A HIGH PERFORMANCE AND ENERGY EFFICIENT INFERENCE ACCELERATOR ON FPGA FOR A DNN-BASED COMMERCIAL SPEECH RECOGNITION
Speaker:
Minwook Ahn, SK Telecom, KR
Authors:
Minwook Ahn, Seok Joong Hwang, Wonsub Kim, Seungrok Jung, Yeonbok Lee, Mookyoung Chung, Woohyung Lim and Youngjoon Kim, SK Telecom, KR
Abstract
Automatic speech recognition (ASR) is crucial in virtual personal assistant (VPA) services such as Apple Siri, Amazon Alexa, Google Now and SKT NUGU. Recently, ASR has shown remarkable advances in accuracy by applying deep learning. However, with the explosive increase in user utterances and the growing complexity of ASR, demand for custom accelerators in datacenters is rising rapidly in order to process these workloads in real time with low power consumption. This paper evaluates a custom inference accelerator for ASR enhanced by a deep neural network, called AIX (Artificial Intelligence aXellerator). AIX is developed on a Xilinx FPGA and has been deployed in SKT NUGU since 2018. Owing to the full exploitation of the DSP slices and memory bandwidth provided by the FPGA, AIX outperforms cutting-edge CPUs by 10.2 times and even a state-of-the-art GPU by 20.1 times on real-time ASR workloads in terms of both performance and power consumption. This improvement yields faster ASR response times and, in turn, reduces the number of machines required in datacenters to a third.
15:30  End of session
Coffee Break in Exhibition Area



Coffee Breaks in the Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Lunch Breaks (Lunch Area)

On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the Lunch Area to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area.

Tuesday, March 26, 2019

Wednesday, March 27, 2019

Thursday, March 28, 2019