5.3 Hot Topic Session: I'm Gonna Make an Approximation IoT Can't Refuse - Approximate Computing for Improving Power Efficiency of IoT and HPC

Time	Label	Presentation Title Authors
08:30	5.3.1	INTRODUCTION Author: Christian Enz, EPFL, CH
08:45	5.3.2	PUSHING THE LIMITS OF VOLTAGE OVER-SCALING FOR ERROR-RESILIENT APPLICATIONS Speaker: Olivier Sentieys, INRIA, FR Authors: Rengerajan Ragavan¹, Benjamin Barrois², Cedric Killian¹ and Olivier Sentieys¹ ¹INRIA, FR; ²University of Rennes - INRIA, FR Abstract Voltage scaling has been used as a prominent technique to improve energy efficiency in digital systems, as reduction in the supply voltage effects in quadratic reduction in energy consumption of the system. The energy efficiency is achieved at the cost of timing errors in the system, that are corrected through additional error detection and correction circuits. In this paper we are proposing voltage over-scaling based approximate operators for applications that can tolerate errors. We characterize the basic arithmetic operators using different operating triads (combination of supply voltage, back biasing scheme and clock frequency) to generate models for approximate operators. Error-resilient applications can be mapped with the generated approximate operator models to achieve optimum trade-off between energy efficiency and error margin. Based on the dynamic speculation technique, best possible operating triad is chosen at runtime based on the user definable error tolerance margin of the application. In our experiments in 28nm FDSOI, we achieve maximum energy efficiency of 89% for basic operators like 8-bit and 16-bit adders at the cost of 20% Bit Error Rate (ratio of faulty bits over total bits) by operating them in near-threshold regime. Download Paper (PDF; Only available from the DATE venue WiFi)
09:00	5.3.3	COMBINING STRUCTURAL AND TIMING ERRORS IN OVERCLOCKED INEXACT SPECULATIVE ADDERS Speaker: Vincent Camus, EPFL, CH Authors: Xun Jiao¹, Vincent Camus², Mattia Cacciotti², Yu Jiang³, Christian Enz² and Rajesh Gupta¹ ¹UC San Diego, US; ²EPFL, CH; ³Tsinghua University, CN Abstract Worst-case design is used in IoT devices and high performance data centers to ensure reliability by adding extra safety margin, leading to a power efficiency loss. Recently, approximate computing has been proposed to trade off accuracy for efficiency. In this paper, we use an inexact speculative adder, which redesigns the adder architecture by shortening the critical path to save power consumption. Its overdesign introduces structural errors due to carry speculation. On the other hand, overclocking is used to reduce conservative timing guardbands but could introduce timing errors. In this paper, we apply a supervised learning model to overclocked inexact speculative adders to predict timing errors at bit level. We analyze these two types of errors and examine the joint effects of them. Download Paper (PDF; Only available from the DATE venue WiFi)
09:15	5.3.4	DVAFS: TRADING COMPUTATIONAL ACCURACY FOR ENERGY THROUGH DYNAMIC-VOLTAGE-ACCURACY-FREQUENCY-SCALING Speaker: Bert Moons, Katholieke Universiteit Leuven, BE Authors: Bert Moons, Roel Uytterhoeven, Wim Dehaene and Marian Verhelst, Katholieke Universiteit Leuven, BE Abstract Several applications in machine learning and machine-to-human interactions tolerate small deviations in their computations. Digital systems can exploit this fault-tolerance to increase their energy-efficiency, which is crucial in embedded applications. Hence, this paper introduces a new means of Approximate Computing: Dynamic-Voltage-Accuracy-Frequency-Scaling (DVAFS), a circuit-level technique enabling a dynamic trade-off of energy versus computational accuracy that outperforms other Approximate Computing techniques. The usage and applicability of DVAFS is illustrated in the context of Deep Neural Networks, the current state-of-the-art in advanced recognition. These networks are typically executed on CPU's or GPU's due to their high computational complexity, making their deployment on battery-constrained platforms only possible through wireless connections with the cloud. This work shows how deep learning can be brought to IoT devices by running every layer of the network at its optimal computational accuracy. Finally, we demonstrate a DVAFS processor for Convolutional Neural Networks, achieving efficiencies of multiple TOPS/W. Download Paper (PDF; Only available from the DATE venue WiFi)
09:30	5.3.5	EXPLOITING COMPUTATION SKIP TO REDUCE ENERGY CONSUMPTION BY APPROXIMATE COMPUTING, AN HEVC ENCODER CASE STUDY Speaker: Daniel Menard, INSA Rennes, FR Authors: Alexandre Mercat¹, Justine Bonnot¹, Maxime Pelcat², Wassim Hamidouche¹ and Daniel Menard¹ ¹INSA Rennes, FR; ²IETR-INSA, FR Abstract Approximate computing paradigm provides methods to optimize algorithms with considering both computational accuracy and complexity. This paradigm can be exploited at different levels of abstraction, from technological to application levels. Approximate computing at algorithm level aims at reducing computational complexity by approximating or skipping blocks function of the computation. Numerous applications in the signal and image processing domain integrate algorithms based on discrete optimization techniques. These techniques minimize a cost function by exploring the search space. In this paper, a new approach is proposed to exploit the computation-skipping approximate computing concept by using the SSSR technique. SSSR enables early selection of the best candidate configurations to reduce the search space. An efficient SSSR technique adjusts configuration selectivity to reduce execution complexity while selecting the functions most suitable to skip. The HEVC encoder in AI profile is used as a case study to illustrate the benefits of SSSR. In this application, two functions use discrete optimization to explore different solutions and select the one leading to the minimal cost in terms of bitrate/quality and computational energy: coding-tree partitioning and intra-mode prediction. By applying SSSR to this use case, energy reductions from 20% to 70% are explored through Pareto in Rate-Energy space. Download Paper (PDF; Only available from the DATE venue WiFi)
09:45	5.3.6	LOCATION DETECTION FOR NAVIGATION USING IMUS WITH A MAP THROUGH COARSE-GRAINED MACHINE LEARNING Speaker: Chen Luo, Rice University, US Authors: J. Jose Gonzales E.¹, Chen Luo¹, Anshumali Shrivastava¹, Krishna Palem¹, Moon Yongshik², Soonhyun Noh², Daedong Park³ and Seongsoo Hong² ¹Rice University, US; ²Seoul National University, KR; ³Dept. of Electrical and Computer Engineering, Seoul National University, KR Abstract Location detection or localization supporting navigation has assumed significant importance in the recent past. In particular, techniques that exploit cheap inertial measurement units (IMU), the gyroscope and the accelerometer, have garnered attention, especially in an embedded computing context. However, these sensors measurements are quite unreliable, and it is widely believed that these sensors by themselves are too noisy for localization with acceptable accuracy. Consequently, several lines of work embody other costly alternatives to lower the impact of accumulated errors associated with IMU based approaches, invariably leading to very high energy costs resulting in lowered battery life. In this paper, we show that IMUs are sufficient by themselves if we augment them with known structural or geographical information about the physical area being explored by the user. By using the {em map} of the region being explored and the fact that humans typically walk in a structured manner, our approach sidesteps the challenges created by noise and concomitant accumulation of error. Specifically, we show that a simple coarse-grained machine learning approach mitigates the effect of the noisy perturbations in the information from our IMUs, provided we have accurate maps. Throughout, we rely on the principle of inexactness in an overarching manner and relax the need for absolute accuracy in return for significant lowering of resource (energy) costs. Notably, our approach is completely independent of any external guidance from sources including GPS, Bluetooth or WiFi support, and is this privacy preserving. Specifically, we show through experimental results that by relying on gyroscope and accelerometer data alone, we can correctly identify the path-segment where the user is walking/running on a known map, as well as the position within the path with an accuracy of 4.3 meters on the average using 0.44 Joules. This is a factor of 27X cheaper in energy lower than the ``gold standard'' that one could consider based on GPS support which, surprisingly, has an associated error of 8.7 meters on the average. Download Paper (PDF; Only available from the DATE venue WiFi)
10:00		End of session Coffee Break in Exhibition Area On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area. Tuesday, March 28, 2017 Coffee Break 10:30 - 11:30 Coffee Break 16:00 - 17:00 Wednesday, March 29, 2017 Coffee Break 10:00 - 11:00 Coffee Break 16:00 - 17:00 Thursday, March 30, 2017 Coffee Break 10:00 - 11:00 Coffee Break 15:30 - 16:00

Time

Label

Presentation Title
Authors

08:30

5.3.1

INTRODUCTION
Author:
Christian Enz, EPFL, CH

08:45

5.3.2

PUSHING THE LIMITS OF VOLTAGE OVER-SCALING FOR ERROR-RESILIENT APPLICATIONS
Speaker:
Olivier Sentieys, INRIA, FR
Authors:
Rengerajan Ragavan¹, Benjamin Barrois², Cedric Killian¹ and Olivier Sentieys¹
¹INRIA, FR; ²University of Rennes - INRIA, FR
Abstract
Voltage scaling has been used as a prominent technique to improve energy efficiency in digital systems, as reduction in the supply voltage effects in quadratic reduction in energy consumption of the system. The energy efficiency is achieved at the cost of timing errors in the system, that are corrected through additional error detection and correction circuits. In this paper we are proposing voltage over-scaling based approximate operators for applications that can tolerate errors. We characterize the basic arithmetic operators using different operating triads (combination of supply voltage, back biasing scheme and clock frequency) to generate models for approximate operators. Error-resilient applications can be mapped with the generated approximate operator models to achieve optimum trade-off between energy efficiency and error margin. Based on the dynamic speculation technique, best possible operating triad is chosen at runtime based on the user definable error tolerance margin of the application. In our experiments in 28nm FDSOI, we achieve maximum energy efficiency of 89% for basic operators like 8-bit and 16-bit adders at the cost of 20% Bit Error Rate (ratio of faulty bits over total bits) by operating them in near-threshold regime.
Download Paper (PDF; Only available from the DATE venue WiFi)

09:00

5.3.3

COMBINING STRUCTURAL AND TIMING ERRORS IN OVERCLOCKED INEXACT SPECULATIVE ADDERS
Speaker:
Vincent Camus, EPFL, CH
Authors:
Xun Jiao¹, Vincent Camus², Mattia Cacciotti², Yu Jiang³, Christian Enz² and Rajesh Gupta¹
¹UC San Diego, US; ²EPFL, CH; ³Tsinghua University, CN
Abstract
Worst-case design is used in IoT devices and high performance data centers to ensure reliability by adding extra safety margin, leading to a power efficiency loss. Recently, approximate computing has been proposed to trade off accuracy for efficiency. In this paper, we use an inexact speculative adder, which redesigns the adder architecture by shortening the critical path to save power consumption. Its overdesign introduces structural errors due to carry speculation. On the other hand, overclocking is used to reduce conservative timing guardbands but could introduce timing errors. In this paper, we apply a supervised learning model to overclocked inexact speculative adders to predict timing errors at bit level. We analyze these two types of errors and examine the joint effects of them.
Download Paper (PDF; Only available from the DATE venue WiFi)

09:15

5.3.4

DVAFS: TRADING COMPUTATIONAL ACCURACY FOR ENERGY THROUGH DYNAMIC-VOLTAGE-ACCURACY-FREQUENCY-SCALING
Speaker:
Bert Moons, Katholieke Universiteit Leuven, BE
Authors:
Bert Moons, Roel Uytterhoeven, Wim Dehaene and Marian Verhelst, Katholieke Universiteit Leuven, BE
Abstract
Several applications in machine learning and machine-to-human interactions tolerate small deviations in their computations. Digital systems can exploit this fault-tolerance to increase their energy-efficiency, which is crucial in embedded applications. Hence, this paper introduces a new means of Approximate Computing: Dynamic-Voltage-Accuracy-Frequency-Scaling (DVAFS), a circuit-level technique enabling a dynamic trade-off of energy versus computational accuracy that outperforms other Approximate Computing techniques. The usage and applicability of DVAFS is illustrated in the context of Deep Neural Networks, the current state-of-the-art in advanced recognition. These networks are typically executed on CPU's or GPU's due to their high computational complexity, making their deployment on battery-constrained platforms only possible through wireless connections with the cloud. This work shows how deep learning can be brought to IoT devices by running every layer of the network at its optimal computational accuracy. Finally, we demonstrate a DVAFS processor for Convolutional Neural Networks, achieving efficiencies of multiple TOPS/W.
Download Paper (PDF; Only available from the DATE venue WiFi)

09:30

5.3.5

EXPLOITING COMPUTATION SKIP TO REDUCE ENERGY CONSUMPTION BY APPROXIMATE COMPUTING, AN HEVC ENCODER CASE STUDY
Speaker:
Daniel Menard, INSA Rennes, FR
Authors:
Alexandre Mercat¹, Justine Bonnot¹, Maxime Pelcat², Wassim Hamidouche¹ and Daniel Menard¹
¹INSA Rennes, FR; ²IETR-INSA, FR
Abstract
Approximate computing paradigm provides methods to optimize algorithms with considering both computational accuracy and complexity. This paradigm can be exploited at different levels of abstraction, from technological to application levels. Approximate computing at algorithm level aims at reducing computational complexity by approximating or skipping blocks function of the computation. Numerous applications in the signal and image processing domain integrate algorithms based on discrete optimization techniques. These techniques minimize a cost function by exploring the search space. In this paper, a new approach is proposed to exploit the computation-skipping approximate computing concept by using the SSSR technique. SSSR enables early selection of the best candidate configurations to reduce the search space. An efficient SSSR technique adjusts configuration selectivity to reduce execution complexity while selecting the functions most suitable to skip. The HEVC encoder in AI profile is used as a case study to illustrate the benefits of SSSR. In this application, two functions use discrete optimization to explore different solutions and select the one leading to the minimal cost in terms of bitrate/quality and computational energy: coding-tree partitioning and intra-mode prediction. By applying SSSR to this use case, energy reductions from 20% to 70% are explored through Pareto in Rate-Energy space.
Download Paper (PDF; Only available from the DATE venue WiFi)

09:45

5.3.6

LOCATION DETECTION FOR NAVIGATION USING IMUS WITH A MAP THROUGH COARSE-GRAINED MACHINE LEARNING
Speaker:
Chen Luo, Rice University, US
Authors:
J. Jose Gonzales E.¹, Chen Luo¹, Anshumali Shrivastava¹, Krishna Palem¹, Moon Yongshik², Soonhyun Noh², Daedong Park³ and Seongsoo Hong²
¹Rice University, US; ²Seoul National University, KR; ³Dept. of Electrical and Computer Engineering, Seoul National University, KR
Abstract
Location detection or localization supporting navigation has assumed significant importance in the recent past. In particular, techniques that exploit cheap inertial measurement units (IMU), the gyroscope and the accelerometer, have garnered attention, especially in an embedded computing context. However, these sensors measurements are quite unreliable, and it is widely believed that these sensors by themselves are too noisy for localization with acceptable accuracy. Consequently, several lines of work embody other costly alternatives to lower the impact of accumulated errors associated with IMU based approaches, invariably leading to very high energy costs resulting in lowered battery life. In this paper, we show that IMUs are sufficient by themselves if we augment them with known structural or geographical information about the physical area being explored by the user. By using the {em map} of the region being explored and the fact that humans typically walk in a structured manner, our approach sidesteps the challenges created by noise and concomitant accumulation of error. Specifically, we show that a simple coarse-grained machine learning approach mitigates the effect of the noisy perturbations in the information from our IMUs, provided we have accurate maps. Throughout, we rely on the principle of inexactness in an overarching manner and relax the need for absolute accuracy in return for significant lowering of resource (energy) costs. Notably, our approach is completely independent of any external guidance from sources including GPS, Bluetooth or WiFi support, and is this privacy preserving. Specifically, we show through experimental results that by relying on gyroscope and accelerometer data alone, we can correctly identify the path-segment where the user is walking/running on a known map, as well as the position within the path with an accuracy of 4.3 meters on the average using 0.44 Joules. This is a factor of 27X cheaper in energy lower than the ``gold standard'' that one could consider based on GPS support which, surprisingly, has an associated error of 8.7 meters on the average.
Download Paper (PDF; Only available from the DATE venue WiFi)

10:00

End of session
Coffee Break in Exhibition Area

On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the below-mentioned times in the exhibition area.

Tuesday, March 28, 2017