Date: Tuesday, March 26, 2019
Time: 11:30  13:00
Location / Room: Room 6
Chair:
Martin Trefzer, University of York, GB, Contact Martin Albrecht Trefzer
CoChair:
Lukas Sekanina, Brno University of Technology, CZ, Contact Lukas Sekanina
Achieving computational and resourceefficiency is often promised by emerging technologies. This session addresses various aspects of efficiency in the context of quantum and approximate computing. Simulation of quantum computations is very computationally expensive. The first paper shows how a smart utilization of decision diagrams can significantly accelerate this process. Approximate computing is often advocated as an approach enabling to build more resourceefficient computing systems. The second paper of this session deals with an automated circuit approximation method capable of exploiting data distributions observable in the target application. The third paper presents a new application for approximate circuits in the area of wireless sensor networks. Finally, an efficient approximate random dropout technique for training acceleration of neural networks running on GPU is proposed in the fourth paper.
Time  Label  Presentation Title Authors 

11:30  2.6.1  MATRIXVECTOR VS. MATRIXMATRIX MULTIPLICATION: POTENTIAL IN DDBASED SIMULATION OF QUANTUM COMPUTATIONS Speaker: Alwin Zulehner, Johannes Kepler University Linz, AT Authors: Alwin Zulehner and Robert Wille, Johannes Kepler University Linz, AT Abstract The simulation of quantum computations basically boils down to the multiplication of vectors (describing the respective quantum state) and matrices (describing the respective quantum operations). However, since those matrices/vectors are exponential in size, most of the existing solutions (relying on arrays for their representation) are either limited to rather small quantum systems or require substantial hardware resources. To overcome these shortcomings, solutions based on decision diagrams (DDbased simulation) have been proposed recently. They exploit redundancies in quantum states as well as matrices and, by this, allow for a compact representation and manipulation. This offers further (unexpected) potential. In fact, simulation has been conducted thus far by applying one operation (i.e. one matrixvector multiplication) after another. Besides that, there is the possibility to combine several operations (requiring a matrixmatrix multiplication) before applying them to a vector. But since, from a theoretical perspective, matrixvector multiplication is significantly cheaper than matrixmatrix multiplication, the potential of this direction was rather limited thus far. In this work, we show that this changes when decision diagrams are employed. In fact, their more compact representation frequently makes matrixmatrix multiplication more beneficial—leading to substantial improvements by exploiting the combination of operations. Experimental results confirm the proposed strategies for combining operations lead to speedups of several factors or—when additionally exploiting further knowledge about the considered instance—even of several orders of magnitudes. 
12:00  2.6.2  AUTOMATED CIRCUIT APPROXIMATION METHOD DRIVEN BY DATA DISTRIBUTION Speaker: Lukas Sekanina, Brno University of Technology, CZ Authors: Zdenek Vasicek, Vojtech Mrazek and Lukas Sekanina, Brno University of Technology, CZ Abstract We propose an applicationtailored datadriven fully automated method for functional approximation of combinational circuits. We demonstrate how an applicationlevel error metric such as the classification accuracy can be translated to a componentlevel error metric needed for an efficient and fast search in the space of approximate lowlevel components that are used in the application. This is possible by employing a weighted mean error distance (WMED) metric for steering the circuit approximation process which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) determining the importance of each input vector for the approximation process. The method is evaluated using synthetic benchmarks and applicationspecific approximate MAC (multiplyandaccumulate) units that are designed to provide the best tradeoffs between the classification accuracy and power consumption of two image classifiers based on neural networks. 
12:30  2.6.3  TRADING DIGITAL ACCURACY FOR POWER IN AN RSSI COMPUTATION OF A SENSOR NETWORK TRANSCEIVER Speaker: Paul Detterer, Eindhoven University of Technology, NL Authors: Paul Detterer, Cumhur Erdin, Majid Nabi, Jose Pineda de Gyvez, Twan Basten and Hailong Jiao, Eindhoven University of Technology, NL Abstract Emerging Wireless Sensor Network (WSN) applications require more energy efficiency in the wireless transceivers, though the conventional energy efficient design techniques are reaching their limits. To handle the rigid power and energy constraints in the Digital BaseBand (DBB) of WSNs, we introduce approximate computing in DBB processing as a new power reduction method. The Received Signal Strength Indicator (RSSI) computation is a key element in DBB processing. We evaluate the tradeoff in RSSI computation between QualityofService (QoS) and power consumption through circuitlevel approximation. RSSI elements are approximated in such a way that error propagation is minimized. In an industrial 40nm CMOS technology, substantial power savings are achieved with limited accuracy loss, both at circuit level and at network level in a lowpower listening WSN scenario. 
12:45  2.6.4  APPROXIMATE RANDOM DROPOUT FOR DNN TRAINING ACCELERATION IN GPGPU Speaker: Li Jiang, Shanghai Jiao Tong University, CN Authors: Zhuoran Song, Ru Wang, Dongyu Ru, Zhenghao Peng, Hongru Huang, Hai Zhao, Xiaoyao Liang and Li Jiang, Shanghai Jiao Tong University, CN Abstract The training phases of Deep neural network (DNN) consumes enormous processing time and energy. Compression techniques utilizing the sparsity of DNNs can effectively accelerate the inference phase of DNNs. However, it can be hardly used in the training phase because the training phase involves dense matrixmultiplication using General Purpose Computation on Graphics Processors (GPGPU), which endorse regular and structural data layout. In this paper, we propose the Approximate Random Dropout that replaces the conventional random dropout of neurons and synapses with a regular and online generated patterns to eliminate the unnecessary computation and data access. We develop a SGDbased Search Algorithm that producing the distribution of dropout patterns to compensate the potential accuracy loss. We prove our approach is statistically equivalent to the previous dropout method. Experiments results on multilayer perceptron (MLP) and long shortterm memory (LSTM) using wellknown benchmarks show that the speedup rate brought by the proposed Approximate Random Dropout ranges from 1.182.16 (1.241.85) when dropout rate is 0.30.7 on MLP (LSTM) with negligible accuracy drop. 
13:00  IP18, 229  ACCURACY AND COMPACTNESS IN DECISION DIAGRAMS FOR QUANTUM COMPUTATION Speaker: Alwin Zulehner, Johannes Kepler University Linz, AT Authors: Alwin Zulehner^{1}, Philipp Niemann^{2}, Rolf Drechsler^{3} and Robert Wille^{1} ^{1}Johannes Kepler University Linz, AT; ^{2}CyberPhysical Systems, DFKI GmbH, DE; ^{3}University of Bremen, DE Abstract Quantum computation is a promising research field since it allows to conduct certain tasks exponentially faster than on conventional machines. As in the conventional domain, decision diagrams are heavily used in different design tasks for quantum computation like synthesis, verification, or simulation. However, unlike decision diagrams for the conventional domain, decision diagrams for quantum computation as of now suffer from a tradeoff between accuracy and compactness that requires parameter finetuning on a casebycase basis. In this work, we—for the first time—describe and evaluate the effects of this tradeoff. Moreover, we propose an alternative approach that utilizes an algebraic representation of the occurring irrational numbers and outline how this can be incorporated in a decision diagram in order to overcome this tradeoff. 
13:01  IP19, 458  ONE METHOD  ALL ERRORMETRICS: A THREESTAGE APPROACH FOR ERRORMETRIC EVALUATION IN APPROXIMATE Speaker: Saman Fröhlich, University of Bremen/DFKI GmbH, DE Authors: Saman Fröhlich^{1}, Daniel Grosse^{2} and Rolf Drechsler^{2} ^{1}University of Bremen/DFKI GmbH, DE; ^{2}University of Bremen, DE Abstract Approximate Computing is a design paradigm that makes use of the error tolerance inherited by many applications, such as machine learning, media processing and data mining. The goal of Approximate Computing is to trade off accuracy for performance in terms of computation time, energy consumption and/or hardware complexity. In the field of circuit design for Approximate Computing, errormetrics are used to express the degree of approximation. Evaluating these errormetrics is a key challenge. Several approaches exist, however, to this day not all relevant metrics can be evaluated with formal methods. Recently, Symbolic Computer Algebra (SCA) has been used to evaluate errormetrics during approximate hardware generation. In this paper, we generalize the idea to use SCA and propose a methodology which is suitable for formal evaluation of all established errormetrics. This approach can be divided into threestages: (i) Determine the remainder of the AC circuit wrt.the specification using SCA, (ii) build an Algebraic Decision Diagram (ADD) to represent the remainder and (iii) evaluate each errormetric by a tailored ADD traversal algorithm. Besides being the first to propose a closed formal method for evaluation of all relevant errormetrics, we are the first to ever propose formal algorithms for the evaluation of the worstcaserelative and the averagecaserelative errormetrics. In the experiments, we apply our algorithms to a large and wellknown benchmark set. 
13:02  IP110, 657  REVERSIBLE PEBBLING GAME FOR QUANTUM MEMORY MANAGEMENT Speaker: Giulia Meuli, EPFL, CH Authors: Giulia Meuli^{1}, Mathias Soeken^{1}, Martin Roetteler^{2}, Nikolaj Bjorner^{2} and Giovanni De Micheli^{1} ^{1}EPFL, CH; ^{2}Microsoft, US Abstract Quantum memory management is becoming a pressing problem, especially given the recent research effort to develop new and more complex quantum algorithms. The only existing automatic method for quantum states cleanup relies on the availability of many extra resources. In this work, we propose an automatic tool for quantum memory management. We show how this problem exactly matches the reversible pebbling game. Based on that, we develop a SATbased algorithm that returns a valid cleanup strategy, taking the limitations of the quantum hardware into account. The developed tool empowers the designer with the flexibility required to explore the tradeoff between memory resources and number of operations. We present two showcases to prove the validity of our approach. First, we apply the algorithm to straightline programs, widely used in cryptographic applications. Second, we perform a comparison with the existing approach, showing an average improvement of 52.77%. 
13:00  End of session Lunch Break in Lunch Area
On all conference days (Tuesday to Thursday), coffee and tea will be served during the coffee breaks at the belowmentioned times in the exhibition area. Lunch Breaks (Lunch Area)On all conference days (Tuesday to Thursday), a seated lunch (lunch buffet) will be offered in the Lunch Area to fully registered conference delegates only. There will be badge control at the entrance to the lunch break area. Tuesday, March 26, 2019
Wednesday, March 27, 2019
Thursday, March 28, 2019
