9.2 High-Level Synthesis


Date: Thursday, March 28, 2019
Time: 08:30 - 10:00
Location / Room: Room 2

Chair:
Luciano Lavagno, Politecnico di Torino, IT

Co-Chair:
Yuko Hara-Azumi, Tokyo Institute of Technology, JP

In this session, we show how high-level synthesis (HLS) can be used to protect IPs and to exploit high-level information to predict the outcome of physical design. First, the protection of a high-level IP model in a cloud-based synthesis context is discussed, using functional locking. The second talk investigates how concepts from hardware Trojans can be used during HLS to add watermarks to IPs. A novel approach is then proposed to estimate routing congestion at the physical level during HLS. The interactive presentation discusses how to estimate the hardware cost and the software performance of the hardware/software interface.

Time Label Presentation Title
Authors
08:30 9.2.1 TRANSIENT KEY-BASED OBFUSCATION FOR HLS IN AN UNTRUSTED CLOUD ENVIRONMENT
Speaker:
Hannah Badier, ENSTA Bretagne, Lab-STICC, Brest, FR
Authors:
Hannah Badier1, Jean-Christophe Le Lann1, Philippe Coussy2 and Guy Gogniat3
1ENSTA Bretagne, FR; 2Université de Bretagne-Sud / Lab-STICC, FR; 3Université Bretagne Sud, FR
Abstract
Recent advances in cloud computing have led to the advent of Business-to-Business Software as a Service (SaaS) solutions, opening new opportunities for EDA. High-Level Synthesis (HLS) in the cloud is likely to offer great opportunities to hardware design companies. However, these companies are still reluctant to make such a transition, due to the new risks of Behavioral Intellectual Property (BIP) theft that a cloud-based solution presents. In this paper, we introduce a key-based obfuscation approach to protect BIPs during cloud-based HLS. The source-to-source transformations we propose hide functionality and make normal behavior dependent on a series of input keys. In our process, the obfuscation is transient: once an obfuscated BIP is synthesized through HLS by a service provider in the cloud, the obfuscation code can only be removed at Register Transfer Level (RTL) by the design company that owns the correct obfuscation keys. Original functionality is thus restored and design overhead is kept at a minimum. Our method significantly increases the level of security of cloud-based HLS at low performance overhead. The average area overhead after obfuscation and subsequent de-obfuscation with tests performed on ASIC and FPGA is 0.39%, and over 95% of our tests had an area overhead under 5%.
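The core idea, restoring correct behavior only when the right key is supplied, can be sketched in a few lines. This is a hypothetical illustration, not the authors' actual source-to-source transformation; the key value and the hidden constant are made up for the example.

```python
# Minimal sketch of key-based functional obfuscation: a constant in the
# original computation is hidden so that the function behaves correctly
# only when the correct key is supplied. KEY is a hypothetical value.

KEY = 0xA5  # hypothetical obfuscation key, held by the design company

def original(x: int) -> int:
    return (x + 3) * 2

def obfuscated(x: int, k: int) -> int:
    # The constant 3 is masked with the key; it is recovered iff k == KEY.
    hidden = 3 ^ KEY ^ k
    return (x + hidden) * 2

# Correct key restores the original behavior; any other key corrupts it.
assert obfuscated(10, KEY) == original(10)
assert obfuscated(10, 0x00) != original(10)
```

In the paper's flow the analogous masking is inserted before cloud-based HLS and stripped again at RTL by the key owner, so the fielded design carries only the small residual overhead reported above.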
09:00 9.2.2 HIGH-LEVEL SYNTHESIS OF BENEVOLENT TROJANS
Speaker:
Christian Pilato, Politecnico di Milano, IT
Authors:
Christian Pilato1, Kanad Basu2, Mohammed Shayan2, Francesco Regazzoni3 and Ramesh Karri2
1Politecnico di Milano, IT; 2NYU, US; 3ALaRI, CH
Abstract
High-Level Synthesis (HLS) allows designers to create a register transfer level (RTL) description of a digital circuit starting from its high-level specification (e.g., C/C++/SystemC). HLS reduces engineering effort and design-time errors, allowing the integration of additional features. This study introduces an approach to generate benevolent Hardware Trojans (HT) using HLS. Benevolent HTs are Intellectual Property (IP) watermarks that borrow concepts from well-known malicious HTs to ward off piracy and counterfeiting either during the design flow or in fielded integrated circuits. Benevolent HTs are difficult to detect and remove because they are intertwined with the functional units used to implement the IP. Experimental results testify to the suitability of the approach and the limited overhead.
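The watermarking idea can be illustrated behaviorally: like a malicious Trojan, the watermark shares the IP's functional units and activates only on a rare trigger, but its payload proves authorship instead of causing harm. The trigger and signature values below are hypothetical, and this sketch is not the paper's construction.

```python
# Hedged illustration of a "benevolent Trojan" watermark: on a secret,
# rare trigger input, the IP's ordinary datapath (here, the accumulator's
# adder) also emits an authorship signature. TRIGGER and SIGNATURE are
# hypothetical values chosen for the example.

TRIGGER = 0xDEAD    # hypothetical rare trigger input
SIGNATURE = 0x42    # hypothetical owner's watermark payload

def watermarked_accumulate(xs):
    acc = 0
    for x in xs:
        acc += x                  # normal IP functionality
        if x == TRIGGER:
            acc += SIGNATURE      # watermark payload reuses the same adder
    return acc

# Normal inputs behave exactly like the unwatermarked IP.
assert watermarked_accumulate([1, 2, 3]) == 6
```

Because the payload reuses the functional unit rather than adding a separate, easily excised block, removing it requires understanding the IP itself, which is what makes such watermarks hard to strip.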
09:30 9.2.3 MACHINE LEARNING BASED ROUTING CONGESTION PREDICTION IN FPGA HIGH-LEVEL SYNTHESIS
Speaker:
Jieru Zhao, HKUST, CN
Authors:
Jieru Zhao1, Tingyuan Liang2, Sharad Sinha3 and Wei Zhang1
1Hong Kong University of Science and Technology, HK; 2HKUST, CN; 3Indian Institute of Technology Goa, IN
Abstract
High-level synthesis (HLS) shortens the development time of hardware designs and enables faster design space exploration at a higher abstraction level. Optimization of complex applications in HLS is challenging due to the effects of implementation issues such as routing congestion. Routing congestion estimation is absent or inaccurate in existing HLS design methods and tools. Early and accurate congestion estimation is of great benefit to guide the optimization in HLS and improve the efficiency of implementation. However, routability, a serious concern in FPGA designs, has been difficult to evaluate in HLS without analyzing post-implementation details after place and route. To this end, we propose a novel method to predict routing congestion in HLS using machine learning and map the expected congested regions in the design to the relevant high-level source code. This is greatly beneficial for early identification of routability-oriented bottlenecks in the high-level source code without running the time-consuming register-transfer level (RTL) implementation flow. Experiments demonstrate that our approach accurately estimates vertical and horizontal routing congestion with errors of 6.71% and 10.05% respectively. Using a face detection application as a case study, we show that by discovering the bottlenecks in the high-level source code, routing congestion can be resolved easily and quickly compared to the effort involved in RTL-level implementation and design feedback.
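The supervised-learning setup behind such a predictor can be sketched generically: fit a model mapping HLS-level features to a congestion metric measured post-implementation, then query it for new designs. The features, the linear model, and the synthetic training data below are stand-ins for illustration; the paper trains real ML models on real post-route data.

```python
# Toy stand-in for an ML congestion predictor: learn weights mapping two
# hypothetical HLS-level features (e.g. resource-usage estimates) to a
# congestion value, via plain gradient-descent linear regression.

def fit_linear(samples, lr=0.01, steps=5000):
    """samples: list of ((f1, f2), congestion) pairs; returns (w1, w2, b)."""
    w1 = w2 = b = 0.0
    n = len(samples)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for (f1, f2), y in samples:
            err = (w1 * f1 + w2 * f2 + b) - y
            g1 += err * f1
            g2 += err * f2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

# Synthetic training data: congestion = 0.5*f1 + 0.2*f2 (illustration only).
data = [((x, y), 0.5 * x + 0.2 * y) for x in range(5) for y in range(5)]
w1, w2, b = fit_linear(data)
pred = w1 * 3 + w2 * 2 + b   # prediction for an unseen feature point
```

In the real flow, a prediction that exceeds a routability threshold would then be traced back to the responsible region of high-level source code.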
10:00 IP4-10, 275 ACCURATE COST ESTIMATION OF MEMORY SYSTEMS INSPIRED BY MACHINE LEARNING FOR COMPUTER VISION
Speaker:
Lorenzo Servadei, Infineon Technologies, DE
Authors:
Lorenzo Servadei1, Elena Zennaro1, Keerthikumara Devarajegowda1, Martin Manzinger1, Wolfgang Ecker1 and Robert Wille2
1Infineon AG, DE; 2Johannes Kepler University Linz, AT
Abstract
Hardware/software co-designs are usually defined at high levels of abstraction at the beginning of the design process in order to leave open many options for how to eventually realize a system. This allows for design exploration, which in turn heavily relies on knowing the costs of different design configurations (with respect to hardware usage as well as firmware metrics). To this end, methods for cost estimation are frequently applied in industrial practice. However, currently used methods for cost estimation oversimplify the problem and ignore important features, leading to estimates that are far off from the real values. In this work, we address this problem for memory systems. To this end, we borrow and re-adapt Machine Learning (ML) solutions that have been found suitable for problems from the domain of Computer Vision (CV), in particular age determination of persons depicted in images. We show that, for an ML approach, age determination from the CV domain is actually very similar to cost estimation of a memory system.
10:01 IP4-11, 658 PRACTICAL CAUSALITY HANDLING FOR SYNCHRONOUS LANGUAGES
Speaker:
Steven Smyth, Kiel University, DE
Authors:
Steven Smyth, Alexander Schulz-Rosengarten and Reinhard von Hanxleden, Dept. of Computer Science, Kiel University, DE
Abstract
A key to the synchronous principle of reconciling concurrency with determinism is to establish at compile time that a program is causal, which means that there exists a schedule that obeys the rules laid down by the language. In practice it can be rather cumbersome for the developer to cure causality problems. To facilitate causality handling, we propose, first, to enrich the scheduling regime of the language to also consider explicit scheduling directives that can be used by either the modeler or model-to-model transformations. Secondly, we propose to enhance programming environments with dedicated causality views to guide the developer in finding causality issues. Our proposals should be applicable to synchronous languages in general; we illustrate them here for the SCCharts language and its open source development platform KIELER.
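The underlying compile-time check can be stated generically: a program is causal iff its scheduling constraints form an acyclic graph, and an explicit scheduling directive simply contributes one more edge before the check runs. The sketch below uses Kahn's algorithm on a made-up constraint set; it is a generic illustration, not KIELER's actual analysis.

```python
# Generic sketch of a causality check: scheduling constraints are edges
# "a must run before b"; a valid schedule exists iff the graph is acyclic.
from collections import defaultdict

def find_schedule(edges, nodes):
    """Kahn's algorithm: return a valid order, or None on a causality cycle."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for a, b in edges:
        succ[a].append(b)
        indeg[b] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return order if len(order) == len(nodes) else None

# Hypothetical constraints; a user directive is just one more edge.
constraints = [("read_x", "write_y"), ("write_y", "emit_z")]
directive = [("emit_z", "read_x")]   # this directive would close a cycle
assert find_schedule(constraints, {"read_x", "write_y", "emit_z"}) is not None
assert find_schedule(constraints + directive, {"read_x", "write_y", "emit_z"}) is None
```

A causality view, in these terms, amounts to reporting the offending cycle back to the developer instead of just returning None.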
10:02 IP4-12, 998 APPLICATION PERFORMANCE PREDICTION AND OPTIMIZATION UNDER CACHE ALLOCATION TECHNOLOGY
Speaker:
Yeseong Kim, UCSD, US
Authors:
Yeseong Kim1, Ankit More2, Emily Shriver2 and Tajana Rosing1
1University of California San Diego, US; 2Intel, US
Abstract
Many applications running on high-performance computing systems share limited resources such as the last-level cache, often resulting in lower performance. Intel recently introduced a new control mechanism, called cache allocation technology (CAT), which controls the cache size used by each application. To intelligently utilize this technology for automated management, it is essential to accurately identify application performance behavior for different cache allocation scenarios. In this work, we show a novel approach which automatically builds a prediction model for application performance changes with CAT. We profile the workload characteristics based on the Intel Top-down Microarchitecture Analysis Method (TMAM), and train the model using machine learning. The model predicts instructions per cycle (IPC) across available cache sizes allocated for the applications. We also design a dynamic cache management technique which utilizes the prediction model and intelligently partitions the cache resource to improve application throughput. We implemented and evaluated the proposed framework in the Intel PMU profiling tool running on a Xeon Platinum 8186 Skylake processor. In our evaluation, we show that the proposed model accurately predicts the IPC changes of applications with 4.7% error on average for different cache allocation scenarios. Our predictive online cache management improves application performance by up to 25% as compared to a prediction-agnostic policy.
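Once per-application IPC is predicted for each possible cache size, the partitioning step reduces to allocating cache capacity where the predicted marginal gain is largest. The sketch below uses a greedy way-by-way allocator over hard-coded, hypothetical IPC curves standing in for the model's output; the paper's actual management policy may differ.

```python
# Hedged sketch of CAT partitioning driven by predicted IPC curves.
# ipc[app][w] = predicted IPC when the app holds w cache ways; the numbers
# are hypothetical stand-ins for the ML model's predictions.

ipc = {
    "A": [0.5, 0.9, 1.1, 1.15, 1.18],
    "B": [0.6, 0.7, 0.8, 1.3, 1.4],
}

def allocate(ipc, total_ways):
    """Greedily give each cache way to the app with the largest IPC gain."""
    alloc = {app: 0 for app in ipc}
    for _ in range(total_ways):
        best = max(ipc, key=lambda a: ipc[a][alloc[a] + 1] - ipc[a][alloc[a]])
        alloc[best] += 1
    return alloc

result = allocate(ipc, 4)
```

Greedy allocation is simple and fast for online use, though it can miss gains that only appear after several additional ways (as app B's curve shows); a real policy might search more globally.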
10:03 IP4-13, 910 GENERALIZED MATRIX FACTORIZATION TECHNIQUES FOR APPROXIMATE LOGIC SYNTHESIS
Speaker:
Sherief Reda, Brown University, US
Authors:
Soheil Hashemi and Sherief Reda, Brown University, US
Abstract
Approximate computing is an emerging computing paradigm, where computing accuracy is relaxed for improvements in hardware metrics, such as design area and power profile. In circuit design, a major challenge is to synthesize approximate circuits automatically from input exact circuits. In this work, we extend our previous work, BLASYS, for approximate logic synthesis based on matrix factorization, where an arbitrary input circuit can be approximated in a controlled fashion. Whereas our previous approach uses a semi-ring algebra for factorization, this work generalizes matrix-based circuit factorization to include both semi-ring and field algebra implementations. We also propose a new method for truth table folding to improve the factorization quality. These new approaches significantly widen the design space of possible approximate circuits, effectively offering improved trade-offs in terms of quality, area and power consumption. We evaluate our methodology on a number of representative circuits showcasing the benefits of our proposed methodology for approximate logic synthesis.
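The matrix view of the problem can be shown concretely: a truth table is a 0/1 matrix, and a low-rank Boolean factorization yields a cheaper circuit whose error is the number of minterms that flip. The toy below finds the best rank-1 approximation by exhaustive search; it is a minimal illustration of the idea, not the BLASYS tool, and the truth table is made up.

```python
# Illustrative sketch of approximate logic synthesis by matrix factorization:
# approximate a truth table M (0/1 matrix) by a Boolean rank-1 product
# u·v^T, where entry (i, j) is u[i] AND v[j]. Error = Hamming distance.
from itertools import product

M = [  # hypothetical truth table, rows = input minterms, cols = outputs
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

def rank1_approx(M):
    n, m = len(M), len(M[0])
    best, best_err = None, n * m + 1
    for u in product([0, 1], repeat=n):
        for v in product([0, 1], repeat=m):
            err = sum(abs(M[i][j] - (u[i] & v[j]))
                      for i in range(n) for j in range(m))
            if err < best_err:
                best, best_err = (u, v), err
    return best, best_err

(u, v), err = rank1_approx(M)   # err = number of minterms that flip
```

Higher ranks and different algebras (the semi-ring vs field choice the abstract mentions) change how the partial products are combined, trading accuracy against area in exactly this fashion.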
10:00 End of session
Coffee Break in Exhibition Area


