IP1_4 Interactive Presentations
Date: Tuesday, 02 February 2021
Time: 09:50 - 10:20 CET
Virtual Conference Room: https://virtual21.date-conference.com/meetings/virtual/drZX49ChBZw5F3Et8
Interactive Presentations run simultaneously during a 30-minute slot. Additionally, each IP paper is briefly introduced in a one-minute presentation in a corresponding regular session.
|IP1_4.1||OPPORTUNISTIC IP BIRTHMARKING USING SIDE EFFECTS OF CODE TRANSFORMATIONS ON HIGH-LEVEL SYNTHESIS
Christian Pilato, Politecnico di Milano, IT
Hannah Badier1, Christian Pilato2, Jean-Christophe Le Lann3, Philippe Coussy4 and Guy Gogniat5
1ENSTA Bretagne, FR; 2Politecnico di Milano, IT; 3ENSTA-Bretagne, FR; 4Université de Bretagne-Sud / Lab-STICC, FR; 5Université Bretagne Sud, FR
Rising design and manufacturing costs are driving the globalization of the semiconductor supply chain. However, a malicious attacker can resell a stolen Intellectual Property (IP) core, which calls for methods to establish a relationship between a given IP and a potentially fraudulent copy. We propose a method to protect IP cores created with high-level synthesis (HLS): our method inserts a discrete birthmark in the HLS-generated designs that uses only intrinsic characteristics of the final RTL. The core of our process leverages the side effects that specific source-code manipulations have on HLS, although the method is HLS-tool agnostic. We propose two independent validation metrics and show that our solution introduces minimal resource and delay overheads (<6% and <2%, respectively) and detects illegal copies with an accuracy above 96%.
|IP1_4.2||EFFICIENT TENSOR CORES SUPPORT IN TVM FOR LOW-LATENCY DEEP LEARNING
Wei Sun, Eindhoven University of Technology, NL
Wei Sun1, Savvas Sioutas1, Sander Stuijk1, Andrew Nelson2 and Henk Corporaal3
1Eindhoven University of Technology, NL; 2TU Eindhoven, NL; 3TU/e (Eindhoven University of Technology), NL
Deep learning algorithms are gaining popularity in autonomous systems. These systems typically have stringent latency constraints that are challenging to meet given the high computational demands of these algorithms. Nvidia introduced Tensor Cores (TCs) to speed up some of the most commonly used operations in deep learning algorithms. Compilers (e.g., TVM) and libraries (e.g., cuDNN) focus on the efficient usage of TCs when performing batch processing. Latency-sensitive applications, however, cannot exploit large-batch processing. This paper presents an extension to the TVM compiler that generates low-latency TC implementations, particularly for batch size 1. Experimental results show that our solution reduces latency by 14% on average compared to the cuDNN library on a desktop RTX2070 GPU, and by 49% on an embedded Jetson Xavier GPU.