FM01.1 PhD Forum

Date: Monday, 01 February 2021
Time: 17:00 - 19:00 CET
Virtual Conference Room: https://virtual21.date-conference.com/meetings/virtual/FtLuDBwq5KDuvpHzd

Session Chair:
Robert Wille, Johannes Kepler University Linz, AT

All registered conference delegates and exhibition visitors are kindly invited to join the DATE 2021 PhD Forum, which will take place on Monday from 17:00 - 19:00 at the DATE 2021 venue.

The PhD Forum of the DATE Conference is a poster session hosted by the European Design Automation Association (EDAA), the ACM Special Interest Group on Design Automation (SIGDA), and the IEEE Council on Electronic Design Automation (CEDA). The purpose of the PhD Forum is to give PhD students a venue to discuss their thesis and research work with members of the design automation and system design community. It is a good opportunity for students to gain exposure on the job market and to receive valuable feedback on their work.

To this end, the forum takes place in two parts:

  • First, everybody is invited to an opening session of the PhD Forum, where all presenters will introduce their work in a 1-minute pitch.
  • After that (at approx. 17:30), all presenters will present their work in a 1.5-hour "poster" presentation in separate rooms. Within this timeframe, everyone can enter and leave the respective rooms and engage in the corresponding discussions.

Furthermore, for each presentation, a poster (in PDF) summarizing the presentation will be provided.

Time Label Presentation Title
Authors
17:00 CET OPENING OF THE PHD FORUM
Speaker:
Robert Wille, Johannes Kepler University Linz, AT
17:00 CET FM01.1.1 EXPLOITING ERROR RESILIENCE OF ITERATIVE AND ACCUMULATION BASED ALGORITHMS FOR HARDWARE EFFICIENCY
Speaker and Author:
Dr. G.A. Gillani, University of Twente, NL
Abstract
While the efficiency gains due to process technology improvements are reaching the fundamental limits of computing, emerging paradigms like approximate computing provide promising efficiency gains for error-resilient applications. However, state-of-the-art approximate computing methodologies do not sufficiently address accelerator designs for iterative and accumulation-based algorithms. Given the wide range of such algorithms in digital signal processing, this thesis investigates systematic approximation methodologies to design high-efficiency accelerator architectures for iterative and accumulation-based algorithms. As a case study of such algorithms, we have applied our proposed approximate computing methodologies to a radio astronomy calibration application.

17:00 CET FM01.1.2 IMPROVING ENERGY EFFICIENCY OF NEURAL NETWORKS
Speaker and Author:
Seongsik Park, Seoul National University, KR
Abstract
Deep learning with neural networks has shown remarkable performance in many applications. However, this success is based on a tremendous amount of energy consumption, which has become one of the major obstacles to deploying deep learning models on mobile devices. To address this issue, many researchers have studied various methods for improving the energy efficiency of neural networks to expand the applicability of deep learning. This dissertation is in line with those studies and covers three main approaches: quantization, an energy-efficient accelerator, and a neuromorphic approach.

17:00 CET FM01.1.3 DESIGN, IMPLEMENTATION AND ANALYSIS OF EFFICIENT HARDWARE-BASED SECURITY PRIMITIVES
Speaker and Author:
Nalla Anandakumar Nachimuthu, University of Florida, US
Abstract
Hardware-based security primitives play important roles in protecting and securing a system in Internet of Things (IoT) applications. The main primitives studied in this work are physical unclonable functions (PUFs) and true random number generators (TRNGs). Efficient FPGA implementations are proposed along with relevant security analysis using prevalent metrics. Finally, an application of the designed TRNG and PUF is proposed for implementing an authenticated key agreement protocol.

17:00 CET FM01.1.4 FORMAL ABSTRACTION AND VERIFICATION OF ANALOG CIRCUITS
Speaker and Author:
Ahmad Tarraf, research assistant uni frankfurt, DE
Abstract
This recently submitted dissertation examines the formal abstraction and verification of analog circuits. It aims to contribute to the formal verification of AMS circuits by generating accurate behavioral models that can be used for verification. As accurate behavioral models are often handwritten, this dissertation proposes an automatic abstraction method based on sampling a SPICE netlist at transistor level with full SPICE BSIM accuracy. The approach generates a hybrid automaton (HA) that exhibits a linear behavior described by a state space representation in each of its locations, thereby modeling the nonlinear behavior of the netlist via multiple locations. Hence, due to the linearity of the obtained model, the approach is easily scalable. The HAs can be deployed in various output languages: MATLAB, Verilog-A, and SystemC-AMS. Various extensions to the models exist that enhance the behavior they exhibit.

17:00 CET FM01.1.5 OPTIMIZATION TOOLS FOR CONVNETS ON THE EDGE
Speaker:
Valentino Peluso, Politecnico di Torino, IT
Authors:
Valentino Peluso, Enrico Macii and Andrea Calimera, Politecnico di Torino, IT
Abstract
The shift of Convolutional Neural Networks (ConvNets) into low-power devices with limited compute and memory resources calls for cross-layer strategies spanning from hardware to software optimization. This work answers this need by presenting a collection of tools for the efficient deployment of ConvNets on the edge.

17:00 CET FM01.1.6 DESIGN SPACE EXPLORATION IN HIGH LEVEL SYNTHESIS
Speaker and Author:
Lorenzo Ferretti, Università della Svizzera italiana, CH
Abstract
High Level Synthesis (HLS) is a process which, starting from a high-level description of an application (C/C++), generates the corresponding RTL code describing the hardware implementation of the desired functionality. The HLS process is usually controlled by user-given directives (e.g., directives to set whether or not to unroll a loop) which influence the resulting implementation area and latency. By using HLS, designers are able to rapidly generate different hardware implementations of the same application, without the burden of directly specifying the low-level implementation in detail. Nonetheless, the correlation between directives and resulting performance is often difficult to foresee and to quantify, and the high number of available directives leads to an exponential explosion in the number of possible configurations. In addition, sampling the design space involves a time-consuming hardware synthesis, making a brute-force exploration infeasible beyond very simple cases. However, for a given application, only a few directive settings result in Pareto-optimal solutions (with respect to metrics such as area, run-time and power), while most are dominated. The design space exploration problem aims at identifying close to Pareto-optimal implementations while synthesising only a small portion of the possible configurations from the design space. In my Ph.D. dissertation I present an overview of the HLS design flow, followed by a discussion of existing strategies in the literature. Moreover, I present new exploration methodologies able to automatically generate optimised implementations of hardware accelerators. The proposed approaches are able to retrieve a close approximation of the real Pareto solutions while synthesising only a small fraction of the possible designs, either by smartly navigating their design space or by leveraging prior knowledge.
I also present a database of design space explorations whose goal is to push the research boundaries by offering researchers a tool for the standardisation of exploration evaluation, and a reliable source of knowledge for machine-learning-based approaches. Lastly, the stepping stones of a new approach relying on deep learning strategies with graph neural networks are presented.
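
To make the notions of "dominated" and "Pareto-optimal" in the abstract above concrete, here is a minimal Python sketch. It is an illustration only and not the exploration method of the dissertation: the function names `dominates` and `pareto_front`, and the (area, latency) values, are hypothetical, and every design is assumed to have already been synthesised, which is exactly the cost that a real exploration strategy tries to avoid.

```python
from typing import List, Tuple

# Hypothetical design point: (area, latency); both metrics are minimised.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if design a is no worse than b in every metric
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs)]


if __name__ == "__main__":
    # Made-up (area, latency) pairs standing in for synthesis results.
    synthesised = [(10, 5), (8, 7), (12, 4), (9, 9), (10, 6)]
    print(pareto_front(synthesised))
```

In this toy data set, (9, 9) is dominated by (8, 7) and (10, 6) is dominated by (10, 5), so only three of the five points remain on the front. The filter is quadratic in the number of designs, which is acceptable for a sketch but says nothing about how the dissertation selects which configurations to synthesise.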

17:00 CET FM01.1.7 RELIABILITY IMPROVEMENT OF STT-MRAM CACHE MEMORIES IN DATA STORAGE SYSTEMS
Speaker:
Elham Cheshmikhani, Sharif University of Technology, IR
Authors:
Elham Cheshmikhani¹, Hamed Farbeh² and Hossein Asadi¹
¹Sharif University of Technology, IR; ²Amirkabir University of Technology, IR
Abstract
Spin-Transfer Torque Magnetic RAM (STT-MRAM) is known as the most promising replacement for SRAM technology in cache memories. Despite its major advantages of high density, non-volatility, near-zero leakage power, and immunity to radiation-induced particle strikes, STT-MRAM-based cache memory suffers from high error rates, mainly due to retention failure, read disturbance, and write failure. These errors are the major reliability challenge in STT-MRAM caches. Existing studies are limited to estimating the rate of only one or two of these error types for STT-MRAM caches. However, the overall vulnerability of STT-MRAM caches, whose estimation is a must for designing cost-efficient reliable caches, has not been offered in any previous study. Meanwhile, all of the existing reliability improvement schemes for STT-MRAM caches are limited to overcoming one or two error types, and the majority of them have an adverse effect on other error types. In this dissertation, we first propose a system-level framework for reliability exploration and characterization of error behavior in STT-MRAM caches. To this end, we formulate the cache vulnerability considering the inter-correlation of the error types, including retention failure, read disturbance, and write failure, as well as the dependency of error rates on workload behavior and Process Variations (PVs). Then, we investigate the effect of temperature on the STT-MRAM cache error rate and demonstrate that heat accumulation increases the error rate by 110.9 percent. We also illustrate that this heat accumulation is mainly due to the locality of committed write operations in the cache.
In addition, we demonstrate that a) extra read accesses to data and tag arrays, which are imposed to enhance the cache access time, significantly increase the read disturbance error rate; and b) the diversity in the number of '1's and switching in codewords of a data block significantly degrades the protection capability of error correcting codes. We also propose a new cache architecture, so-called Reliability-Optimized STT-MRAM Memory (ROSTAM), to customize different parts of the cache structure for reliability enhancement. ROSTAM consists of four components: 1) a simple yet effective replacement policy, called TA-LRW, to prevent heat accumulation in the cache and reduce the rate of all three error types, 2) a novel tag array structure, so-called 3RSeT, to reduce the error rate by eliminating a significant portion of tag reads, 3) an effective scheme, so-called REAP-Cache, to prevent the accumulation of read disturbance in cache blocks and completely eliminate the adverse effect of concealed reads on cache reliability, and 4) a new ECC configuration, so-called ROBIN, to uniformly distribute the transitions between the codewords and maximize the ECC correction capability. We compare the proposed architecture with an 8-way L2 cache protected by SEC-DED(72,64) and using the LRU policy. The experimental results using the gem5 full-system simulator and a comprehensive set of multi-programmed workloads from the SPEC CPU2006 benchmark suite on a quad-core processor show that: 1) the rate of read disturbance errors is reduced by 4966.1x, which is achieved by integrating TA-LRW, 3RSeT, ROBIN, and REAP-Cache, 2) write failure is reduced by 3.7x, which is the effect of TA-LRW and ROBIN, 3) the retention failure rate is reduced by 8.1x because of TA-LRW and REAP-Cache operations, and 4) the total error rate considering all error types is reduced by 10x.
This significant reliability enhancement is achieved at the cost of less than 2.7% increase in energy consumption, less than 1% area overhead, and an average of 2.3% performance degradation.

17:00 CET FM01.1.8 ENABLING LOGIC-MEMORY SYNERGY USING INTEGRATED NON-VOLATILE TRANSISTOR TECHNOLOGIES FOR ENERGY-EFFICIENT COMPUTING
Speaker and Author:
Sandeep Krishna Thirumala, Purdue University, US
Abstract
Over the last decade, there has been immense interest in emerging memory technologies which possess distinct advantages over traditional silicon-based memories. In the era of big data, a key challenge is to achieve close integration of logic and memory sub-systems, to overcome the von Neumann bottleneck associated with long-distance data transmission between logic and memory. Moreover, brain-inspired deep neural networks, which have transformed the field of machine learning in recent years, are not widely deployable in edge devices, mainly due to the aforementioned bottleneck. Therefore, there is a need to explore solutions with tight logic-memory integration, in order to enable efficient computation for current and future generations of systems. Motivated by this, in this thesis, we harness the benefits offered by emerging technologies and propose devices, circuits, and systems which exhibit an amalgamation of logic and memory functionality. We propose two variants of memory devices: (a) Reconfigurable Ferroelectric transistors and (b) Valley-Coupled-Spin Hall effect-based magnetic random access memory, which exhibit unique logic-memory unification. Exploiting the intriguing features of the proposed devices, we carry out a cross-layer exploration from devices to circuits to systems for energy-efficient computing. We investigate a wide spectrum of applications for the proposed devices, including embedded memories, non-volatile logic, compute-in-memory fabrics and artificial intelligence systems. Overall, evaluation results of the device-circuit-system techniques proposed in this thesis show a significant reduction in energy consumption along with performance improvement of various systems when compared to conventional von Neumann-based approaches for several application workloads, addressing the critical need for logic-memory synergy in current and next-generation computing.

17:00 CET FM01.1.9 HARDWARE SECURITY IN DRAMS AND PROCESSOR CACHES
Speaker and Author:
Wenjie Xiong, Facebook AI Research, US
Abstract
The cost reduction and performance improvement of silicon chips have made computing devices ubiquitous, from IoT to cloud servers. These devices have been deployed to collect and process an unprecedented amount of data around us. Also, to make full use of resources, often the system is shared among different applications. This raises a lot of security and privacy concerns. Meanwhile, memory and processor caches are essential components of modern computers, but they have been mainly designed for their functionality and performance, not for security. There are potential positive uses of hardware components that can improve security, but also, there are security attacks that make use of the vulnerabilities in hardware. This dissertation consequently studies both the positive and negative security aspects of Dynamic Random Access Memories (DRAMs) and caches on commercial devices. The proposed DRAM Physically Unclonable Functions (PUFs) can be deployed today for higher security, especially in low-end IoT devices and embedded systems currently utilized in health care, home automation, transportation, or energy grids, which lack other security mechanisms. The discovered cache LRU covert-channel attacks and DRAM temperature spying attacks show new types of vulnerabilities in today's systems, motivating new designs to protect applications in a shared system and to prevent malicious use of the physical features of the hardware.

17:00 CET FM01.1.11 LESS IS MORE: EFFICIENT HARDWARE DESIGN THROUGH APPROXIMATE LOGIC SYNTHESIS
Speaker and Author:
Ilaria Scarabottolo, USI Lugano, CH
Abstract
As energy efficiency becomes a crucial concern in almost every kind of digital application, Approximate Computing gains popularity as a potential answer to this ever-growing energy quest. Approximate Computing is a design paradigm particularly suited for error-resilient applications, where small losses in accuracy do not represent a significant reduction in the quality of the result. In these scenarios, energy consumption and resource usage (such as electric power or circuit area) can be significantly improved at the expense of a slight reduction in output accuracy. While Approximate Computing can be applied at different levels, my research focuses on the design of approximate hardware. In particular, my work explores Approximate Logic Synthesis (ALS), where the hardware functionality is automatically tuned to obtain more efficient counterparts, while always controlling the entailed error. Functional modifications include, among others, removal or substitution of gates and signals. A fundamental prerequisite for the application of these modifications is an accurate error model of the circuit under examination. My Ph.D. research work has concentrated on the derivation of accurate error models of a circuit. These can, in turn, guide Approximate Logic Synthesis algorithms to optimal solutions and avoid expensive, time-consuming simulations. A precise error model makes it possible to fully explore the design space and, potentially, adjust the desired level of accuracy even at runtime. I have also contributed to the state of the art in ALS techniques by devising a circuit pruning algorithm that produces efficient approximate circuits for given error constraints. The innovative aspect of my work is that it exploits circuit topology and graph partitioning to identify circuit portions that have a smaller impact on the final output. With this information, ALS algorithms can improve their efficiency by acting first on those less influential portions.
Indeed, this error characterisation proves to be very effective in guiding and modeling approximate synthesis.

17:00 CET FM01.1.12 LONGLIVENOC: WEAR LEVELLING, WRITE REDUCTION AND SELECTIVE VC ALLOCATION FOR LONG LASTING DARK SILICON AWARE NOC INTERCONNECTS
Speaker and Author:
Khushboo Rani, IIT Guwahati, IN
Abstract
With the continuing advancement in semiconductor technologies, more and more cores are integrated on the same die, which leads to the concept of the Chip Multi-processor. The communication across these multiple cores is facilitated by the switch-based Network-on-Chip (NoC) for efficient and bursty on-chip communication. The power and performance of these interconnects are a significant factor, as the communication network consumes a considerable share of the power budget. In particular, the buffers used at every port of the NoC router consume considerable dynamic as well as static power. It has been noticed that communication consumes almost 36% of the total chip power. With tighter power budgets and to meet the thermal design power (TDP) for the system, components like the cores/caches undergo voltage and frequency scaling and, at times, are powered off. Powering off several components to stay within the TDP leads to the concept of dark silicon. In dark silicon, although the cores/caches are off, the communication network is expected to be available. In order to reduce the standby power of the network in such events, one looks for avenues in non-volatile memory (NVM) technologies. NVM technologies, such as spin-transfer torque random access memory (STT-RAM), offer many advantages over conventional SRAM technology. These advantages include high density, good scalability, and low leakage power consumption. However, buffers made from these memory technologies suffer from costly write operations and low write endurance. Thus, in my PhD research, I proposed wear-levelling and write reduction techniques to enhance the lifetime and reduce the effect of the costly write operations of NVM buffers in the dark silicon scenario. We evaluate our proposed approaches on the multi-core full-system simulator gem5, with Garnet2.0 as the interconnection network model for NoC performance. We evaluate our work with the PARSEC and SPEC benchmark suites.

17:00 CET FM01.1.13 ENERGY EFFICIENT AND RUNTIME BASED APPROXIMATE COMPUTING TECHNIQUES FOR IMAGE COMPRESSION APPLICATION: AN INTEGRATED APPROACH COVERING CIRCUIT TO ALGORITHMIC LEVEL
Speaker:
Junqi Huang, University of Nottingham Malaysia, MY
Authors:
Junqi Huang¹, Nandha kumar Thulasiraman² and Haider Abbas Almurib¹
¹University of Nottingham Malaysia, MY; ²University of Nottingham, MY
Abstract
Approximate computing has been widely used in error-resilient design to improve energy performance by reducing circuit complexity and allowing circuits to produce results with acceptable error (approximation). Generally, approximate computing techniques have been developed and implemented at either the algorithmic level, the logic level, or the circuit level, with no feasibility of changing the approximation on the fly, i.e., at runtime. In contrast to existing methods, this thesis presents a novel energy-efficient integrated approach that implements approximate computing techniques from the circuit level to the algorithmic level and supports changing the approximation of a given circuit at runtime without incurring any extra hardware requirement. The two new techniques are known as the Frequency upscaling (FUS) technique and the Voltage over scaling (VOS) technique. These two techniques, developed for the logic/circuit level of abstraction, are integrated into a newly proposed algorithmic-level approximate computing technique known as zigzag low-complexity approximate DCT (ZLCADCT). The result is an integrated approach to implementing runtime-based approximate computing from the circuit level of abstraction to the algorithmic level of abstraction for image compression applications.

17:00 CET FM01.1.14 THESIS: PERFORMANCE AND PHYSICAL ATTACK SECURITY OF LATTICE-BASED CRYPTOGRAPHY
Speaker and Author:
Felipe Valencia, Università della Svizzera Italiana, CH
Abstract
This thesis addresses two problems that limit the widespread adoption of lattice-based cryptography (LBC): 1) the physical security of real-world implementations, where this thesis focuses on fault attacks, and 2) the not always satisfactory performance of LBC, where it focuses on accelerators and instruction set extensions.

17:00 CET FM01.1.15 AMOEBA-INSPIRED SYSTEM CONTROLLER ON IOT EDGE
Speaker:
Anh Nguyen, Tokyo Institute of Technology, JP
Authors:
Anh Nguyen and Yuko Hara-Azumi, Tokyo Institute of Technology, JP
Abstract
This work aims at developing a lightweight yet efficient controller for IoT systems on edge devices. The controller is based on a recently emerging computing model inspired by an amoeba to solve Satisfiability (SAT) problems, which can represent various IoT applications. Recognizing the massive parallelism of this amoeba-inspired SAT solver, AmoebaSAT, we implemented it in FPGA-based hardware through a hardware/software co-design approach. By extending the original algorithm to help the solver escape local minima more quickly and by utilizing the community structure of different IoT applications, we developed a highly efficient IoT controller which captures the characteristics of different application domains and outperforms state-of-the-art solutions.

17:00 CET FM01.1.16 MONITORING AND CONTROLLING INTERCONNECT CONTENTION IN CRITICAL REAL-TIME SYSTEMS
Speaker:
Jordi Cardona, Barcelona Supercomputing Center and Universitat Politecnica de Catalunya, ES
Authors:
Jordi Cardona¹, Carles Hernandez², Enrico Mezzetti³, Jaume Abella⁴ and Francisco J Cazorla⁵
¹Barcelona Supercomputing Center and Universitat Politecnica de Catalunya, ES; ²Universitat Politècnica de València, ES; ³Barcelona Supercomputing Center (BSC), ES; ⁴Barcelona Supercomputing Center (BSC-CNS), ES; ⁵Barcelona Supercomputing Center, ES
Abstract
Computing performance needs in critical real-time systems (CRTS) domains such as automotive, avionics, railway, and space are on the rise. This is fueled by the trend towards implementing an increasing number of product functionalities in software that ends up managing huge amounts of data and implementing complex artificial-intelligence functionalities such as Advanced Driver Assistance Systems. Manycores are able to satisfy, in a cost-efficient manner, the computing needs of the embedded real-time industry. In this line, building as much as possible on manycore solutions deployed in the high-performance (mainstream) market contributes to further reducing costs and increasing availability. However, commercial off-the-shelf (COTS) manycores bring several challenges for their adoption in the critical embedded market. One of those is deriving timing bounds on tasks' execution times as part of the overall timing validation and verification processes. In particular, the network-on-chip (NoC) has been shown to be the main resource in which contention arises, and it hence hampers deriving tight bounds on the timing of tasks. In this extended abstract, we show our proposed hardware/software solutions to reduce the worst-case execution time (WCET) of applications by optimizing the NoC setup parameters, as well as the techniques we developed to measure and control contention (first in centralized NoCs and later in distributed NoC systems).

17:00 CET FM01.1.17 RELIABILITY CONSIDERATIONS IN THE USE OF HIGH-PERFORMANCE PROCESSORS IN SAFETY-CRITICAL SYSTEMS
Speaker:
Sergi Alcaide, Universitat Politècnica de Catalunya - Barcelona Supercomputing Center (BSC), ES
Authors:
Sergi Alcaide¹, Leonidas Kosmidis², Carles Hernandez³ and Jaume Abella⁴
¹Universitat Politècnica de Catalunya - Barcelona Supercomputing Center (BSC), ES; ²Barcelona Supercomputing Center (BSC), ES; ³Universitat Politècnica de València, ES; ⁴Barcelona Supercomputing Center (BSC-CNS), ES
Abstract
High-Performance Computing (HPC) platforms are a must in Autonomous Driving (AD) systems due to the tremendous jump in performance required. However, since HPC components are not designed following the development process used in the automotive domain, some safety requirements are not met by default on those platforms. The automotive functional safety standard, ISO 26262, stipulates that automotive platforms must avoid Common Cause Failures (CCFs), i.e. any single fault that can cause a failure despite safety measures in place. CCFs can be avoided by enforcing diverse redundancy (e.g. lockstep execution), so that a single fault affecting redundant elements (e.g. a voltage droop) does not produce the same error in those redundant elements. This thesis presents software and hardware techniques to achieve a diverse redundant execution in multiple HPC components to enable their usage in the automotive domain.

17:00 CET FM01.1.18 HARDWARE SECURITY EVALUATION OF IOT EMBEDDED APPLICATIONS
Speaker and Author:
Zahra Kazemi, PhD Candidate, FR
Abstract
In recent years, the broad adoption and accessibility of the Internet of Things (IoT) have created major concerns for manufacturers and enterprises in the hardware security domain. The importance of the software developers' role in evaluating a system's security has risen along with the demand for shortening time to market and reducing development cost. However, embedded software developers often lack the knowledge to consider hardware-based threats and their effects on important assets. To overcome such challenges, it is essential for security specialists to provide embedded developers with the necessary practical tools and evaluation methods against hardware-based attacks. In this thesis work, we develop an evaluation methodology and an easy-to-use hardware security assessment framework against major physical attacks (e.g., side-channel and fault injection attacks). It can assist software developers in detecting their system's vulnerabilities and protecting important assets. This work can also guide the implementation of software-level countermeasures, which can reduce the risks of physical attacks to an acceptable level. As a case study, we apply our approach to an IoT medical application named "SecPump" that models an infusion pump in hospitals. This study mimics a real experimental evaluation process and highlights the potential risks of ignoring physical attacks.

17:00 CET FM01.1.19 A COMPUTER-AIDED DESIGN SPACE EXPLORATION FOR DEPENDABLE CIRCUITS
Speaker and Author:
Stefan Scharoba, Brandenburg University of Technology, DE
Abstract
This thesis presents an automated toolset for exploring design choices which provide fault tolerance by means of hardware redundancy. Based on a given VHDL model, various fault tolerant implementations can be automatically created and evaluated regarding their overhead and reliability improvement.

17:00 CET FM01.1.20 ROBUST AND ENERGY-EFFICIENT DEEP LEARNING SYSTEMS
Speaker and Author:
Muhammad Abdullah Hanif, Institute of Computer Engineering, Vienna University of Technology, AT
Abstract
Deep Learning (DL) has evolved to become the state-of-the-art machine learning algorithm for many AI applications such as image classification, object detection, object segmentation, voice recognition, and language translation. Due to the state-of-the-art accuracy of the models generated through DL, i.e., Deep Neural Networks (DNNs), they are also being adopted for safety-critical applications, e.g., autonomous driving, healthcare, and security & surveillance. Besides energy efficiency, for safety-critical applications, reliability against technology-induced faults (e.g., soft errors, device aging, and manufacturing defects) is one of the foremost concerns, as even a single neglected fault at a critical location can result in a significant drop in the application-level accuracy. This Ph.D. work aims at studying and exploiting the unique error-resilience characteristics of DNNs to improve their robustness against the technology-induced reliability threats at low overhead cost. This work also improves the power/performance/energy-efficiency of the systems through judicious approximations (i.e., carefully crafted designer-induced errors in less-sensitive neurons) that can be tolerated due to error-resilience characteristics of DNNs and can be leveraged to compensate for the overheads of reliability features, or alternatively, be spent for enhancing reliability levels.

17:00 CET FM01.1.21 AUTOMATED DESIGN OF APPROXIMATE ACCELERATORS
Speaker and Author:
Jorge Castro-Godínez, Karlsruhe Institute of Technology (KIT), DE
Abstract
Approximate computing has emerged as a design paradigm suitable for applications with inherent error resilience. This paradigm aims to reduce the computing costs of exact calculations by lowering the accuracy of their results. In the last decade, many approximate circuits, particularly approximate adders and multipliers, have been reported in the literature. Given the growing number of such approximate circuits, selecting those that minimize the resources required for designing and generating an approximate accelerator from a high-level specification, while satisfying a previously defined accuracy constraint, is a joint design space exploration and high-level synthesis challenge. This dissertation proposes automated methods for designing and implementing approximate accelerators built with approximate arithmetic circuits.
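
As an illustration of the kind of approximate adder mentioned above, the following Python sketch models a lower-part OR adder, a design commonly described in the approximate arithmetic literature. It is not taken from the dissertation: the function name `loa_add` and the parameter `k` (number of approximated low-order bits) are assumptions made for this example, and inputs are assumed to be non-negative integers.

```python
def loa_add(a: int, b: int, k: int = 4) -> int:
    """Behavioural model of a lower-part OR adder (illustrative sketch).

    The k low-order bits are approximated with a bitwise OR instead of
    a carry chain. The remaining high-order bits are added exactly, with
    a carry-in taken as the AND of the top bits of the two low parts.
    """
    if k == 0:
        return a + b  # nothing is approximated
    low_mask = (1 << k) - 1
    low = (a | b) & low_mask                        # approximate lower part
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1  # AND of the low parts' top bits
    high = (a >> k) + (b >> k) + carry_in           # exact upper part
    return (high << k) | low


if __name__ == "__main__":
    a, b = 53, 27
    approx = loa_add(a, b, k=4)
    print(approx, a + b, abs(approx - (a + b)))
```

For these inputs the exact sum is 80, while the model with `k=4` returns 79, an absolute error of 1. Raising `k` removes more of the carry chain (cheaper hardware) at the price of larger possible errors, which is the accuracy versus resource trade-off the abstract refers to.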

17:00 CET FM01.1.22 NEXT GENERATION DESIGN FOR TESTABILITY, DEBUG AND RELIABILITY USING FORMAL TECHNIQUES
Speaker and Author:
Sebastian Huhn, University of Bremen, DE
Abstract
Several improvements in the Electronic Design Automation (EDA) flow have enabled the design of highly complex Integrated Circuits (ICs). This complexity has been introduced to address challenging application scenarios, for instance, in automotive systems, which typically require several heterogeneous functions to be implemented jointly on a single chip. On the one hand, the complexity scales with the transistor count and, on the other hand, further non-functional aspects have to be considered, which leads to new demanding tasks during state-of-the-art IC design and test. Thus, new measures are required to achieve the required level of testability, debug and reliability of the resulting circuit. This thesis proposes several novel approaches to pave the way for the next generation of ICs, which can be successfully and reliably integrated even in safety-critical applications. In particular, this thesis combines formal techniques, such as the Satisfiability (SAT) problem and Bounded Model Checking (BMC), to address the arising challenges concerning the increase in Test Data Volume (TDV) as well as Test Application Time (TAT) and the required reliability. One contribution concerns the development of Test Vector Transmitting using enhanced compression-based TAP controllers (VecTHOR). VecTHOR proposes a newly designed compression architecture, which combines a codeword-based compression, a dynamically configurable dictionary and a run-length encoding scheme. VecTHOR is lightweight and is seamlessly integrated within an IEEE 1149.1 Test Access Port (TAP) controller. VecTHOR reduces the TDV and the TAT by 50%, which directly reduces the resulting test costs. Another contribution concerns the design and implementation of a retargeting framework that processes existing test data off-chip once, prior to the transfer, without the need for an expensive test regeneration.
Different techniques have been implemented to provide selectable trade-offs between the resulting TDV and TAT on the one hand and the required run-time of the retargeting process on the other. These techniques include a fast heuristic approach and a formal SAT-based optimization method that invokes multiple objective functions. Besides this, one contribution concerns the development of a hybrid embedded compression architecture, which is specifically designed for Low-Pin Count Test (LPCT) in the field of safety-critical systems enforcing a zero-defect policy. This hybrid compression has been realized in close industrial cooperation with Infineon Germany. This approach reduces the resulting test time by a factor of approximately three. A further contribution is a new methodology that significantly enhances the robustness of sequential circuits against transient faults while neither introducing a large hardware overhead nor measurably impacting the latency of the circuit. To achieve this, application-specific knowledge is derived by applying SAT-based techniques as well as BMC, which yields the synthesis of a highly efficient fault detection mechanism. The proposed techniques are presented in detail and evaluated extensively on industry-representative candidates, which clearly demonstrates the efficacy of the proposed approaches.

17:00 CET FM01.1.23 DESIGN AUTOMATION FOR FIELD-COUPLED NANOTECHNOLOGIES
Speaker:
Marcel Walter, University of Bremen, DE
Authors:
Marcel Walter1 and Rolf Drechsler2
1University of Bremen, DE; 2University of Bremen/DFKI, DE
Abstract
Circuits based on complementary metal-oxide-semiconductor (CMOS) technology enabled the digital revolution and still provide the basis for almost all computational devices to date. Nevertheless, the class of Field-coupled Nanocomputing (FCN) technologies is a promising candidate to outperform CMOS circuitry in various metrics. Not only does FCN process binary information inherently, but it also allows for low-power in-memory computing with an energy dissipation that is orders of magnitude below that of CMOS. However, physical design for FCN technologies is still in its infancy. In this Student Research Forum Proposal, a complete flow for the physical design of FCN circuitry is presented. This includes exact and heuristic techniques for placement, routing, clocking, and timing, as well as formal verification and debugging. All proposed algorithms have been made publicly available in a holistic framework called fiction.

17:00 CET FM01.1.24 HARDWARE AND SOFTWARE TECHNIQUES FOR SECURING INTELLIGENT CYBER-PHYSICAL SYSTEMS
Speaker and Author:
Faiq Khalid, TU Wien, AT
Abstract
This Ph.D. work aims to design intelligent cyber-physical systems (CPS) that are robust against hardware-level security attacks (e.g., hardware Trojans, communication network attacks for VANET) and software-level security attacks (e.g., adversarial attacks on ML-based components in CPS). Towards this goal, this work studies and analyzes the security vulnerabilities at the hardware and software levels to identify the potentially vulnerable components, SoCs, or systems in CPS. Based on these analyses, this work improves the security of the CPS by deploying efficient and low-overhead solutions. These solutions can either identify potential attacks at run-time or provide an efficient defense against these attacks.
