IP1 Interactive Presentations

Label	Presentation Title Authors
IP1-1	A SCALABLE LANE DETECTION ALGORITHM ON COTSS WITH OPENCL Speaker: Kai Huang, Sun Yat-Sen University, CN Authors: Kai Huang¹, Biao Hu², Jan Botsch³, Nikhil Madduri³ and Alois Knoll³ ¹Sun Yat-Sen University, CN; ²Technische Universität München (TUM), DE; ³Technische Universität München (TUM), DE Abstract Road lane detection is a classical requirement for advanced driver assistance systems. With new computer technologies, lane detection algorithms can be exploited on COTS platforms. This paper investigates the use of OpenCL and develops a particle-filter-based lane detection algorithm that can tune the trade-off between detection accuracy and speed. Our algorithm is tested on 14 video streams from different datasets with different scenarios on different COTS hardware. With an average deviation of less than 5 pixels, the average frame rates for the 14 videos can reach about 400 fps on both GPU and FPGA. The peak frame rates for certain videos on GPU can reach almost 1000 fps. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-2	SIMULATION OF FALLING RAIN FOR ROBUSTNESS TESTING OF VIDEO-BASED SURROUND SENSING SYSTEMS Speaker: Dennis Hospach, Universität Tübingen, DE Authors: Dennis Hospach¹, Stefan Mueller¹, Wolfgang Rosenstiel¹ and Oliver Bringmann² ¹Universität Tübingen, DE; ²Universität Tübingen / FZI, DE Abstract Recently, optical sensors have become a standard item in modern cars, raising questions about the testing needed under various ambient effects. To achieve high test coverage of vision-based surround sensing systems, many different environmental conditions need to be tested. Unfortunately, it is far too time-consuming to build test sets of all relevant environmental conditions by recording real video data. This paper presents a novel approach for ambient-aware virtual prototyping and robustness testing. We propose a method to significantly reduce the on-road captures needed for the design and validation of vision-based Advanced Driver Assistance Systems (ADAS) and fully automated driving. Our approach facilitates the generation of comparable test sets by using greatly reduced amounts of real on-road captures and applying computer-generated variations of falling rain to them in a comprehensive virtual prototyping environment. In combination with the simulation of camera properties, which influence the visual effects of falling rain to a great extent, we are able to generate different rain scenarios under a wide variety of parameters. Our approach has been applied to an automotive lane detection system using a series of rain scenarios. We have explored how falling rain can influence such a system and how such behavior can be detected using simulated rain scenarios. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-3	PROPOSAL FOR FAST DIRECTIONAL ENERGY INTERCHANGE USED IN MCMC-BASED AUTONOMOUS DECENTRALIZED MECHANISM TOWARD RESILIENT MICROGRID Speaker: Yusuke Sakumoto, Tokyo Metropolitan University, JP Authors: Yusuke Sakumoto¹ and Ittetsu Taniguchi² ¹Tokyo Metropolitan University, JP; ²Ritsumeikan University, JP Abstract The microgrid is well known as a key technology for making renewable energy easier to use. Some previous works focused on a microgrid that is divided into autonomous electricity subsystems (AESs) for reliability and scalability. We have proposed the MCMC-based autonomous decentralized mechanism (ADM) to perform energy interchange between AESs so as to supply energy appropriately for the different energy demands among AESs. In this paper, toward resilient microgrids, we design a method to realize directional energy interchange in our ADM on the basis of convection diffusion. We investigate the effectiveness of the proposed method through simulation experiments considering energy shortage and emergency situations. We clarify that the proposed method can quickly supply energy from an external power grid to a microgrid in an energy shortage situation, and can quickly gather distributed energy to a specific AES (e.g., a safe shelter) in an emergency situation. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-4	GRID-BASED SELF-ALIGNED QUADRUPLE PATTERNING AWARE TWO DIMENSIONAL ROUTING PATTERN Speaker: Atsushi Takahashi, Tokyo Institute of Technology, JP Authors: Takeshi Ihara¹, Toshiyuki Hongo¹, Atsushi Takahashi¹ and Chikaaki Kodama² ¹Tokyo Institute of Technology, JP; ²Toshiba, JP Abstract Self-Aligned Quadruple Patterning (SAQP) is an important manufacturing technique for sub-14 nm technology nodes. Although various routing algorithms for SAQP have been proposed, it is not easy to find a dense SAQP-compliant routing pattern efficiently. Even though a grid for SAQP-compliant routing patterns was proposed, it is not easy to find a valid routing pattern on the grid. The routing pattern of SAQP on the grid consists of three types of routing. Among them, the third type has a turn prohibition constraint on the grid. Typical routing algorithms often fail to find a valid routing for the third type. In this paper, SAQP-compliant two-dimensional routing patterns are found effectively on the grid by finding an optimal valid tertiary pattern. Experiments show that SAQP-compliant routing patterns are found efficiently. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-5	PRACTICAL ILP-BASED ROUTING OF STANDARD CELLS Speaker: Rung-Bin Lin, Yuan Ze University, TW Authors: Hsueh-Ju Lu, En-Jang Jang, Ang Lu, Yu Ting Zhang, Yu-He Chang, Chi-Hung Lin and Rung-Bin Lin, Yuan Ze University, TW Abstract This paper proposes a two-stage transistor routing approach that synergizes the merits of channel routing and integer linear programming for CMOS standard cells. It can route 185 cells in 611 seconds. About 21% of cells obtained by our approach have smaller wire length than their handcrafted counterparts. Only 11% of cells use more vias than their handcrafted counterparts. Our router completes routing of many cells that cannot be routed by an industrial one. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-6	A PROCEDURE FOR IMPROVING THE DISTRIBUTION OF CONGESTION IN GLOBAL ROUTING Speaker: Azadeh Davoodi, University of Wisconsin - Madison, US Authors: Daohang Shi, Azadeh Davoodi and Jeffrey Linderoth, University of Wisconsin - Madison, US Abstract This work introduces a procedure which takes as input a global routing solution that is already improved for routability based on the traditional total overflow (TOF) metric, and then improves the distribution of congestion without increasing the TOF. Our router is able to significantly decrease the number of edges in undesirable ranges of congestion by optimizing a convex piecewise-linear penalty function. The penalties are flexible and may be specified by the user. In our experiments, using the already-optimized global routing solutions of the ISPD'11 benchmarks (most of which have 0 units of TOF), we show that the number of edges utilized very close to capacity can be significantly reduced. This work is the first to explicitly target improving the distribution of edge congestion corresponding to an already-optimized global routing solution without sacrificing the TOF. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-7	(Best Paper Award Candidate) MACHINE LEARNED MACHINES: ADAPTIVE CO-OPTIMIZATION OF CACHES, CORES, AND ON-CHIP NETWORK Speaker: Rahul Jain, Indian Institute of Technology Delhi, IN Authors: Rahul Jain¹, Preeti Ranjan Panda¹ and Sreenivas Subramoney² ¹Indian Institute of Technology Delhi, IN; ²Intel, IN Abstract Modern multicore architectures require runtime optimization techniques to address mismatches between the dynamic resource requirements of different processes and the runtime allocation. Choosing between multiple optimizations at runtime is complex because their effects are non-additive, which makes the adaptiveness of machine learning techniques useful. We present a novel method, Machine Learned Machines (MLM), which uses online reinforcement learning (RL) to perform dynamic partitioning of the last level cache (LLC), along with dynamic voltage and frequency scaling (DVFS) of the core and uncore (interconnection network and LLC). We show that the co-optimization results in a much lower energy-delay product (EDP) than any of the techniques applied individually. The results show an average improvement of 19.6% in EDP and 2.6% in execution time over the baseline. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-8	IMPROVING PERFORMANCE BY MONITORING WHILE MAINTAINING WORST-CASE GUARANTEES Speaker: Syed Md Jakaria Abdullah, Uppsala University, SE Authors: Syed Md Jakaria Abdullah, Kai Lampka and Wang Yi, Uppsala University, SE Abstract In real-time systems, feasibility analysis is based on worst-case scenarios. At run-time, worst-case situations are often very unlikely to occur. With the system dimensioned for the worst case, one faces low resource utilization and an implicit loss in performance at run-time. We propose to use run-time monitoring to evaluate the deviation of job releases from their worst-case release bound. This allows us to compute a conservative bound on the future workload. Based on this, we design a scheme for reclaiming computation time that was originally allocated to jobs which are now known to be absent. By organizing the consumption of extra computing time in a dynamic and time-safe manner, we improve the run-time performance of applications and provably maintain the worst-case guarantees for their response times. We evaluate the usefulness of the presented approach using randomly generated traces of job releases. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-9	FAULT TOLERANT NON-VOLATILE SPINTRONIC FLIP-FLOP Speaker: Rajendra Bishnoi, Karlsruhe Institute of Technology (KIT), DE Authors: Rajendra Bishnoi, Fabian Oboril and Mehdi Tahoori, Karlsruhe Institute of Technology (KIT), DE Abstract With technology downscaling, static power has become one of the biggest challenges in a System-on-Chip. Normally-off computing using non-volatile sequential elements is a promising solution to address this challenge. Recently, many non-volatile shadow flip-flop architectures were introduced, in which Magnetic Tunnel Junction (MTJ) cells are employed as backup storage elements. Because the fabrication processes of magnetic layers are still emerging, MTJs are more susceptible to manufacturing defects than their CMOS counterparts. Moreover, unlike memory arrays, which can be repaired effectively with well-established memory repair and coding schemes, flip-flops scattered in the layout are more difficult to repair. Without effective defect and fault tolerance for non-volatile flip-flops, the manufacturing yield will therefore be severely affected. We propose a Fault Tolerant Non-Volatile Latch (FTNV-L) design, in which we arrange several MTJ cells in such a way that the design is resilient to various MTJ faults. Simulation results show that our proposed FTNV-L can effectively tolerate all single MTJ faults with considerably lower overhead than traditional approaches. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-10	TOWARDS AUTOMATIC DIAGNOSIS OF MINORITY CARRIERS PROPAGATION PROBLEMS IN HV/HT AUTOMOTIVE SMART POWER ICS Speaker: Yasser Moursy, Sorbonne Universités, UPMC, FR Authors: Yasser Moursy¹, Hao Zou¹, Ramy Iskander¹, Pierre Tisserand², Dieu-My Ton², Giuseppe Pasetti³, Ehrenfried Seebacher⁴, Alexander Steinmair⁴, Thomas Gneiting⁵ and Heidrun Alius⁵ ¹Sorbonne Universités, UPMC, FR; ²Valeo, Creteil, FR; ³AMS, Navacchio, IT; ⁴AMS AG, Unterpremstaetten, AT; ⁵AdMOS, Frickenhausen, DE Abstract This paper presents a methodology to identify substrate coupling effects in smart power integrated circuits. The methodology is based on a tool called AUTOMICS, which extracts the substrate parasitic network. This network comprises diodes and resistors that are able to maintain the continuity of the minority carrier concentration. The contribution of minority carriers to substrate noise is significant in high-voltage and high-temperature applications. The proposed methodology and conventional latch-up problem identification are both presented for a test-case automotive chip, AUTOCHIP1. The proposed methodology takes significantly less time than the conventional one. It could significantly shorten the time-to-market and improve the robustness of the design. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-12	TOWARDS HIGHLY RELIABLE SRAM-BASED PUFS Speaker: Elena Ioana Vatajelu, Politecnico di Torino, IT Authors: Elena Ioana Vatajelu¹, Giorgio Di Natale² and Paolo Prinetto³ ¹POLITO, IT; ²LIRMM, FR; ³Politecnico di Torino, IT Abstract Physically Unclonable Functions (PUFs) are emerging cryptographic primitives used to implement low-cost device authentication and secure secret key generation. Several solutions exist for classical CMOS devices; the most investigated solutions today for weak PUF implementation are based on SRAMs, which offer the advantage of reusing the memories that already exist in many designs. The efficiency of PUF implementations is strongly dependent on the unclonability and reliability of their responses. It has been shown that SRAM PUFs can guarantee high levels of both unclonability and reliability. However, high reliability is today achieved by using fuzzy extractor structures combined with complex error correcting codes (ECCs), which increase the complexity and cost of the design. The overheads associated with these techniques increase with their error correction capability. In this paper, we define an effective method to identify the unreliable cells in the PUF implementation based on an SRAM stability test. This information is used to significantly reduce the need for complex ECCs, resulting in efficient, low-cost PUF implementations. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-13	CURRENT BASED PUF EXPLOITING RANDOM VARIATIONS IN SRAM CELLS Speaker: Fengchao Zhang, University of Florida, US Authors: Fengchao Zhang¹, Shuo Yang¹, Jim Plusquellic² and Swarup Bhunia¹ ¹University of Florida, US; ²University of New Mexico, US Abstract A Physical Unclonable Function (PUF) is a security primitive that has been proven to be effective in diverse security solutions ranging from hardware authentication to on-die entropy generation. PUFs can be implemented in a design in two possible ways: (1) adding a separate dedicated circuit; and (2) reusing an existing on-chip structure for generating random signatures. A large percentage of existing PUFs fall into the first category, which suffers from the important drawback of often unacceptable hardware and design overhead. Moreover, they cannot be applied to legacy designs, which do not allow insertion of additional circuit structures. Intrinsic PUFs, which rely on pre-existing circuit structures such as static random-access memory (SRAM), fall into the second category. They, however, typically suffer from poor entropy as well as a lack of robustness. In this paper, we introduce a novel PUF implementation of the second category that exploits the effect of manufacturing process variations on SRAM read access current. In particular, we note that transistor-level variations in SRAM cells cause significant variations in the read current, and that the variation changes with the content stored in an SRAM cell. We propose a method to transform the analog read current value for an SRAM array into robust binary signatures. The proposed PUF can be easily employed for authentication of commercial SRAM chips without any design modification. Furthermore, it can be realized, with minor hardware modification, in chips with embedded memory, e.g., a processor, for on-die entropy generation. Simulation results for a 45 nm CMOS process for 1000 chips, as well as measurement results based on 30 commercial SRAM chips, show promising randomness, uniqueness and robustness under environmental fluctuations. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-14	BEHAVIORAL MODELING OF TIMING SLACK VARIATION IN DIGITAL CIRCUITS DUE TO POWER SUPPLY NOISE Speaker: Taesik Na, Georgia Institute of Technology, US Authors: Taesik Na and Saibal Mukhopadhyay, Georgia Institute of Technology, US Abstract Timing error due to power supply noise (PSN) is a key challenge in the design of digital systems. This paper presents an accurate time-domain behavioral model of timing slack variation due to PSN that accounts for clock-data compensation (CDC). The accuracy of the model is verified against SPICE for complex designs including an AES engine and a LEON3 processor. As a case study, the model is used for time-domain co-simulation of a power distribution network (PDN) and a LEON3 processor with circuit-based noise tolerance techniques. The analysis shows that the model helps reduce pessimism in the estimated timing slack by considering the effects of PSN and CDC. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-15	LOSSLESS COMPRESSION ALGORITHM BASED ON DICTIONARY CODING FOR MULTIPLE E-BEAM DIRECT WRITE SYSTEM Speaker: Pei-Chun Lin, National Taiwan University, TW Authors: Pei-Chun Lin, Yu-Hsuan Pai, Yu-Hsiang Chiu, Shao-Yuan Fang and Charlie Chung-Ping Chen, National Taiwan University, TW Abstract Electron-beam direct-write (EBDW) lithography is an attractive candidate for next-generation lithography in advanced semiconductor processes. The huge data stream bandwidth required for the data delivery path in EBDW systems could seriously degrade throughput, which is one of the major deficiencies keeping EBDW lithography from mass production. This paper proposes a lossless electron-beam layout data compression and decompression algorithm for 5-bit gray-level bitmaps. Compared with the state-of-the-art LineDiff Entropy algorithm, the proposed method improves the compression rate by 18% on average and achieves a speedup of more than 7.5 times for decompression. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-16	PHONOCMAP: AN APPLICATION MAPPING TOOL FOR PHOTONIC NETWORKS-ON-CHIP Speaker: Edoardo Fusella, University of Naples Federico II, IT Authors: Edoardo Fusella and Alessandro Cilardo, University of Naples Federico II, IT Abstract While providing a promising solution for high-performance on-chip communication, photonic networks-on-chip suffer from insertion loss and crosstalk noise, which may severely constrain their scalability. In this paper, we introduce a methodology and a related tool, PhoNoCMap, for the design space exploration of optical NoC mapping solutions. The tool automatically assigns application tasks to the nodes of a generic photonic NoC architecture such that either the worst-case insertion loss or the worst-case crosstalk noise is minimized. The experimental results show significant benefits in terms of insertion loss and crosstalk noise, allowing improved network scalability. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-17	DESIGN OF AN EFFICIENT READY QUEUE FOR EARLIEST-DEADLINE-FIRST (EDF) SCHEDULER Speaker and Author: Risat Mahmud Pathan, Chalmers University of Technology, SE Abstract Although the dynamic-priority-based EDF algorithm is known to be theoretically optimal for scheduling sporadic real-time tasks on a uniprocessor, fixed-priority (FP) scheduling is mostly used in practice. One of the main reasons FP scheduling is popular in industry is its efficient implementation: operations on the ready queue can be done in constant time. On the other hand, the ready queue of an EDF scheduler is generally implemented as a priority queue, for example, using a binary min-heap data structure in which insertion and deletion cannot be done in constant time. This paper proposes a new design of the ready queue for an EDF scheduler: a simple data structure for the ready queue, together with efficient operations to insert task control blocks (TCBs) into the ready queue and remove them from it. Insertion of the TCB of a newly released job (that cannot preempt the currently executing job) is done in non-constant time. However, insertion of the TCB of a preempted job, or removal of the TCB of the job having the highest EDF priority from the ready queue, can be done in constant time. Simulation using randomly generated task sets shows that the overhead of managing jobs in our proposed ready queue for an EDF scheduler is significantly lower than that of other approaches. We believe that the theoretically optimal EDF algorithm, implemented on the basis of our proposed ready-queue data structure, will make EDF popular in industry. Download Paper (PDF; Only available from the DATE venue WiFi)
IP1-18	RT LEVEL TIMING MODELING FOR AGING PREDICTION Speaker: Nils Koppaetzky, OFFIS Institute for Information Technology, DE Authors: Nils Koppaetzky¹, Malte Metzdorf¹, Reef Eilers¹, Domenik Helms¹ and Wolfgang Nebel² ¹OFFIS Institute for Information Technology, DE; ²University of Oldenburg and OFFIS, DE Abstract The simulation of aging-related degradation mechanisms is a challenging task for timing and reliability estimations during all design phases of digital systems. Some good approaches towards accurate, efficient and applicable timing models at the register transfer level (RTL) have already been made. However, recent state-of-the-art models often have to access lower levels of abstraction, such as the underlying gate-level netlist, for each timing estimation, and require every analysis step to be repeated if parameters, input signals or designs are changed. This work introduces a new RTL timing model concept that provides a separation of design analysis and aging estimation. It allows more efficient design evaluations with respect to aging. Although this is work in progress and systematic evaluations are still ongoing, early results indicate the applicability of the approach and its capability to compete with recent models in both accuracy and efficiency. Download Paper (PDF; Only available from the DATE venue WiFi)
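
Illustrative note on IP1-17: the abstract contrasts its proposed ready queue with the conventional implementation, a priority queue built on a binary min-heap, where insertion and deletion take logarithmic rather than constant time. The sketch below shows only that conventional baseline, using Python's standard heapq module. It is not the data structure proposed in the paper, and the class name HeapReadyQueue, its methods, and the TCB labels are hypothetical names chosen for this example.

```python
import heapq
import itertools


class HeapReadyQueue:
    """Conventional EDF ready queue backed by a binary min-heap.

    Hypothetical illustration of the baseline mentioned in the abstract,
    not the ready-queue design proposed in the paper.
    """

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties between equal deadlines (FIFO order)
        # and keeps heapq from comparing TCB objects directly.
        self._seq = itertools.count()

    def insert(self, deadline, tcb):
        # O(log n): the new entry is sifted up the heap.
        heapq.heappush(self._heap, (deadline, next(self._seq), tcb))

    def pop_earliest(self):
        # O(log n): the root is removed and the heap property restored.
        deadline, _, tcb = heapq.heappop(self._heap)
        return deadline, tcb

    def __len__(self):
        return len(self._heap)


# Three jobs released with absolute deadlines 30, 10 and 20.
queue = HeapReadyQueue()
queue.insert(30, "tcb_a")
queue.insert(10, "tcb_b")
queue.insert(20, "tcb_c")

# EDF dispatches the job with the earliest absolute deadline first.
order = [queue.pop_earliest()[1] for _ in range(len(queue))]
print(order)
```

Both operations here cost O(log n) in the number of ready jobs, which is the overhead the abstract says keeps EDF less attractive in practice than fixed-priority scheduling.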

Label

Presentation Title
Authors

IP1-1

A SCALABLE LANE DETECTION ALGORITHM ON COTSS WITH OPENCL
Speaker:
Kai Huang, Sun Yat-Sen University, CN
Authors:
Kai Huang¹, Biao Hu², Jan Botsch³, Nikhil Madduri³ and Alois Knoll³
¹Sun Yat-Sen University, CN; ²Technische Universität München (TUM), DE; ³Technische Universität München (TUM), DE
Abstract
Road lane detection are classical requirements for advanced driving assistant systems. With new computer technologies, lane detection algorithms can be exploited on Cots platforms. This paper investigates the use of OpenCL and develop a particle- filter based lane detection algorithm that can tune the trade-off between detection accuracy and speed. Our algorithm is tested on 14 video streams from different data-sets with different scenarios on different Cots hardware. With an average deviation fewer than 5 pixels, the average frame rates for the 14 videos can reach about 400 fps on both Gpu and Fpga. The peak frame rates for certain videos on GPU can reach almost 1000 fps.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-2

SIMULATION OF FALLING RAIN FOR ROBUSTNESS TESTING OF VIDEO-BASED SURROUND SENSING SYSTEMS
Speaker:
Dennis Hospach, Universität Tübingen, DE
Authors:
Dennis Hospach¹, Stefan Mueller¹, Wolfgang Rosenstiel¹ and Oliver Bringmann²
¹Universität Tübingen, DE; ²Universität Tübingen / FZI, DE
Abstract
Recently, optical sensors have become a standard item in modern cars, raising questions with respect to the necessary testing under various ambient effects. In order to achieve a high test coverage of vision-based surround sensing systems, a lot of different environmental conditions need to be tested. Unfortunately, it is by far too time-consuming to build test sets of all relevant environmental conditions by recording real video data. This paper presents a novel approach for ambient-aware virtual prototyping and robustness testing. We propose a method to significantly reduce the needed on-road captures being used for design and validation of vision-based Advanced Driver Assistance Systems (ADAS) and fully automated driving. Our approach facilitates the generation of comparable test sets by using largely reduced amounts of real on-road captures and applying computer-generated variations of falling rain to it in a comprehensive virtual prototyping environment. In combination with the simulation of camera properties, which influence the visual effects of falling rain to a great extent, we are able to generate different rain scenarios under a wide variety of parameters. Our approach has been applied to an automotive lane detection system using a series of multiple rain scenarios. We have explored, how falling rain can influence such a system and how such behavior can be detected using simulated rain scenarios.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-3

PROPOSAL FOR FAST DIRECTIONAL ENERGY INTERCHANGE USED IN MCMC-BASED AUTONOMOUS DECENTRALIZED MECHANISM TOWARD RESILIENT MICROGRID
Speaker:
Yusuke Sakumoto, Tokyo Metropolitan University, JP
Authors:
Yusuke Sakumoto¹ and Ittetsu Taniguchi²
¹Tokyo Metropolitan University, JP; ²Ritsumeikan University, JP
Abstract
Microgrid is well known as key technology to improve renewable energy's ease of use. Some previous works focused on a microgrid that is divided into autonomous electricity subsystems~(AESs) for its reliability and scalability. We have proposed the MCMC-based autonomous decentralized mechanism (ADM) to perform energy interchange between AESs so as to be supply energy appropriately for different energy demands among AESs. In this paper, toward resilient of microgrids, we design a method to realize directional energy interchange in our ADM on the basis of the convection diffusion. We investigate the effectiveness of the proposed method through simulation experiment considering energy shortage and emergency situations. We clarify that the proposed method can fast supply energy from external power grid to a microgrid under energy shortage situation, and can fast gather distributed energy to a specific AES~(e.g., safe shelter) under emergency situation.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-4

GRID-BASED SELF-ALIGNED QUADRUPLE PATTERNING AWARE TWO DIMENSIONAL ROUTING PATTERN
Speaker:
Atsushi Takahashi, Tokyo Institute of Technology, JP
Authors:
Takeshi Ihara¹, Toshiyuki Hongo¹, Atsushi Takahashi¹ and Chikaaki Kodama²
¹Tokyo Institute of Technology, JP; ²Toshiba, JP
Abstract
Self-Aligned Quadruple Patterning (SAQP) is an important manufacturing technique for sub 14 nm technology node. Although various routing algorithms for SAQP have been proposed, it is not easy to find a dense SAQP compliant routing pattern efficiently. Even though a grid for SAQP compliant routing pattern was proposed, it is not easy to find a valid routing pattern on the grid. The routing pattern of SAQP on the grid consists of three types of routing. Among them, third type has turn prohibition constraint on the grid. Typical routing algorithms often fail to find a valid routing for third type. In this paper, SAQP compliant two dimensional routing patterns are found effectively on the grid by finding an optimal valid tertiary pattern. Experiments show that SAQP compliant routing patterns are found efficiently.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-5

PRACTICAL ILP-BASED ROUTING OF STANDARD CELLS
Speaker:
Rung-Bin Lin, Yuan Ze University, TW
Authors:
Hsueh-Ju Lu, En-Jang Jang, Ang Lu, Yu Ting Zhang, Yu-He Chang, Chi-Hung Lin and Rung-Bin Lin, Yuan Ze University, TW
Abstract
This paper proposes a two-stage transistor routing approach that synergizes the merits of channel routing and integer linear programming for CMOS standard cells. It can route 185 cells in 611 seconds. About 21% of cells obtained by our approach have smaller wire length than their handcrafted counterparts. Only 11% of cells use more vias than their handcrafted counterparts. Our router completes routing of many cells that cannot be routed by an industrial one.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-6

A PROCEDURE FOR IMPROVING THE DISTRIBUTION OF CONGESTION IN GLOBAL ROUTING
Speaker:
Azadeh Davoodi, University of Wisconsin - Madison, US
Authors:
Daohang Shi, Azadeh Davoodi and Jeffrey Linderoth, University of Wisconsin - Madison, US
Abstract
This work introduces a procedure which takes as input a global routing solution that is already improved for routability based on the traditional total overﬂow (TOF) metric, and then improves the distribution of congestion without increasing the TOF. Our router is able to signiﬁcantly decrease the number of edges in undesirable ranges of congestion by optimizing a convex piece-wise linear penalty function. The penalties are ﬂexible and may be speciﬁed by the user. In our experiments, using the already-optimized global routing solutions of the ISPD'11 benchmarks—mostly have 0 units of TOF—we show the number of edges which are utilized very close to capacity can be signiﬁcantly reduced. This work is the ﬁrst to explicitly target improving the distribution of edge congestion corresponding to an already-optimized global routing solution without sacriﬁcing the TOF.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-7

(Best Paper Award Candidate)
MACHINE LEARNED MACHINES: ADAPTIVE CO-OPTIMIZATION OF CACHES, CORES, AND ON-CHIP NETWORK
Speaker:
Rahul Jain, Indian Institute of Technology Delhi, IN
Authors:
Rahul Jain¹, Preeti Ranjan Panda¹ and Sreenivas Subramoney²
¹Indian Institute of Technology Delhi, IN; ²Intel, IN
Abstract
Abstract—Modern multicore architectures require runtime optimization techniques to address the problem of mismatches between the dynamic resource requirements of different processes and the runtime allocation. Choosing between multiple optimizations at runtime is complex due to the non-additive effects, making the adaptiveness of the machine learning techniques useful. We present a novel method, Machine Learned Machines (MLM), by using Online Reinforcement Learning (RL) to perform dynamic partitioning of the last level cache (LLC), along with dynamic voltage and frequency scaling (DVFS) of the core and uncore (interconnection network and LLC). We show that the co-optimization results in much lower energy-delay product (EDP) than any of the techniques applied individually. The results show an average of 19.6% EDP and 2.6% execution time improvement over the baseline.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-8

IMPROVING PERFORMANCE BY MONITORING WHILE MAINTAINING WORST-CASE GUARANTEES
Speaker:
Syed Md Jakaria Abdullah, Uppsala University, SE
Authors:
Syed Md Jakaria Abdullah, Kai Lampka and Wang Yi, Uppsala University, SE
Abstract
With real-time systems, feasibility analysis is based on worst-case scenarios. At run-time, worst-case situations are often very unlikely to occur. With the system being dimensioned for the worst-case, one faces low resource utilization and implicit loss in performance at run-time. We propose to use run-time monitoring for evaluating the deviation of job releases from their worst-case release bound. This allows us to compute a conservative bound on the future workload. Based on this, we design a scheme for reclaiming computation time, which has been originally allocated for the jobs which are now known to be absent. By organizing the consumption of extra computing time in a dynamic and time-safe manner, we improve the run-time performance of applications and provably maintain the worst-case guarantees for their response times. We evaluate the usefulness of the presented approach by using randomly generated traces of job releases.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-9

FAULT TOLERANT NON-VOLATILE SPINTRONIC FLIP-FLOP
Speaker:
Rajendra Bishnoi, Karlsruhe Institute of Technology (KIT), DE
Authors:
Rajendra Bishnoi, Fabian Oboril and Mehdi Tahoori, Karlsruhe Institute of Technology (KIT), DE
Abstract
With technology downscaling, static power has become one of the biggest challenges in a System-on-Chip. Normally-off computing using non-volatile sequential elements is a promising solution to address this challenge. Recently, many non-volatile shadow flip-flop architectures have been introduced, in which Magnetic Tunnel Junction (MTJ) cells are employed as backup storage elements. Because the fabrication processes for magnetic layers are still emerging, MTJs are more susceptible to manufacturing defects than their CMOS counterparts. Moreover, unlike memory arrays, which can be repaired effectively with well-established memory repair and coding schemes, flip-flops scattered across the layout are more difficult to repair. Without effective defect and fault tolerance for non-volatile flip-flops, the manufacturing yield will be severely affected. We therefore propose a Fault Tolerant Non-Volatile Latch (FTNV-L) design, in which several MTJ cells are arranged so that the latch is resilient to various MTJ faults. Simulation results show that the proposed FTNV-L can effectively tolerate all single MTJ faults with considerably lower overhead than traditional approaches.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-10

TOWARDS AUTOMATIC DIAGNOSIS OF MINORITY CARRIERS PROPAGATION PROBLEMS IN HV/HT AUTOMOTIVE SMART POWER ICS
Speaker:
Yasser Moursy, Sorbonne Universités, UPMC, FR
Authors:
Yasser Moursy¹, Hao Zou¹, Ramy Iskander¹, Pierre Tisserand², Dieu-My Ton², Giuseppe Pasetti³, Ehrenfried Seebacher⁴, Alexander Steinmair⁴, Thomas Gneiting⁵ and Heidrun Alius⁵
¹Sorbonne Universités, UPMC, FR; ²Valeo, Creteil, FR; ³AMS, Navacchio, IT; ⁴AMS AG, Unterpremstaetten, AT; ⁵AdMOS, Frickenhausen, DE
Abstract
This paper presents a methodology for identifying substrate coupling effects in smart power integrated circuits. The methodology relies on a tool called AUTOMICS to extract the substrate parasitic network. This network comprises diodes and resistors that maintain the continuity of the minority carrier concentration. The contribution of minority carriers to substrate noise is significant in high-voltage and high-temperature applications. Both the proposed methodology and conventional latch-up problem identification are presented for a test-case automotive chip, AUTOCHIP1. The proposed methodology takes significantly less time than the conventional one, and could significantly shorten the time-to-market and improve the robustness of the design.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-12

TOWARDS HIGHLY RELIABLE SRAM-BASED PUFS
Speaker:
Elena Ioana Vatajelu, Politecnico di Torino, IT
Authors:
Elena Ioana Vatajelu¹, Giorgio Di Natale² and Paolo Prinetto³
¹POLITO, IT; ²LIRMM, FR; ³Politecnico di Torino, IT
Abstract
Physically Unclonable Functions (PUFs) are emerging cryptographic primitives used to implement low-cost device authentication and secure secret key generation. Several solutions exist for classical CMOS devices. The most investigated solutions today for weak PUF implementation are based on SRAMs, which offer the advantage of reusing the memories that already exist in many designs. The efficiency of PUF implementations is strongly dependent on the unclonability and reliability of their responses. It has been shown that SRAM PUFs can guarantee high levels of both unclonability and reliability. However, high reliability is today achieved by using fuzzy extractor structures combined with complex error correcting codes (ECCs), which increase the complexity and cost of the design. The overheads associated with these techniques increase with their error correction capability. In this paper we define an effective method, based on an SRAM stability test, to identify the unreliable cells in the PUF implementation. This information is used to significantly reduce the need for complex ECCs, resulting in efficient, low-cost PUF implementations.
Download Paper (PDF; Only available from the DATE venue WiFi)
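
The idea of excluding unreliable cells from a PUF response can be illustrated with a small Python sketch. Note that this is not the stability test described in the abstract: as a simpler stand-in, it assumes several power-up readouts of the same SRAM are available and flags any cell whose value differs between them. The function names and the example readouts are hypothetical.

```python
def unreliable_cells(readouts):
    """Return the indices of cells whose value differs across readouts."""
    reference = readouts[0]
    return {
        i
        for readout in readouts[1:]
        for i, bit in enumerate(readout)
        if bit != reference[i]
    }


def masked_response(readout, mask):
    """Drop the masked cells from a readout, keeping the remaining bits in order."""
    return [bit for i, bit in enumerate(readout) if i not in mask]


# Hypothetical power-up readouts of a 6-cell SRAM (cells 2 and 5 are unstable).
readouts = [
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1],
]

mask = unreliable_cells(readouts)
responses = [masked_response(r, mask) for r in readouts]
print(sorted(mask), responses)
```

Once the flagged cells are excluded, the remaining bits agree across all readouts in this example, which is why less error correction would be needed afterwards.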

IP1-13

CURRENT BASED PUF EXPLOITING RANDOM VARIATIONS IN SRAM CELLS
Speaker:
Fengchao Zhang, University of Florida, US
Authors:
Fengchao Zhang¹, Shuo Yang¹, Jim Plusquellic² and Swarup Bhunia¹
¹University of Florida, US; ²University of New Mexico, US
Abstract
A Physical Unclonable Function (PUF) is a security primitive that has been proven to be effective in diverse security solutions ranging from hardware authentication to on-die entropy generation. PUFs can be implemented in a design in two possible ways: (1) adding a separate dedicated circuit; and (2) reusing an existing on-chip structure for generating random signatures. A large percentage of existing PUFs fall into the first category, which suffers from the important drawback of often unacceptable hardware and design overhead. Moreover, they cannot be applied to legacy designs, which do not allow insertion of additional circuit structures. Intrinsic PUFs, which rely on pre-existing circuit structures such as static random-access memory (SRAM), fall into the second category. They, however, typically suffer from poor entropy as well as a lack of robustness. In this paper, we introduce a novel PUF implementation of the second category that exploits the effect of manufacturing process variations on SRAM read access current. In particular, we note that transistor-level variations in SRAM cells cause significant variations in the read current, and that the variation changes with the content stored in an SRAM cell. We propose a method to transform the analog read current value for an SRAM array into robust binary signatures. The proposed PUF can be easily employed for authentication of commercial SRAM chips without any design modification. Furthermore, it can be realized, with minor hardware modification, in chips with embedded memory, e.g., a processor, for on-die entropy generation. Simulation results for 1000 chips in a 45 nm CMOS process, as well as measurement results based on 30 commercial SRAM chips, show promising randomness, uniqueness and robustness under environmental fluctuations.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-14

BEHAVIORAL MODELING OF TIMING SLACK VARIATION IN DIGITAL CIRCUITS DUE TO POWER SUPPLY NOISE
Speaker:
Taesik Na, Georgia Institute of Technology, US
Authors:
Taesik Na and Saibal Mukhopadhyay, Georgia Institute of Technology, US
Abstract
Timing error due to power supply noise (PSN) is a key challenge in the design of digital systems. This paper presents an accurate time-domain behavioral model of timing slack variation due to PSN that accounts for clock-data compensation (CDC). The accuracy of the model is verified against SPICE for complex designs, including an AES engine and a LEON3 processor. As a case study, the model is used for time-domain co-simulation of a power distribution network (PDN) and a LEON3 processor with circuit-based noise tolerance techniques. The analysis shows that the model helps reduce pessimism in the estimated timing slack by considering the effects of PSN and CDC.
Download Paper (PDF; Only available from the DATE venue WiFi)

IP1-15

LOSSLESS COMPRESSION ALGORITHM BASED ON DICTIONARY CODING FOR MULTIPLE E-BEAM DIRECT WRITE SYSTEM
Speaker:
Pei-Chun Lin, National Taiwan University, TW
Authors:
Pei-Chun Lin, Yu-Hsuan Pai, Yu-Hsiang Chiu, Shao-Yuan Fang and Charlie Chung-Ping Chen, National Taiwan University, TW
Abstract
Electron-beam direct-write (EBDW) lithography is an attractive candidate for next-generation lithography in advanced semiconductor processes. The huge data stream bandwidth required for the data delivery path in EBDW systems could seriously degrade throughput, which is one of the major deficiencies keeping EBDW lithography from mass production. This paper proposes a lossless compression and decompression algorithm for electron-beam layout data in the form of 5-bit gray-level bitmaps. Compared with the state-of-the-art LineDiff Entropy algorithm, the proposed method improves the compression rate by 18% on average and achieves a speedup of more than 7.5 times for decompression.
Download Paper (PDF; Only available from the DATE venue WiFi)
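
To make the notion of dictionary coding concrete, here is a minimal Python sketch of a generic LZW-style coder applied to a row of 5-bit gray levels. It is a textbook technique used as a stand-in, not the algorithm proposed in the paper; the function names and the sample row are assumptions, and packing the resulting indices into a bit stream is left out.

```python
def lzw_compress(pixels, bits=5):
    """Encode gray levels (0 .. 2**bits - 1) as a list of dictionary indices."""
    table = {(v,): v for v in range(1 << bits)}  # one entry per gray level
    codes = []
    phrase = ()
    for p in pixels:
        candidate = phrase + (p,)
        if candidate in table:
            phrase = candidate                 # keep extending the match
        else:
            codes.append(table[phrase])        # emit the longest known phrase
            table[candidate] = len(table)      # learn the new phrase
            phrase = (p,)
    if phrase:
        codes.append(table[phrase])
    return codes


def lzw_decompress(codes, bits=5):
    """Rebuild the gray levels, reconstructing the dictionary on the fly."""
    if not codes:
        return []
    table = {v: (v,) for v in range(1 << bits)}
    prev = table[codes[0]]
    out = list(prev)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # The code refers to the phrase that is being defined right now.
            entry = prev + (prev[0],)
        out.extend(entry)
        table[len(table)] = prev + (entry[0],)
        prev = entry
    return out


# Hypothetical bitmap row with long runs, as is typical for layout data.
row = [0] * 12 + [31] * 12 + [7, 7, 7, 0, 0, 0] * 3
codes = lzw_compress(row)
restored = lzw_decompress(codes)
print(len(row), len(codes), restored == row)
```

The decoder needs no transmitted dictionary because it rebuilds the same table from the code stream, which is one reason dictionary schemes can decompress quickly.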

IP1-16

PHONOCMAP: AN APPLICATION MAPPING TOOL FOR PHOTONIC NETWORKS-ON-CHIP
Speaker:
Edoardo Fusella, University of Naples Federico II, IT
Authors:
Edoardo Fusella and Alessandro Cilardo, University of Naples Federico II, IT
Abstract
While providing a promising solution for high-performance on-chip communication, photonic networks-on-chip suffer from insertion loss and crosstalk noise, which may severely constrain their scalability. In this paper, we introduce a methodology and a related tool, PhoNoCMap, for the design space exploration of mapping solutions for optical NoCs. The tool automatically assigns application tasks to the nodes of a generic photonic NoC architecture such that either the worst-case insertion loss or the worst-case crosstalk noise is minimized. The experimental results show significant benefits in terms of insertion loss and crosstalk noise, allowing improved network scalability.
Download Paper (PDF; Only available from the DATE venue WiFi)
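
The mapping objective described in the IP1-16 abstract, minimizing the worst-case insertion loss over all communicating task pairs, can be sketched in Python with an exhaustive search. This is only an illustration of the objective and not the PhoNoCMap tool or its search strategy; the loss matrix, the task graph and the function names are all hypothetical.

```python
from itertools import permutations

# Hypothetical insertion loss (dB) of the optical path from node i to node j.
LOSS_DB = [
    [0.0, 1.0, 2.5, 4.0],
    [1.0, 0.0, 1.5, 3.0],
    [2.5, 1.5, 0.0, 1.0],
    [4.0, 3.0, 1.0, 0.0],
]

# Hypothetical application: pairs of tasks that communicate.
EDGES = [(0, 1), (1, 2), (2, 3), (0, 3)]


def worst_case_loss(mapping, edges, loss_db):
    """Largest path loss over all communicating task pairs; mapping[t] is the node of task t."""
    return max(loss_db[mapping[src]][mapping[dst]] for src, dst in edges)


def best_mapping(n_tasks, edges, loss_db):
    """Try every assignment of tasks to distinct nodes and keep the best one."""
    n_nodes = len(loss_db)
    return min(
        permutations(range(n_nodes), n_tasks),
        key=lambda m: worst_case_loss(m, edges, loss_db),
    )


identity = (0, 1, 2, 3)
best = best_mapping(4, EDGES, LOSS_DB)
print(worst_case_loss(identity, EDGES, LOSS_DB), worst_case_loss(best, EDGES, LOSS_DB))
```

Exhaustive search grows factorially with the number of nodes, so it is only practical for toy instances like this one.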

IP1-17

DESIGN OF AN EFFICIENT READY QUEUE FOR EARLIEST-DEADLINE-FIRST (EDF) SCHEDULER
Speaker and Author:
Risat Mahmud Pathan, Chalmers University of Technology, SE
Abstract
Although the dynamic-priority EDF algorithm is known to be theoretically optimal for scheduling sporadic real-time tasks on a uniprocessor, fixed-priority (FP) scheduling is mostly used in practice. One of the main reasons for the popularity of FP scheduling in industry is its efficient implementation: operations on the ready queue can be done in constant time. In contrast, the ready queue of an EDF scheduler is generally implemented as a priority queue, for example using a binary min-heap, in which insertion and deletion cannot be done in constant time. This paper proposes a new ready-queue design for the EDF scheduler: a simple data structure together with efficient operations to insert task control blocks (TCBs) into, and remove them from, the ready queue. Insertion of the TCB of a newly released job (that cannot preempt the currently executing job) is done in non-constant time. However, insertion of the TCB of a preempted job, or removal of the TCB of the job with the highest EDF priority, can be done in constant time. Simulation using randomly generated task sets shows that the overhead of managing jobs in the proposed ready queue is significantly lower than that of other approaches. We believe that the theoretically optimal EDF algorithm, implemented on top of the proposed ready-queue data structure, will make EDF popular in industry.
Download Paper (PDF; Only available from the DATE venue WiFi)
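
For reference, the conventional baseline mentioned in the IP1-17 abstract, an EDF ready queue kept in a binary min-heap, can be sketched in Python as follows. This shows the baseline only, not the data structure proposed in the paper; the class name, the task names and the deadlines are hypothetical.

```python
import heapq
import itertools


class HeapReadyQueue:
    """EDF ready queue backed by a binary min-heap keyed on absolute deadline."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()  # equal deadlines are served in insertion order

    def insert(self, deadline, tcb):
        # O(log n): the new entry sifts up the heap.
        heapq.heappush(self._heap, (deadline, next(self._tie), tcb))

    def pop_earliest(self):
        # O(log n): the root is removed and the heap order is restored.
        deadline, _, tcb = heapq.heappop(self._heap)
        return deadline, tcb

    def __len__(self):
        return len(self._heap)


queue = HeapReadyQueue()
for deadline, name in [(30, "tau3"), (10, "tau1"), (20, "tau2"), (10, "tau4")]:
    queue.insert(deadline, name)

order = [queue.pop_earliest() for _ in range(len(queue))]
print(order)
```

Both operations take logarithmic time in the number of ready jobs, which is the cost the abstract contrasts with the constant-time ready-queue operations of fixed-priority schedulers.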

IP1-18

RT LEVEL TIMING MODELING FOR AGING PREDICTION
Speaker:
Nils Koppaetzky, OFFIS Institute for Information Technology, DE
Authors:
Nils Koppaetzky¹, Malte Metzdorf¹, Reef Eilers¹, Domenik Helms¹ and Wolfgang Nebel²
¹OFFIS Institute for Information Technology, DE; ²University of Oldenburg and OFFIS, DE
Abstract
The simulation of aging-related degradation mechanisms is a challenging task for timing and reliability estimations during all design phases of digital systems. Some good approaches towards accurate, efficient and applicable timing models at the register transfer level (RTL) have already been made. However, recent state-of-the-art models often have to access lower levels of abstraction, such as the underlying gate-level netlist, for each timing estimation, and require every analysis step to be repeated if parameters, input signals or designs are changed. This work introduces a new RTL timing model concept that separates design analysis from aging estimation. It allows more efficient design evaluations with respect to aging. Although this is work in progress and systematic evaluations are still ongoing, early results indicate the applicability of the approach and its capability to compete with recent models in both accuracy and efficiency.
Download Paper (PDF; Only available from the DATE venue WiFi)
