ET Exhibition Theatre
In addition to the conference programme, there will be 10 Exhibition Workshops as part of the exhibition. These workshops will feature technical presentations on the state of the art in our industry, tutorials, a selection of special sessions from the conference and, as a special highlight, an Exhibition Theatre Keynote. The theatre is located next to the exhibition hall, close to the booths and the rooms of the technical conference.
The Exhibition Theatre sessions are open to conference delegates as well as to exhibition visitors.
Extensions and further details of this programme will be published on the DATE web portal and updated regularly. Before attending DATE, please visit the web portal for the latest version.
ET3.8 Solutions for AI on Chip using Neuromorphic Hardware, for AI from Edge to Cloud and for Power-Efficiency
At DATE 2020 Exhibition Theatre leading experts provide attendees with their advice on the latest technologies in the field, covering applications as well as solutions for the design process. In this session Intel and Andes Technology will cover the implementation of AI highlighting neuromorphic hardware, RISC-V and AI from edge to cloud. Dolphin Design will show how to speed up the design of the required power-efficient SoC.
ET3.8.1 AI on Chip: Perception, Learning, and Control in Neuromorphic Hardware
Today, artificial intelligence systems are dominated by deep neural networks (DNNs) that learn to solve tasks from data. DNNs have replaced computer vision architectures built on hand-crafted features and have revolutionised data and signal processing. To train and run DNN architectures efficiently, specialised hardware accelerators are being developed. One type of accelerator is neuromorphic hardware, originally developed to emulate the behaviour of biological neurons using electrical circuits. Modern neuromorphic devices such as Intel’s Loihi research chip directly execute spiking neural networks and often include plasticity, the ability of network connections to change on the fly based on local activity in the network. This hardware promises a new computing framework that goes beyond deep learning. The new framework features ultrafast event-based inference and one-shot learning, key capabilities for deploying DNNs in low-latency applications in dynamic environments. This talk will show how neuromorphic hardware can be used to solve robotic tasks.
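To make the term "spiking" concrete, the sketch below simulates a single leaky integrate-and-fire neuron in discrete time. It is a generic textbook model, not the neuron model used by Loihi, and the function name `simulate_lif` and all parameter values are assumptions chosen for illustration. Plasticity is not modelled.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron (illustrative only).

    Returns the list of time steps at which the neuron emitted a spike.
    """
    v = v_reset
    spikes = []
    for t, i_in in enumerate(input_current):
        v = leak * v + i_in      # leaky integration of the input
        if v >= threshold:       # a threshold crossing emits a spike event
            spikes.append(t)
            v = v_reset          # membrane potential resets after the spike
    return spikes


# A constant input produces a regular spike train; no input produces no events.
spike_times = simulate_lif([0.3] * 20)
silent = simulate_lif([0.0] * 20)
```

The output is a sparse list of event times rather than a dense activation value per step, which is the property that event-based inference relies on.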
ET3.8.2 AI from Edge to Cloud: Leveraging RISC-V with DSP, Vector and Custom Instructions
In this talk, Andes Technology will present RISC-V processors for applications ranging from very compact, low-power cores used in sensors, through mid-range cores running protocol stacks and performing high-speed control, to number-crunching cores that process high-volume data in parallel. These highly configurable AndesCores™, with extensibility and modularity inherited from RISC-V, allow designers to use one ISA for all of these workloads. They are also adopted by AI SoCs with applications from edge to cloud. We will provide an overview of the RISC-V DSP extension for low-data-volume, low-power workloads such as keyword spotting and face detection. For higher data throughput applications, we will introduce the industry-first commercial RISC-V Vector Processor solution and show how it can be used to speed up compute-intensive applications. Last but not least, one of RISC-V’s strengths is that it allows well-defined custom instruction extensions to provide Domain Specific Acceleration (DSA) without breaking ISA compatibility. Finally, we will cover Andes Custom Extensions™, an automation framework that brings DSA capability to every designer instead of limiting it to CPU experts.
ET3.8.3 PMU design in weeks, not months: the need for SPEED
Energy-efficiency has now replaced low-power as one of the biggest challenges that the semiconductor industry is facing. All vertical market segments are calling for more power-efficient applications, driven by a global need to reduce our environmental footprint and make the best use of energy sources. The emergence of smart cities, smart homes and smart buildings, enabled by billions of battery-operated IoT devices connected to data centers, will force the semiconductor industry to adopt disruptive approaches to improve the energy-efficiency of both edge and cloud devices.
When it comes to IC design, the traditional approaches for power reduction were mainly driven by Moore’s law and are now suffering from its slowdown, pushing SoC design teams to use more and more advanced SoC architecture and complex design techniques to overcome the fact that technology scaling is not sufficient anymore. As a consequence, the SoC complexity required to demonstrate the best energy-efficiency figures results in longer design cycles, higher development costs and additional risks.
Leveraging its SPEED Platform, which reduces PMU design time from months to weeks, Dolphin Design provides a turnkey solution to speed up and secure the design of advanced power management solutions, from SoC architecture to implementation. Energy-efficient and low-power design has been part of Dolphin’s DNA since its inception. In this presentation we will show how we work hand-in-hand with our customers to simplify the design of power-efficient SoCs, allowing them to focus on their core competencies and added value.
ET4.8 Solutions for SiP Implementation, In-System Test and NoC/SoC Test
At DATE 2020 Exhibition Theatre leading experts provide attendees with their advice on the latest technologies in the field, covering applications as well as solutions for the design process. In this session Mentor, a Siemens Business, ATOS and Zuken will cover in-system test for automotive, test of scalable NoC/SoC and a co-design environment for SiP implementation.
ET4.8.1 Implementing an Automotive In-System Test Solution
Ensuring vehicle electronics reliability levels as mandated by the ISO 26262 standard requires periodic testing during functional operation. The Tessent MissionMode architecture provides system-level access to all on-chip test resources for key-on, key-off and runtime testing. This presentation will walk through the flow for implementing a chip-level architecture incorporating the MissionMode solution integrated with both logic BIST and Memory BIST capabilities.
ET4.8.2 Scalable NoC, SoC and associated Testbench generation using Defacto STAR
As part of Mont Blanc 2020, the European scalable, modular and power-efficient HPC processor project, ATOS designs and implements a NoC which includes NoC Xpoints, protocol agents and a system cache.
Our Network on Chip (NoC) is based on basic Xpoint modules which are connected to each other to make a scalable NoC. Each Xpoint module has:
- 4 internal CHI interfaces (1 per direction) through which all the Xpoint modules are connected to each other
- 2 End Point CHI interfaces which are the entry/exit points of IPs on the System on Chip (SoC)
A CHI interface contains 4 channel interfaces: Request, Data, Snoop and Response. Each channel is fully configurable in each direction and is implemented with a configurable SystemVerilog interface. This results in a lot of parameters to handle, as we plan to implement an 8x8 NoC which includes 64 Xpoint modules, each with its parameters set accordingly.
Defacto STAR tool is used to efficiently:
- instantiate all the Xpoint modules with corresponding parameters
- connect all the channels with the corresponding SystemVerilog interface
- connect the Error, status and configuration interfaces
- connect Protocol Agent on End Point interface (internally)
- create NoC entity.
The main benefits of choosing Defacto STAR are:
- NoC configuration change and RTL generation in 15 s
- No need to develop our own tools
The NoC module will then be integrated at SoC level and connected to IPs delivered by third parties. We also use the Defacto STAR tool to generate the SoC RTL and the associated testbench.
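The scale of the connection problem described above can be sketched in a few lines of Python. This is not the Defacto STAR API and not the ATOS flow; it only enumerates the Xpoint-to-Xpoint links of an 8x8 grid, assuming a plain 2D mesh in which border ports stay unconnected (the abstract does not state the topology at the edges). The names `build_mesh`, `DIRECTIONS` and `OPPOSITE` are hypothetical.

```python
from itertools import product

CHANNELS = ("Request", "Data", "Snoop", "Response")  # per CHI interface, as in the text
DIRECTIONS = {"N": (0, -1), "S": (0, 1), "W": (-1, 0), "E": (1, 0)}
OPPOSITE = {"N": "S", "S": "N", "W": "E", "E": "W"}


def build_mesh(width=8, height=8, endpoints_per_xpoint=2):
    """Enumerate Xpoints, inter-Xpoint links and end-point ports of a 2D mesh."""
    xpoints = list(product(range(width), range(height)))
    links = []
    for x, y in xpoints:
        for d in ("S", "E"):  # visiting only S and E counts each neighbour pair once
            dx, dy = DIRECTIONS[d]
            nx, ny = x + dx, y + dy
            if nx < width and ny < height:  # assumption: no wrap-around at the border
                links.append(((x, y, d), (nx, ny, OPPOSITE[d])))
    endpoints = [(x, y, k) for x, y in xpoints for k in range(endpoints_per_xpoint)]
    return xpoints, links, endpoints


xpoints, links, endpoints = build_mesh()
# Each link carries the four CHI channel interfaces.
channel_connections = len(links) * len(CHANNELS)
```

Under these assumptions the 64 Xpoints need 112 links, 448 channel-level connections and 128 end-point ports, before counting the error, status and configuration interfaces. That volume of repetitive, parameterised wiring is the kind of work the abstract delegates to a generator.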
ET4.8.3 Quick decision of System In Package implementation for IoT/5G era
The increasing complexity of systems on chip (SoCs), combined with a new generation of designs that integrate multiple chips in a single package (SiP), is creating new challenges in the design of IC packages, printed circuit boards (PCBs) and integrated circuits (ICs). The process typically involves three independent design processes (chip, package and PCB) carried out with point tools whose interfaces require time-consuming manual steps that are error-prone and limit the potential for reuse. This challenge is being addressed by a new integrated 3D chip/package/board co-design environment that makes it possible to decide quickly on the best SiP implementation by considering the system-level impact of each design decision, especially for optimizing. The new co-design approach enables netlist management to follow design modifications, including die partitioning, and seamless verification of electrical characteristics during the design. The end result is higher performance and improved quality for smart systems, MEMS and IoT applications.
5.8 Special Session: HLS for AI HW
ET6.8 Solutions for EDA Design Environments
At DATE 2020 Exhibition Theatre leading experts provide attendees with their advice on the latest technologies in the field, covering applications as well as solutions for the design process. In this session Altair and SEMI/ESDA will cover design environments and IP enabling for different levels of abstraction and multi-physics simulations, as well as the Heterogeneous Integration Roadmap (HIR) for connecting design, manufacturing and assembly.
ET6.8.1 Future Vision of Altair for EDA Applications
Today the design of EDA applications is not only focused on hardware/software parts. In many cases, such as mechatronics, powertrain and control systems, the environment has to be considered together with the design itself at different levels of abstraction. Altair provides environments which now help users to interact with dedicated solvers and to handle multi-physics simulations.
ET6.8.2 Saving Serious Money with License First Scheduling
Software licenses are often seen as a minor detail in job scheduling. We present an alternative view in which software licenses are treated as the primary consideration in job dispatch. Using real-world examples, we will show how some innovative techniques can double license utilization.
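The abstract does not describe the techniques themselves, so the following is only a hypothetical illustration of what "license first" could mean in a dispatcher. It compares a strict FIFO policy, which stalls when the job at the head of the queue cannot get its license, with a policy that looks at license availability first and starts any queued job whose license is free. All function names, job names and license features are invented for the example.

```python
from collections import Counter


def fifo_dispatch(queue, free_licenses, free_slots):
    """Strict FIFO: stop at the first job that cannot obtain its license."""
    free = Counter(free_licenses)
    started = []
    for job, feature in queue:
        if free_slots == 0 or free[feature] == 0:
            break
        free[feature] -= 1
        free_slots -= 1
        started.append(job)
    return started


def license_first_dispatch(queue, free_licenses, free_slots):
    """Start every queued job whose license is currently free."""
    free = Counter(free_licenses)
    started = []
    for job, feature in queue:
        if free_slots == 0:
            break
        if free[feature] > 0:      # the license check decides, not queue position
            free[feature] -= 1
            free_slots -= 1
            started.append(job)
    return started


queue = [("sim1", "spice"), ("sim2", "spice"), ("syn1", "synth"), ("sim3", "spice")]
licenses = {"spice": 1, "synth": 1}

fifo_started = fifo_dispatch(queue, licenses, free_slots=4)
license_started = license_first_dispatch(queue, licenses, free_slots=4)
```

In this toy queue the FIFO policy leaves the `synth` license idle behind a blocked `spice` job, while the license-first policy puts both licenses to work.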
ET6.8.3 Connecting Design, Manufacturing and Assembly in the Moore’s Law 2.0 Era
As device scaling predicted by Moore’s Law becomes more difficult and costly, designers are looking towards new solutions to deliver increasing system-level functionality and performance along with lower power and cost. The International Technology Roadmap for Semiconductors (ITRS) served as a designer’s guide to upcoming technologies for many years until its retirement in 2016. The Heterogeneous Integration Roadmap (HIR) provides a new guideline for system-level integration for the coming decades. This includes new technologies which will have an impact on the tradeoffs facing designers. In addition, the increasing use of silicon in products and applications with long lifetimes and critical safety requirements suggests that long-term process effects can no longer be safely ignored. All of this requires increased communication amongst all aspects of system design, manufacture, and assembly as we move towards Moore 2.0.
ET7.8 SystemC-based virtual prototyping: from SoC modeling to the digital twin revolution
SystemC-based virtual prototyping has been adopted and deployed for several years in the semiconductor industry, to implement the shift-left paradigm. While interest has been long focused on SoC modeling, the trends are now to extend the modeling activities to the next level, as part of the digital twin revolution. In this session, the multiple benefits of this approach are discussed, as well as the upcoming challenges, both from an industrial and an academic perspective.
ET7.8.1 Virtual Twins: Modeling trends and challenges ahead
ET7.8.2 The TLM methodology: a swiss knife for studying HW/SW interactions and a gold mine for research topics
ET7.8.3 SystemC-based simulation of industrial manufacturing control systems
ET8.8 MathWorks Tutorial
9.8 Special Session: Panel: Variation-aware analyses of Mega-MOSFET Memories, Challenges and Solutions
Designing large memories under manufacturing variability requires statistical approaches that rely on SPICE simulations at different process, voltage and temperature (PVT) operating points to verify that yield requirements will be met. Variation-aware simulation of full memories that consist of millions of transistors is a challenging task for both SPICE simulators and statistical methodologies that aim for accurate results. The ideal solution for variation-aware verification of full memories would be to run Monte Carlo simulations through SPICE simulators to check that all addressable elements support successful write and read operations. However, this classical approach suffers from practical issues that prevent its use. Indeed, for large memory arrays (e.g. MB and more), the number of SPICE simulations needed to achieve decent statistical precision would be intractable. Moreover, the SPICE simulation of a single sample of the full-memory netlist, which involves millions or billions of MOSFETs and parasitic elements, might be very long or impossible because of the netlist size. Unfortunately, Fast-SPICE simulations are not a palatable solution for final verification because the loss of accuracy compared to pure SPICE simulations is difficult to evaluate for such netlists. So far, most variation-aware methodologies for analyzing and validating Mega-MOSFET memories rely on the assumption that the sub-blocks of the system (e.g. control unit, IOs, row decoders, column circuitry, memory cells) can be assessed independently. Memory designers therefore apply dedicated statistical approaches to each individual sub-block to reduce the overall simulation time needed to achieve variation-aware closure.
When each element of the memory is considered independent of its neighborhood, the simulation of the memory is drastically reduced to a few MOSFETs on the critical paths (the longest paths for read or write operations), with the other sub-blocks idealized and estimates derived under a Gaussian assumption. With such an approach, memory designers avoid the usual statistical simulations of the full memory, which are most of the time impractical in terms of duration and load. Although this approach has been widely used by memory designers, it reaches its limits when designing memories for low-power and advanced-node technologies, where non-idealities arise. Because the results are less reliable, memory designers compensate by increasing security margins at the expense of performance to achieve satisfactory yield. In this context, sub-blocks can no longer be considered individually and Gaussianity no longer prevails, so other practical simulation flows are required to verify full memories with satisfactory performance. New statistical approaches and simulation flows must handle memory slices or critical paths with all relevant sub-blocks in order to account for interactions between elements and be more realistic. Additionally, these approaches must handle the hierarchy of the memory to respect the variation ranges of each sub-block, from low sigma for control units and IOs to high sigma for highly replicated blocks. Using a virtual reconstruction of the full memory, the yield can be assessed without relying on the assumptions of individual sub-block analyses. With accurate estimation over the full memory, no additional security margins are required and better performance can be reached.
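The "low sigma to high sigma" remark can be illustrated with the classical calculation that the abstract says is reaching its limits: if every instance of a block fails independently and the margin is Gaussian, the sigma level each instance must meet follows directly from the replication count. The sketch below assumes exactly that independent, Gaussian, one-sided model. The function name, the 99% yield target and the 8 Mbit array size are hypothetical values chosen for the example.

```python
import math
from statistics import NormalDist


def required_sigma(n_instances, target_yield):
    """Sigma level per instance so that all n_instances pass with target_yield.

    Assumes independent instances and a one-sided Gaussian failure criterion.
    """
    # Per-instance failure probability p such that (1 - p) ** n == target_yield.
    p_fail = -math.expm1(math.log(target_yield) / n_instances)
    # By symmetry of the normal distribution, -inv_cdf(p) equals inv_cdf(1 - p).
    sigma = -NormalDist().inv_cdf(p_fail)
    return sigma, p_fail


# A single control unit versus a bit cell replicated 8 * 1024 * 1024 times.
control_sigma, control_p = required_sigma(1, 0.99)
bitcell_sigma, bitcell_p = required_sigma(8 * 1024 * 1024, 0.99)
```

Under these assumptions the control unit needs only about 2.3 sigma while the replicated bit cell needs close to 6 sigma, a failure probability of roughly one in a billion. A probability that small is why plain Monte Carlo over SPICE runs becomes intractable, and the independence assumption behind it is what the panel argues no longer holds at advanced nodes.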
- Yves Laplanche, ARM, FR
- Lorenzo Ciampolini, CEA, FR
- Pierre Faubet, SILVACO FRANCE, FR
ET10.8 Exhibition Theatre Keynote
As a special highlight, the DATE 2020 Exhibition Theatre features an Exhibition Theatre Keynote, providing everybody involved in the design of microelectronics products and applications with valuable advice and deep insight into the latest challenges addressed by the world-wide market leader STMicroelectronics.
ET10.8.1 Design-in-the-Cloud: Myth and Reality
The Cloud is promising orders of magnitude savings in time to market for integrated circuits, owing to CPU elasticity. However, practical limitations still mandate a selective approach to the product design flow and the business models have yet to be fully defined by vendors. The economic equation of designing in the cloud is challenging. In addition, design houses, IDMs and OEM customers have to decide to what extent they want to rely on Cloud service providers to maintain the confidentiality of their IP or SOC databases and honor export control requirements, in a context where such concerns are increasingly relevant in EDA vendor and IC supplier selection. This keynote will explore those topics based on ST’s own experience and trials.
11.8 Special Session: Self-aware, biologically-inspired adaptive hardware systems for ultimate dependability and longevity
State-of-the-art electronic design allows the integration of complex electronic systems comprising thousands of high-level functions on a single chip. This has become possible and feasible because of the combination of atomic-scale semiconductor technology allowing VLSI of billions of transistors, and EDA tools that can handle their useful application and integration by following a strictly hierarchical design methodology. This results in many layers of abstraction within a system that make it implementable, verifiable and, ultimately, explainable. However, while many layers of abstraction maximise the likelihood of a system functioning correctly, they can prevent a design from making full use of the capabilities of current technology. This makes systems brittle at a time when NoC- and SoC-based implementations are the only way to increase compute capabilities, as clock speed limits are reached, devices are affected by variability and ageing, and heat-dissipation limits impose "dark silicon" constraints. Design challenges of electronic systems are no longer driven by making designs smaller but by creating systems that are ultra-low power, resilient and autonomous in their adaptation to anomalies including faults, timing violations and performance degradation. This gives rise to the idea of self-aware hardware, capable of adaptive behaviours or features taking inspiration from, e.g., biological systems, learning algorithms and factory processes. The challenge is to adopt and implement these concepts while achieving a "next-generation" kind of electronic system which is considered at least as useful and trustworthy as its "classical" counterpart, plus additional essential features for future system design and operation.
The goal of this Special Session is to present research from world-leading experts addressing state-of-the-art techniques and devices demonstrating the efficacy of concepts of self-awareness, adaptivity and bio-inspiration in the context of real-world hardware systems and applications, with a focus on autonomous resource management at runtime, robustness and performance, and new computing architectures in embedded hardware systems.
|14:00||11.8.1||EMBEDDED SOCIAL INSECT-INSPIRED INTELLIGENCE NETWORKS FOR SYSTEM-LEVEL RUNTIME MANAGEMENT
Matthew R. P. Rowlings, University of York, GB
Matthew Rowlings, Andy Tyrrell and Martin Albrecht Trefzer, University of York, GB
Large-scale distributed computing architectures, such as systems on chip or many-core devices, offer advantages over monolithic or centralised single-core systems in terms of speed, power/thermal performance and fault tolerance. However, these are not implicit properties of such systems, and runtime management at software or hardware level is required to unlock these features. Biological systems naturally present such properties and are also adaptive and scalable, so it may be beneficial to consider how these properties can be achieved similarly in hardware. We present social insect behaviours as a suitable model for enabling autonomous runtime management (RTM) in many-core architectures. The emergent properties we seek to establish are self-organisation of task mapping and system-level fault tolerance. For example, large social insect colonies accomplish a wide range of tasks to build and maintain the colony. Many thousands of individuals, each possessing relatively little intelligence, contribute without any centralised control. Hence, it would seem that social insects have evolved a scalable approach to task allocation, load balancing and robustness that can be applied to large many-core computing systems. Based on this, a self-optimising and adaptive, yet fundamentally scalable, design approach for many-core systems based on the emergent behaviours of social-insect colonies is developed. Experiments capture the decision-making processes of each colony member that lead to such high-level behaviours and embed these decision engines within the routers of the many-core system.
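The abstract does not specify which decision rule its colony members use. As one well-known example of decentralised task allocation from the social-insect literature, the sketch below uses a response-threshold rule, in which a node takes up a task with probability s²/(s² + θ²) for task stimulus s and node threshold θ. It is a swapped-in illustration, not the authors' decision engine, and the task names, core names and numeric values are hypothetical.

```python
import random


def engage_probability(stimulus, threshold):
    """Response-threshold rule: P = s^2 / (s^2 + theta^2), with theta > 0."""
    return stimulus**2 / (stimulus**2 + threshold**2)


def allocate(stimuli, thresholds, rng):
    """Each node decides locally which task to take, with no central controller.

    stimuli:    {task: demand level}
    thresholds: {node: {task: threshold}}
    Returns {node: task} for the nodes that engaged in a task.
    """
    assignment = {}
    for node, theta in thresholds.items():
        for task, s in stimuli.items():
            if rng.random() < engage_probability(s, theta[task]):
                assignment[node] = task
                break  # a node takes at most one task
    return assignment


rng = random.Random(0)  # seeded only to make the example repeatable
stimuli = {"decode": 4.0, "filter": 0.5}
thresholds = {f"core{i}": {"decode": 1.0 + i, "filter": 1.0} for i in range(4)}
assignment = allocate(stimuli, thresholds, rng)
```

Each node uses only its own thresholds and the locally visible demand, so adding nodes does not add any coordination traffic. Nodes with low thresholds for a task tend to specialise in it, and others pick the task up as its demand rises.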
|14:20||11.8.2||OPTIMISING RESOURCE MANAGEMENT FOR EMBEDDED MACHINE LEARNING
Lei Xun, Long Tran-Thanh, Bashir Al-Hashimi and Geoff Merrett, University of Southampton, GB
|14:40||11.8.3||EMERGENT CONTROL OF MPSOC OPERATION BY A HIERARCHICAL SUPERVISOR / REINFORCEMENT LEARNING APPROACH
Florian Maurer, TUM, DE
Florian Maurer¹, Andreas Herkersdorf¹, Bryan Donyanavard², Amir M. Rahmani² and Nikil Dutt³
¹TUM, DE; ²University of California, Irvine, US; ³University of California, US
MPSoCs increasingly depend on adaptive resource management strategies at runtime for efficient utilization of resources when executing complex application workloads. In particular, conflicting demands for adequate computation performance and power-/energy-efficiency constraints make desired application goals hard to achieve. We present a hierarchical, cross-layer hardware/software resource manager capable of adapting to changing workloads and system dynamics with zero initial knowledge. The manager uses rule-based reinforcement learning classifier tables (LCTs) with an archive-based backup policy as leaf controllers. The LCTs directly manipulate and enforce MPSoC building block operation parameters in order to explore and optimize potentially conflicting system requirements (e.g., meeting a performance target while staying within the power constraint). A supervisor translates system requirements and application goals into per-LCT objective functions (e.g., core instructions-per-second (IPS)). Thus, the supervisor manages the possibly emergent behavior of the low-level LCT controllers in response to 1) switching between operation strategies (e.g., maximize performance vs. minimize power); and 2) changing application requirements. This hierarchical manager leverages the dual benefits of a software supervisor (enabling flexibility), together with hardware learners (allowing quick and efficient optimization). Experiments on an FPGA prototype confirmed the ability of our approach to identify optimized MPSoC operation parameters at runtime while strictly obeying given power constraints.
|15:00||11.8.4||ASTROBYTE: A MULTI-FPGA ARCHITECTURE FOR ACCELERATED SIMULATIONS OF SPIKING ASTROCYTE NEURAL NETWORKS
Shvan Karim, Ulster University, GB
Shvan Haji Karim, Jim Harkin, Liam McDaid, Bryan Gardiner and Junxiu Liu, Ulster University, GB
Spiking astrocyte neural networks (SANNs) are a new computational paradigm that exhibits enhanced self-adapting and reliability properties. The inclusion of astrocyte behaviour increases the computational load and, critically, the number of connections: each astrocyte typically communicates with up to 9 neurons (and their associated synapses), with feedback pathways from each neuron to the astrocyte. Each astrocyte cell also communicates with its neighbouring cell, resulting in a significant interconnect density. The substantial level of parallelism in SANNs lends itself to acceleration in hardware; however, the challenge in accelerating simulations of SANNs firmly resides in scalable interconnect and the ability to inject data into and retrieve data from the hardware. This paper presents a novel multi-FPGA acceleration architecture, AstroByte, for the speedup of SANNs. AstroByte explores Networks-on-Chip (NoC) routing mechanisms to address the challenge of communicating both spike-event (neuron) data and numeric (astrocyte) data across significant interconnect pathways between astrocytes and neurons. AstroByte also exploits the NoC interconnect to inject data into and retrieve runtime data from the accelerated SANN simulations. Results show that AstroByte can simulate SANN applications with speedup factors of between 162x and 188x over equivalent MATLAB simulations.
|15:30||End of session|