11.7 Power and Emerging Technologies in Reconfigurable Computing


Date: Thursday 27 March 2014
Time: 14:00 - 15:30
Location / Room: Konferenz 5

Chair:
Diana Goehringer, Ruhr-University Bochum (RUB), DE

Co-Chair:
Fabrizio Ferrandi, Politecnico di Milano, IT

The first two papers in this session propose new architectures that take advantage of emerging nonvolatile memory technologies. The third paper proposes a battery cell aware task partitioning and mapping to maximize battery runtime.

Time | Label | Presentation Title
Authors
14:00 | 11.7.1 | EXPLOITING STT-NV TECHNOLOGY FOR RECONFIGURABLE, HIGH PERFORMANCE, LOW POWER, AND LOW TEMPERATURE FUNCTIONAL UNIT DESIGN
Speakers:
Adarsh Reddy1, Hamid Mahmoodi2 and Houman Homayoun1
1George Mason University, US; 2San Francisco State University, US
Abstract
Unavailability of functional units and their unequal activity create performance bottlenecks and thermal hot spots in general-purpose processors. We propose to use reconfigurable functional units to overcome these challenges. A selected set of complex functional units that might be under-utilized, such as a multiplier and divider, are realized in a time-multiplexed fashion using a shared programmable Look Up Table (LUT) based fabric. This allows for run-time reconfiguration and migration of their activity. The LUT-based implementation also allows under-utilized functional units to be dynamically reconfigured into the functional units that form a performance bottleneck, thereby improving performance. The programmable LUTs are realized using Spin Transfer Torque (STT) magnetic technology (also called STT-NV) due to its zero leakage and CMOS compatibility. The results show a significant performance improvement of 16% on average across standard benchmarks when replacing the CMOS multiplier and divider with their reconfigurable STT-NV LUT counterparts. In addition, reconfiguration reduces the maximum temperature of functional units by up to 27°C and almost eliminates the thermal variation across them. This comes with a small power overhead and no area impact.
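As a rough illustration of the LUT-based reconfiguration idea described in this abstract, the following minimal sketch models a k-input look-up table whose stored truth table can be rewritten at run time. It is not the authors' fabric: the `LUT` class, its methods, and the full-adder example are hypothetical and chosen only to show that the same storage can realize different logic functions.

```python
from itertools import product


class LUT:
    """Hypothetical k-input look-up table: 2**k stored bits, one per input pattern."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.table = [0] * (1 << num_inputs)

    @staticmethod
    def _index(bits):
        # Pack the input bits (most significant first) into a table address.
        index = 0
        for bit in bits:
            index = (index << 1) | bit
        return index

    def configure(self, func):
        # "Reconfiguration": rewrite every table entry from a Boolean function.
        for bits in product((0, 1), repeat=self.num_inputs):
            self.table[self._index(bits)] = func(*bits) & 1

    def evaluate(self, *bits):
        # Evaluation is a single table read, whatever function is stored.
        return self.table[self._index(bits)]


lut = LUT(3)

# First configuration: sum bit of a full adder (three-input XOR).
lut.configure(lambda a, b, c: a ^ b ^ c)
sum_bit = lut.evaluate(1, 1, 0)

# Same LUT reconfigured as the carry bit (three-input majority).
lut.configure(lambda a, b, c: (a & b) | (a & c) | (b & c))
carry_bit = lut.evaluate(1, 1, 0)

print(sum_bit, carry_bit)
```

A real fabric would combine many such tables with programmable routing; the sketch only captures the point that changing the stored bits changes the function without changing the hardware.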
14:30 | 11.7.2 | A POWER-EFFICIENT RECONFIGURABLE ARCHITECTURE USING PCM CONFIGURATION TECHNOLOGY
Speakers:
Ali Ahari1, Hossein Asadi1, Behnam Khaleghi1 and Mehdi Tahoori2
1Sharif University of Technology, IR; 2Karlsruhe Institute of Technology, DE
Abstract
The promising advantages offered by resistive Non-Volatile Memories (NVMs) have attracted great interest in using them to replace existing volatile memory technologies. While NVMs were primarily studied for use in the memory hierarchy, they can also provide benefits in Field-Programmable Gate Arrays (FPGAs). One major limitation of employing NVMs in FPGAs is the significant power and area overhead imposed by the Peripheral Circuitry (PC) of NVM configuration bits. In this paper, we investigate the applicability of different NVM technologies for configuration bits of FPGAs and propose a power-efficient reconfigurable architecture based on Phase Change Memory (PCM). The proposed PCM-based architecture has been evaluated using different technology nodes and is compared to the SRAM-based FPGA architecture. Power and Power Delay Product (PDP) estimations of the proposed architecture show up to 37.7% and 35.7% improvements over SRAM-based FPGAs, respectively, with less than 3.2% performance overhead.
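The relation between the power, delay, and PDP figures quoted in this abstract can be checked with a short calculation. The sketch below assumes that PDP is simply power multiplied by delay and that the quoted "up to" values belong to the same design point; neither assumption is confirmed by the abstract, and the function name `pdp_improvement` is hypothetical.

```python
def pdp_improvement(power_reduction, delay_overhead):
    """Fractional PDP improvement over a baseline, assuming PDP = power * delay.

    power_reduction: fractional power saving (0.377 means 37.7% less power).
    delay_overhead:  fractional delay increase (0.032 means 3.2% slower).
    """
    relative_power = 1.0 - power_reduction
    relative_delay = 1.0 + delay_overhead
    return 1.0 - relative_power * relative_delay


# Figures taken from the abstract; pairing them is an assumption.
improvement = pdp_improvement(0.377, 0.032)
print(f"PDP improvement: {improvement:.1%}")  # prints "PDP improvement: 35.7%"
```

Under these assumptions, a 37.7% power reduction combined with a 3.2% delay overhead gives a PDP improvement of about 35.7%, which agrees with the reported figure.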
15:00 | 11.7.3 | EXTENDING LIFETIME OF BATTERY-POWERED COARSE-GRAINED RECONFIGURABLE COMPUTING PLATFORMS
Speakers:
Shouyi Yin1, Peng Ouyang1, Leibo Liu2 and Shaojun Wei1
1Tsinghua University, CN; 2Institute of Microelectronics and The National Lab for Information Science and Technology, Tsinghua University, CN
Abstract
The coarse-grained reconfigurable architecture (CGRA) is a promising platform for mobile computing. This paper addresses how to prolong the lifetime of battery-powered reconfigurable computing platforms. Considering the nonlinear characteristics of the battery, a multi-objective optimization model is built for extending battery lifetime. Based on this model, a joint task-battery scheduling algorithm is proposed. The experimental results show that the proposed method achieves a 26.22% improvement in battery runtime on average compared to state-of-the-art methods.
15:30 | IP5-20, 659 | 3D FPGA USING HIGH-DENSITY INTERCONNECT MONOLITHIC INTEGRATION
Speakers:
Ogun Turkyilmaz1, Gerald Cibrario2, Olivier Rozeau2, Perrine Batude2 and Fabien Clermidy3
1CEA-LETI, Minatec Campus, FR; 2CEA, FR; 3CEA-LETI, FR
Abstract
A new 3D technology, called "Monolithic Integration", offers very dense 3D interconnect capabilities. In this paper, we propose a 3D FPGA architecture with a logic-on-memory approach based on this technology. The routing and computation blocks are split into two layers, with the logic placed on the top and the memory on the bottom. Using values extracted from layout in 14nm FDSOI technology, typical benchmark circuits are evaluated in the VPR5 toolflow. The results show an area reduction of 55% compared to the 2D FPGA. More importantly, due to the lowered routing congestion, the EDP of the 3D FPGA is improved by 47%.
15:32 | IP5-21, 526 | JOINT COMMUNICATION SCHEDULING AND INTERCONNECT SYNTHESIS FOR FPGA-BASED MANY-CORE SYSTEMS
Speakers:
Alessandro Cilardo, Edoardo Fusella, Luca Gallo and Antonino Mazzeo, University of Naples Federico II, IT
Abstract
This work proposes an automated methodology for optimizing FPGA-based many-core interconnect architectures. Based on the application communication requirements, the methodology concurrently defines the structure of the interconnect and the communication task scheduling, taking into account possible dependencies between tasks under given area constraints. The resulting architecture improves the level of communication parallelism that can be exploited while keeping area costs low. The paper thoroughly describes the proposed approach and discusses a few case-studies showing the impact of the proposed technique.
15:33 | IP5-22, 688 | A NOVEL EMBEDDED SYSTEM FOR VISION TRACKING
Speakers:
Antonis Nikitakis1, Theofilos Paganos1 and Ioannis Papaefstathiou2
1Technical University of Crete, Department of Electronic and Computer Engineering, Kounoupidiana, Chania, Crete, GR73100, Greece, GR; 2Synelixis Solutions Ltd, Farmakidou 10, Chalkida, GR34100, Greece, GR
Abstract
One of the most important challenges in the field of Computer Vision is the implementation of low-power embedded systems that will execute very accurate, yet real-time, algorithms. In the visual tracking sector one of the most promising approaches is the recently introduced OpenTLD algorithm which uses a random forest classification method. While it is very robust, it cannot be efficiently parallelized in its native form as its memory access pattern has certain characteristics that make it hard to take advantage of the conventional memory hierarchies. In this paper, we present a novel embedded system implementing this algorithm. We accelerate the bottleneck of the algorithm by designing and implementing a high bandwidth distributed memory sub-system which is independent of the various software parameters. We demonstrate the applicability and efficiency of this novel approach by implementing our scheme in a modern FPGA.
15:30 | End of session
Coffee Break in Exhibition Area
On Tuesday-Thursday the coffee and lunch breaks will be located in the Exhibition Area (Terrace Level).